The Megalomaniac Bore: std::string


Wednesday, 31 January 2018

C++ boost::replace_all Slowed Me

I've just worked myself into a complete frazzle. Yesterday my code was fast, rendering at 120fps; today the same code was struggling to reach 11fps. It had me pulling my hair out, of which I have little left to spare.

The problem?

I had been processing strings for specific patterns and replacing the whole value on a match, so a variable in my engine might have the value "SCORE" and the engine would place the player's score directly into the std::string instance for display.

I decided, however, that I wanted to compound all these reserved words, so that any value could contain any replaced value along with formatting. I could then write something like "You have scored %SCORE% today" and the number would simply be put in place.

I turned to boost for this, boost::replace_all to be specific, and I had about 45 patterns, each of which would try to replace any instance of itself in the string, in place.

However, this function does not look ahead to check whether the pattern is present in the source string, and it proved in fact very slow.

So code:

const std::string l_Pattern("%SCORE%");
std::string l_Source("You have scored %SCORE% today");
.
.
.
boost::replace_all(
    l_Source,
    l_Pattern,
    Player::Instance().ScoreAsString()); // assuming Instance() returns a reference

That would result in very slow performance. My solution is not to perform the replace blindly, but to search for the pattern first:

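if ( l_Source.find(l_Pattern) != std::string::npos )
{
    // Only build the replacement and call replace_all when there is a match
    boost::replace_all(
        l_Source,
        l_Pattern,
        Player::Instance().ScoreAsString()); // assuming Instance() returns a reference
}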

This latter code runs so much more quickly that I'm far and away back ahead of the speed curve, and I still have this lovely dynamic placing of variables into my rendering controls: any control can receive any value just by my tweaking the loaded display script. So neat....
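The same check-then-replace idea can be sketched without Boost. This is only a minimal sketch, not the engine's code: ExpandVariables and the table of reserved words are hypothetical names, and std::string::find and std::string::replace stand in for boost::replace_all. It also skips the whole table when the text holds no % character at all, which assumes every pattern is wrapped in % markers.

```cpp
#include <map>
#include <string>

// Hypothetical helper: expands every %NAME% marker found in p_Text.
// p_Values maps a pattern such as "%SCORE%" to its replacement text.
void ExpandVariables(
    std::string& p_Text,
    const std::map<std::string, std::string>& p_Values)
{
    // Cheap early exit: no marker character means nothing to replace.
    if (p_Text.find('%') == std::string::npos)
    {
        return;
    }

    for (const auto& l_Entry : p_Values)
    {
        const std::string& l_Pattern = l_Entry.first;
        const std::string& l_Value = l_Entry.second;
        if (l_Pattern.empty())
        {
            continue; // an empty pattern would match everywhere
        }

        // find() doubles as the "is it present?" test and the cursor.
        std::string::size_type l_Pos = p_Text.find(l_Pattern);
        while (l_Pos != std::string::npos)
        {
            p_Text.replace(l_Pos, l_Pattern.length(), l_Value);
            // Continue after the inserted value so it is not rescanned.
            l_Pos = p_Text.find(l_Pattern, l_Pos + l_Value.length());
        }
    }
}
```

In a render loop the map would be filled once per frame with the current values, and each control's text passed through ExpandVariables before display.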

Anyway, I hope that helps. If you want to help me, pop over to my YouTube channel and hit that subscribe button.

Thursday, 30 November 2017

C++ : Trust the STL

One sore lesson to teach some developers is when to trust the compiler; once you've gotten that across, you have to start teaching folks to stop re-inventing the wheel.

If someone has already implemented a file handler, or a serial port abstraction, or a wrapper for some obscure feature, you need to evaluate that offering...

To evaluate whether a library is worth using, first see if it works, then see how many folks actually use it; the more people use it, the more likely it is that bugs have been flushed out and the whole thing has been tested.

Leveraging this kind of mature code within your projects assists in bootstrapping the startup phase of new projects.

Boost is a noteworthy example of what I'm talking about here. Many software shops (at least the ones I know) resist using open-source or third party libraries; they prefer to stick to in-house developed niche implementations until the very last moment. This of course slows development and completely stymies innovation.

Boost however is one step further than the problem I'm going to tackle today... The Standard Template Library...

The STL is often commented upon negatively, this is despite it being a hugely available resource, vastly and deeply tested throughout and constantly incorporating new innovations. Whole books have been written on the topic, and yet one can still find projects and individuals resisting using the STL.

STL nay-sayers will quote "no need for an STL requirement", "uses less memory than an STL implementation" or "faster than the STL"...

The problem with this attitude is: are the people holding it going to test their bespoke solution sufficiently? Is that bespoke solution going to be as robust, or as easily maintained, as something using the STL?

Probably not, and this is a hard one for die-hard "purist" developers to swallow. We want to write all our own code, we want to be gods in our domain; the trouble is that for the vast number of us, god has already been there and he wrote a decent enough library to do the task we need doing... So leverage this!

I came across one such niche item the other day, with an algorithm to see if a string starts with...

They hadn't used boost, or the STL, to do the searching, yet perversely had used a std::string... Their code looked a little like this:

const bool StartsWith(
    const std::string& p_Text,
    const std::string& p_Pattern)
{
    bool l_result(true);
    if ( p_Text.length() >= p_Pattern.length() )
    {
        // Compare character by character, stopping at the first mismatch
        for (unsigned int i(0);
            i < p_Pattern.length();
            ++i)
        {
            if ( p_Text[i] != p_Pattern[i] )
            {
                l_result = false;
                break;
            }
        }
    }
    else
    {
        // The text is shorter than the pattern, so it cannot match
        l_result = false;
    }
    return l_result;
}

It is fairly logical code: they're looking at the lengths of the presented parameters to avoid looping when not required, then they loop only from the start and return a fail as soon as a character is a mismatch. Looking at this with programming eyes from 1996, I'd say this is fine.

Looking with eyes well aware of the STL, I cringed a little, and replaced the whole function like this...

const bool StartsWith(
    const std::string& p_Text,
    const std::string& p_Pattern)
{
    return (p_Text.find(p_Pattern) == 0);
}

One line of much more maintainable, vastly more readable and easy to comprehend code...

The developer of the original, however, was not happy... "you're wasting resources, this will find any instance and tell you the input"... He's right, it will, but the STL will still be faster than his code.

I demonstrated this by plugging both into CompilerExplorer... He still refused to listen.

Therefore, I've written this little helper project to run the two functions side by side in three threaded tests: looking for a short match, a long match and a negative match at the start of the string (Code on Github).

The results of this are interesting. You see, the project itself favours cases where it's highly likely the string being searched for is present, and therefore we don't need to worry too much about the odd test that finds no match taking longer... This is exactly the behaviour seen in the STL based find example.

The short search time, for the same data on the same processor, went from 28358 microseconds to just 5234... That's about 81% less time. The longer search is more stark, falling from 185966 microseconds to just 6884, just over 96% less!

The rub is that the negative case took longer, rising from 19765 microseconds in the hand-crafted search to 25695, just over 30% slower. Some of this increase can perhaps be explained by the hand-crafted version using the lengths to quickly skip too short an input; otherwise it is simply that the STL find has to iterate over the whole string when no match is found. A hybrid that does not perform the find at all when there is insufficient data may be in order; however, this may add to our maintenance burden and lower code clarity. Swings and roundabouts.
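One shape such a hybrid could take is sketched here. It is a sketch under assumptions, not the code from the GitHub project: StartsWithPrefix is a hypothetical name, and it swaps find for std::string::compare so that only the first p_Pattern.length() characters are ever examined.

```cpp
#include <string>

// Hypothetical variant: checks only the leading characters of p_Text,
// so a miss never scans the rest of the string.
bool StartsWithPrefix(
    const std::string& p_Text,
    const std::string& p_Pattern)
{
    // Too little data can never match; skip the comparison entirely.
    if (p_Text.length() < p_Pattern.length())
    {
        return false;
    }
    // compare(pos, len, str) compares p_Text[pos, pos + len) with str.
    return p_Text.compare(0, p_Pattern.length(), p_Pattern) == 0;
}
```

It stays close to a one-liner, but whether it beats find for a match-heavy workload like ours would need measuring in the same test project.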

However, clearly in the case of this project, dismissing the STL resulted in slower code. We have a system propensity for matches, they're quite short, and all target platforms have the STL built in, so use it.

Never be afraid to ask questions of what you're working with, ever.

Monday, 20 January 2014

C++ Integer to std::string conversion speed

There is often a lot of discussion about the most efficient way to convert things in C++. Personally I like boost::lexical_cast; I find it gives clear and readable code, which is very important in the systems I write, especially for maintenance and upkeep.

However, many assume it to be slow. Indeed, most authors on the topic turn almost immediately to C for the fastest way to convert integers to strings, and unfortunately I find the same is true: even with std::string and std::ostringstream features, the old sprintf tends to be faster.

But crippling many users of std::string is their lack of understanding of that standard library staple class, so here is my little investigation into the speed of conversion, using a mix of C and standard C++, to give you a good idea of how fast things can be, and how to use your standard class and its memory to best effect.

The first trick you will see in this is the use of the std::string as a memory buffer; you can do this by declaring your standard string, then resizing it...

std::string MyName;

MyName.resize(12);

The memory location now at &MyName[0] points to 12 empty (null) characters for you to use just as you would a char* or char[] buffer. It's a useful way to stop using uncontrolled char* buffers left, right and center, and perhaps the least used "tip" I can give when using std::string.
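As a small illustration of the tip, here is a sketch that hands a resized std::string to a C-style API, std::strftime, and then trims the string to what was actually written. FormatDate is a hypothetical name and the "%Y-%m-%d" format is chosen only for the example.

```cpp
#include <cstddef>
#include <ctime>
#include <string>

// Hypothetical example: use a std::string as the buffer for a C API.
std::string FormatDate(const std::tm& p_Time)
{
    std::string l_buffer;
    l_buffer.resize(12); // room for "YYYY-MM-DD", the '\0' and one spare

    // strftime writes into the string's storage and returns the number
    // of characters written, excluding the '\0' (0 if it did not fit).
    const std::size_t l_written = std::strftime(
        &l_buffer[0], l_buffer.size(), "%Y-%m-%d", &p_Time);

    // Drop the unused padding so size() matches the visible text.
    l_buffer.resize(l_written);
    return l_buffer;
}
```

The final resize matters: without it the string would still report a size of 12, with the date followed by null padding.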

So, what conversions do we have?...

class Conversions
{
public:
    static const std::string IntToString(const int& p_Input);
    static const std::string IntToString2(const int& p_Input);
    static const std::string IntToString3(const int& p_Input);
    static const char* IntToString4(const int& p_Input);
    static const void TestConversions (const int& p_Cycles);
};

The first three are going to use whatever code they need to always give a ready-made std::string; the fourth conversion is going to return a raw char*, so the programmer has to delete[] the result, or risk a memory leak.

I still want that fourth result to be a std::string however, and I'm not going to worry about the leaking memory, so though the function performs a new[] I will not perform a delete[], but will simply assign the char* to a std::string upon return, making the timings fairer.

So, let's look at the program code for each of the functions, and the timing in the test function. I'm going to use boost::posix_time for the timings here:

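const std::string Conversions::IntToString(const int& p_Input)
{
    std::string l_buffer;
    l_buffer.resize(33);
    // sprintf writes into the string's own storage; size() stays at 33,
    // so the digits are followed by '\0' padding in the returned string
    sprintf(&l_buffer[0], "%d", p_Input);
    return l_buffer;
}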

Above we can see the first function, using our string resizing tip: we resize the string as a buffer of 33 characters and then use the old fashioned sprintf from the cstdio header. (33 characters is the buffer itoa needs for a 32bit integer in base 2; in decimal, "%d" needs at most 11 characters plus the terminating null.) Many claim this to be the fastest, the de facto conversion, even more so than using itoa.
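For completeness, a variant that returns only the digits written could look like the sketch below. IntToStringTrimmed is a hypothetical name, not one of the functions timed in this post; it sizes the buffer from std::numeric_limits instead of a fixed 33, and uses the value returned by std::snprintf to shrink the string.

```cpp
#include <cstddef>
#include <cstdio>
#include <limits>
#include <string>

// Hypothetical variant of IntToString that trims the unused buffer.
std::string IntToStringTrimmed(int p_Input)
{
    // digits10 is the number of decimal digits an int can always hold
    // (9 for a 32-bit int); +1 for the possible extra leading digit,
    // +1 for a minus sign, +1 for the terminating '\0'.
    const std::size_t l_capacity = std::numeric_limits<int>::digits10 + 3;

    std::string l_buffer;
    l_buffer.resize(l_capacity);

    const int l_written = std::snprintf(
        &l_buffer[0], l_buffer.size(), "%d", p_Input);

    // l_written excludes the '\0'; a negative value signals an error.
    l_buffer.resize(l_written > 0 ? l_written : 0);
    return l_buffer;
}
```

The extra resize is one more call per conversion, so it would need to be included in any timing comparison.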

const std::string Conversions::IntToString2(const int& p_Input)
{
    std::ostringstream l_oss;
    l_oss << p_Input;
    return l_oss.str();
}

Next is the standard library way of working: we create a new ostringstream and stream the integer into it, then return the std::string from the stream. I believe this to be pretty slow; my mind tells me that the creation of the stream and then the extraction of the result is going to be slow, but we shall see in a moment.

const std::string Conversions::IntToString3(const int& p_Input)
{
    char l_buffer[33];
    sprintf(l_buffer, "%d", p_Input);
    return std::string (l_buffer);
}

Next we have another use of sprintf; this time, however, we're not resizing a string, we're filling a character array on the stack and then constructing a std::string from it for the result. I think this may be the fastest, but again we shall see.

const char* Conversions::IntToString4(const int& p_Input)
{
    char* l_buffer = new char[33];
    sprintf(l_buffer, "%d", p_Input);
    return l_buffer;
}

Finally, very similar to the third test, this conversion creates a new character array pointer as the buffer, and then uses sprintf. This is going to leak memory if we don't delete[], but I'm ignoring that for now and just testing the speed.

Now onto the conversion test function. The basic layout is: take the start time before the conversion, call the conversion a lot of times, assigning the result to a local std::string, then take the time after and output the difference in milliseconds...

Now, the speed of this code in C++ is going to be fast in all cases, so we need enough sample calls to get a reading... I'm settling on 30 million calls, 30000000.

const void Conversions::TestConversions (const int& p_Cycles)
{
    // Note: second_clock has a resolution of one second;
    // microsec_clock::local_time() would give finer readings.

    // Test 1
    std::cout << "Test 1...";
    boost::posix_time::ptime l_end;
    boost::posix_time::ptime l_start (boost::posix_time::second_clock::local_time());
    std::string l_result;
    for (int i = 0; i < p_Cycles; ++i)
    {
        l_result = IntToString(i);
    }
    l_end = boost::posix_time::second_clock::local_time();
    boost::posix_time::time_duration l_diff = l_end - l_start;
    std::cout << l_diff.total_milliseconds() << std::endl;

    // Test 2
    std::cout << "Test 2...";
    l_start = boost::posix_time::second_clock::local_time();
    for (int i = 0; i < p_Cycles; ++i)
    {
        l_result = IntToString2(i);
    }
    l_end = boost::posix_time::second_clock::local_time();
    l_diff = l_end - l_start;
    std::cout << l_diff.total_milliseconds() << std::endl;

    // Test 3
    std::cout << "Test 3...";
    l_start = boost::posix_time::second_clock::local_time();
    for (int i = 0; i < p_Cycles; ++i)
    {
        l_result = IntToString3(i);
    }
    l_end = boost::posix_time::second_clock::local_time();
    l_diff = l_end - l_start;
    std::cout << l_diff.total_milliseconds() << std::endl;

    // Test 4
    std::cout << "Test 4...";
    l_start = boost::posix_time::second_clock::local_time();
    for (int i = 0; i < p_Cycles; ++i)
    {
        // The new[] buffer is copied into l_result and deliberately leaked
        l_result = IntToString4(i);
    }
    l_end = boost::posix_time::second_clock::local_time();
    l_diff = l_end - l_start;
    std::cout << l_diff.total_milliseconds() << std::endl;
}

Let's see what our output is:

Test 1...6000

Test 2...17000

Test 3...5000

Test 4...6000

Immediately we can see my hunch about using the string stream is correct: it's very much slower, nearly three times as slow as the others.

Surprisingly, at least to most readers - one hopes - we can see that the string::resize and use of sprintf is very close to the other uses of sprintf.

Since sprintf should be taking a constant amount of time, what we've timed in tests 1, 3 and 4 is the speed of our memory management: how quickly the function has made the result available.
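The test function uses second_clock, whose resolution is one second, which is coarse for runs of only a few seconds. A finer-grained harness could look like the sketch below. It swaps in std::chrono::steady_clock from C++11 rather than the Boost clock, and both TimeConversion and IntToStringStack are hypothetical names (the latter is a stand-in for the stack buffer conversion, assuming a 32bit int).

```cpp
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical stand-in for one of the conversions under test.
std::string IntToStringStack(int p_Input)
{
    char l_buffer[12]; // sign + 10 digits + '\0' for a 32-bit int
    std::snprintf(l_buffer, sizeof(l_buffer), "%d", p_Input);
    return std::string(l_buffer);
}

// Runs p_Convert p_Cycles times and returns the elapsed milliseconds.
template <typename Converter>
long long TimeConversion(Converter p_Convert, int p_Cycles)
{
    typedef std::chrono::steady_clock Clock;
    std::string l_result;
    // Writes to a volatile must be kept, so the loop body is not
    // optimised away as unused work.
    volatile std::size_t l_sink = 0;

    const Clock::time_point l_start = Clock::now();
    for (int i = 0; i < p_Cycles; ++i)
    {
        l_result = p_Convert(i);
        l_sink = l_sink + l_result.size();
    }
    const Clock::time_point l_end = Clock::now();

    return std::chrono::duration_cast<std::chrono::milliseconds>(
        l_end - l_start).count();
}
```

A call such as TimeConversion(IntToStringStack, 30000000) would then report the elapsed time in milliseconds instead of a value quantised to whole seconds.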

Some readers may be screaming at me to use itoa, and one could do that with a char buffer, or even a resized string, thus:

std::string buffer;
buffer.resize(33);
itoa (p_Input, &buffer[0], 10); // itoa(value, buffer, base)

However, itoa is not a standard function and some compilers don't supply it; you should generally use the lowest common denominator, and sprintf is just that.
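Where a C++11 compiler is available, std::to_string from the string header is a standard alternative to itoa. The sketch below only shows that it produces the same text as the sprintf family for an int; it makes no claim about relative speed, and ViaSnprintf and ViaToString are hypothetical helper names.

```cpp
#include <cstdio>
#include <string>

// sprintf-family route, using a stack buffer (assumes a 32-bit int).
std::string ViaSnprintf(int p_Input)
{
    char l_buffer[12]; // sign + 10 digits + '\0'
    std::snprintf(l_buffer, sizeof(l_buffer), "%d", p_Input);
    return std::string(l_buffer);
}

// Standard library route, available since C++11.
std::string ViaToString(int p_Input)
{
    return std::to_string(p_Input);
}
```

If it is added to the tests, it should be timed with the same loop as the others before drawing any conclusion.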

There is also one last avenue, the lexical cast...

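// Test 5
std::cout << "Test 5...";
l_start = boost::posix_time::second_clock::local_time();
for (int i = 0; i < p_Cycles; ++i)
{
    // The template argument is the target type: std::string, not int
    l_result = boost::lexical_cast<std::string>(i);
}
l_end = boost::posix_time::second_clock::local_time();
l_diff = l_end - l_start;
std::cout << l_diff.total_milliseconds() << std::endl;

Now, this approach should take into account the "bad_lexical_cast" exception, but exception handling is slow so we're ignoring that at this juncture, and assuming we have a known data source (int i) which is in a valid range.

Our test results now are similar for the original calls...

Test 1...6000

Test 2...17000

Test 3...5000

Test 4...5850

The new fifth test...

Test 5...1000

This is very much quicker than any of the other solutions proposed... One caution on that figure: it was recorded with boost::lexical_cast<int>(i), which converts an int to an int and leaves std::string's operator=(char) to store a single character, so it did not time a full integer to string conversion. With the std::string template argument shown above, the reading needs taking again.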

So, there you go...

#include <boost/lexical_cast.hpp>
#include <boost/date_time/posix_time/posix_time.hpp>
#include <string>
#include <iostream>

int main ()
{
    const int p_Cycles = 30000000; // the 30 million calls used earlier
    std::string l_result;

    // Test 5
    std::cout << "Test 5...";
    boost::posix_time::ptime l_start =
        boost::posix_time::second_clock::local_time();
    for (int i = 0; i < p_Cycles; ++i)
    {
        // The template argument is the target type: std::string, not int
        l_result = boost::lexical_cast<std::string>(i);
    }
    boost::posix_time::ptime l_end =
        boost::posix_time::second_clock::local_time();
    boost::posix_time::time_duration l_diff = l_end - l_start;
    std::cout << l_diff.total_milliseconds() << std::endl;
    return 0;
}

Use lexical cast...

And about the exception handling, be smart: the speed of this code is not affected by letting the function throw the exception upwards, or by making the whole loop handle the exception once. Be smart... and stop mucking about with conversions in C until you have measured them; you may only be fooling yourself that they're faster than other options.