Showing posts with label example. Show all posts

Wednesday, 29 March 2023

A mini-break in C++ Optimization (Pt 1)

I have had one of those evenings. In reviewing some code for a friend I came across a repeated pattern where he would iterate over collections, strings and other linear items in a loop and mutate them.  We're of course talking C++ here, and he had a whole myriad of loops all over the place sanitizing, processing, even summing values.

I asked whether he had access to the STL and he immediately got a bit uppity; he and I have never seen eye to eye about using the STL.

I advocate it be used, even if only to stand something up which you go back and make more performant or more memory efficient later. Just stand up your project; don't get bogged down in the details of the best string, or the best vector.

He however stands by his having "long tested examples I use all the time, just ignore them"...

No.

Simply put, his examples, used in a smattering of programs and peer reviewed very infrequently by a minuscule handful of people, are not good enough. Use the STL: it is seen by literally millions of engineers, it is very portable and well adopted, so use it to your advantage.

We sparred a little about this, but then I simply asked whether he had any parallelism in his algorithms?

Nope. Of course the STL algorithms can just use it.... Since C++17 you can provide an execution policy.

Let's take one of his examples, a simple function to give him the time & date as a character array.

char* GetDateTimeStringHis()
{
    time_t now = time(0);
    char* buffer = new char[64];
    memset(buffer, 0, 64);
    ctime_s(buffer, 63, &now);
    for (int i = 0; i < 64; ++i)
    {
        if (buffer[i] == '\n') buffer[i] = 0;
    }
    return buffer;
}

This returns a raw pointer to a character array buffer.... No, he interjected, it returns a string.  A raw pointer to memory is NOT a string folks; this is where my argument on that point begins and ends.  This code is very unsafe, I don't like it, and I certainly insist it can easily leak memory, for who is going to delete that buffer?

So I replace that buffer with an actual standard string:

std::string GetDateTimeStringV1()
{
    time_t now = time(0);
    std::string buffer(64, 0);
    ctime_s(buffer.data(), 63, &now);
    for (int i = 0; i < 64; ++i)
    {
        if (buffer[i] == '\n') buffer[i] = 0;
    }
    return buffer;
}

He didn't like this, but accepted that perhaps it was better. He argued about the string constructor taking a length and a character (zero) to initialize itself with, but went with it.  It's safer, simpler, easier to read code.

I'm still not happy, I don't like that loop:

std::string GetDateTimeStringV1()
{
    time_t now = time(0);
    std::string buffer(64, 0);
    ctime_s(buffer.data(), 63, &now);
    std::transform(
        buffer.begin(), buffer.end(), buffer.begin(),
        [](const char character) -> char
        {
            if (character == '\n')
            {
                return 0;
            }
            else
            {
                return character;
            }
        });
    return buffer;
}

He hated this, and I have to be honest I don't like the lambda either, but this is a correct use of the transform algorithm, and the lambda could be moved to a location where it could be reused, or written as a free function somewhere; it doesn't have to be inline here making a mess of reading the code.

What, however, is it doing?  It changes any new line to a zero in the buffer string. Why?  Well, ctime_s puts a new line at the end, so if we reversed the iteration we might get much better performance, as we could early-out of the loop.  But we can do away with the whole transform: we know the new line is the last character, so just set it to zero as an optimization and simplification.  My friend can get behind this now:
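As a minimal sketch of the "free function somewhere" idea, the lambda's logic can live in a named function that any caller hands to std::transform. The names NewLineToNull and StripNewLines are hypothetical, chosen for illustration only. With a C++17 standard library the same std::transform call also has an overload taking an execution policy as its first argument; that is not shown here.

```cpp
#include <algorithm>
#include <string>

// Hypothetical reusable helper: maps a new line to a null character
// and leaves every other character untouched.
char NewLineToNull(const char character)
{
    return character == '\n' ? '\0' : character;
}

// Applies the helper in place across the whole string.
// The string keeps its length; only the new lines are overwritten.
void StripNewLines(std::string& text)
{
    std::transform(text.begin(), text.end(), text.begin(), NewLineToNull);
}
```

The call site then reads as a single line, and the character mapping can be tested on its own.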

std::string GetDateTimeStringV2()
{
    time_t now = time(0);
    std::string buffer(64, 0);
    ctime_s(buffer.data(), 63, &now);
    buffer[std::strlen(buffer.data())-1] = 0;
    return buffer;
}

We can think even further about this code: what will strlen do?  It will count from 0 to N to find the first zero character, so could we instead count backwards from the 64th position in the buffer?  Well, the format of the date and time we get is known:

        Wed Mar 29 23:36:01 2023

Twenty five characters with the new line, plus another for the null terminator makes twenty six, so we can always just set index 24 (the new line) to zero and reduce the buffer size at the same time to fit perfectly!

std::string GetDateTimeStringV3()
{
    time_t now = time(0);
    std::string buffer(26, 0);
    ctime_s(buffer.data(), 26, &now);
    buffer[24] = 0;
    return buffer;
}

I feel I'm getting somewhere, I just want to make sure we use the std:: versions of the time functions now and use brace initialization for the "now".

std::string GetDateTimeStringV4()
{
    std::time_t now{ std::time(0) };
    std::string buffer(26, 0);
    ctime_s(buffer.data(), 26, &now);
    buffer[24] = 0;
    return buffer;
}

This ended my concerns and improved the performance of the code, he was calling this function nearly every time he logged, nearly every time he sent a network message and for very many of his database transactions.  This was 10 minutes of just thinking about the problem.
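ctime_s with this signature is a Microsoft extension, so for other toolchains here is a rough portable sketch using std::strftime instead, which is a swapped-in technique and not what my friend's code uses. FormatDateTime is a hypothetical name, and the format string is my assumption of the closest match; note that %d zero-pads the day where ctime pads with a space. Two differences worth knowing: no new line is produced, so there is nothing to overwrite, and the returned string is exactly 24 characters long, whereas the versions above return a 26 character string with trailing nulls inside it.

```cpp
#include <cstddef>
#include <ctime>
#include <string>

// Hypothetical portable variant: formats `when` as e.g.
// "Wed Mar 29 23:36:01 2023" using the local time zone.
std::string FormatDateTime(const std::time_t when)
{
    // std::localtime uses a shared internal buffer, so this sketch
    // is not safe to call from several threads at once.
    const std::tm* local = std::localtime(&when);
    if (local == nullptr)
    {
        return std::string();
    }

    char text[32] = {};
    const std::size_t written =
        std::strftime(text, sizeof(text), "%a %b %d %H:%M:%S %Y", local);

    // Build the string from the characters actually written, so the
    // length matches the visible text with no embedded nulls.
    return std::string(text, written);
}
```

Calling FormatDateTime(std::time(nullptr)) gives the current time in that layout.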

Now... Back to <algorithm> and execution policy with him...

Sunday, 19 February 2023

Home Game Engine : Physics Debugging Tool Progress

The last week has been invested in the Physics Debugging tool. It connects to the main game world, whether that is running in the client with its own Vulkan renderer, or in the server, for I am planning to have the server be authoritative over player position at least, to prevent some of the strange hackery-doo of the client being in charge.

Here's a silent video of that progress, as I'm recording from my Ryzen Workstation and I don't have a mic in here.

There is actually a lot going on in the background here, though the scene looks largely unchanged.

The most obvious addition since the last scene update is the replication of the rotating object, which is just a box, but that is being rotated on the client.  My controls here send a message to the client, which then sends the status update to its GameObject.  In the same frame the GameObject queues a message out and the scene updates the physics representation, which means I get near real time (a minimum of network delay + 1 frame) replication of an object back and forth.

However, I don't plan on expanding that too much, the number of messages is getting silly.  For example, I have an object for a position update (3 floats) then I have one for a position and a rotation (6 floats) and then a whole other one for position, rotation and scale (9 floats) which I call a transform.  I could just replicate a transform each frame, but then the amount of data gets very much larger.
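To make the size difference concrete, the three message shapes could be laid out roughly like this. The struct and field names are hypothetical, not the engine's own, and the sketch assumes rotation is carried as three floats (Euler angles, say), which the float counts above imply but do not state.

```cpp
// Hypothetical message payloads; names are illustrative only.
struct PositionUpdate
{
    float position[3];
};

struct PositionRotationUpdate
{
    float position[3];
    float rotation[3]; // assumed: three floats, e.g. Euler angles
};

struct TransformUpdate
{
    float position[3];
    float rotation[3];
    float scale[3];
};

// The payload grows by three floats with each step.
static_assert(sizeof(PositionUpdate) == 3 * sizeof(float), "3 floats");
static_assert(sizeof(PositionRotationUpdate) == 6 * sizeof(float), "6 floats");
static_assert(sizeof(TransformUpdate) == 9 * sizeof(float), "9 floats");
```

Sending a full transform every frame is three times the payload of a position-only update, which is the trade-off being weighed here.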

I also have flashbacks of delivering data from a server on a provisioned box, where you pay by the megabyte per monthly usage (and I honestly don't know why I couldn't host a service like this server publicly on my massive 500 Mbit home fiber, which has no cap on data, it's all I can eat) but practically, I'd not like to host this here, except for debug, and if I were to make a game, it'd need to be on an AWS instance or something.

Anyway, that's all very much future stuff. My next problems are all about the game: improving and cleaning up my game object authoring, improving my model making skills (I might actually have to double down and actually learn Blender) and then working out a few issues I know exist in my control scheme.

More about all that in March though, for the rest of February, I have to tidy this stuff up.

If you want to know more, or follow some of my other older projects and videos, my YouTube Channel exists https://www.youtube.com/@LordXelous and of course this blog is always ticking over!

That Subscribe button really helps!

Wednesday, 25 December 2019

C++ Programming : SDL2 Graphics Rendering Loop

Today, it's a bit of a programming ramble, I'm literally throwing this code together, from memory and the docs... What does it do?

Well, I set up an SDL2 based graphics project.  Added font loading and set up the compilation environment and get something updating on screen, at the end there's then a second short video debugging a flickering... or if not properly fixing it, I at least start to pull the thread as to what is going wrong.

You can find the final code here: https://github.com/Xelous/Frog


And the short follow up:


Monday, 25 March 2019

C++ Variadic Template Compiler Optimizations

In my prior post we briefly talked about how the compiler unrolls a variadic template and how we needed to provide the end-point function to stop that iteration. Let's take a closer look at how the compiler does this unrolling.

The impact this has on your debug build is critical, and I demonstrate that you can see the unrolling of numeric functions (such as summation) but then in release the compiler is smart enough to optimize that all away and use a constant.  Understanding when this is happening is a crucial lesson, as we then see the action of a repetitive string output.
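For readers without the video to hand, here is a minimal sketch of the kind of summation being described. The name Sum is hypothetical and the sketch assumes every argument has the same type. The single-argument overload is the end-point that stops the unrolling; in a debug build each level is typically a real call, while an optimized build is free to fold a call with constant arguments down to a constant.

```cpp
// End-point overload: with one argument left there is nothing
// more to unroll, so the value is returned as-is.
template <typename T>
T Sum(T value)
{
    return value;
}

// Recursive case: peel off the first argument and let the compiler
// instantiate Sum again for the rest of the pack.
// Assumes all arguments share the type T.
template <typename T, typename... Rest>
T Sum(T first, Rest... rest)
{
    return first + Sum(rest...);
}
```

A call such as Sum(1, 2, 3) instantiates Sum for three, two and then one argument.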


If you found these two little videos of any use, please like & subscribe on YouTube to let me know!

And as always the comments below are available.

Sunday, 24 March 2019

C++ Variadic Template Parameter Packing Example

Today in the brief few minutes I have, I wanted to welcome all the new readers the blog is getting, there are nearly 5000 of you folks passing through these pages each month!


And now the code video....

Today we go over a usage example for variadic templates, demonstrating parameter packs (part of the language since C++11), how to call functions with the parameter pack, how to mix parameter types with the template and how to control the compiler's iterative unrolling of your code.
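As a rough illustration of mixing parameter types in one pack, and not the code from the video, this sketch streams every argument into a string. Append and Join are hypothetical names. The empty Append overload is the end-point that stops the unrolling once the pack is exhausted.

```cpp
#include <sstream>
#include <string>

// End-point: called when the pack is empty, so there is nothing to write.
inline void Append(std::ostringstream&)
{
}

// Writes the first argument, then recurses on the remainder of the pack.
// Each argument may have a different type, as long as it can be streamed.
template <typename First, typename... Rest>
void Append(std::ostringstream& out, const First& first, const Rest&... rest)
{
    out << first;
    Append(out, rest...);
}

// Forwards the whole pack to Append and returns the combined text.
template <typename... Args>
std::string Join(const Args&... args)
{
    std::ostringstream out;
    Append(out, args...);
    return out.str();
}
```

Join(1, "a", 2.5) mixes an int, a string literal and a double in a single call.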

Find the source code here:

Find Fedor Pikus's new book here:


Please Note: Filmed in one take, during the single half-hour I have spare, so don't complain about the low production values, cus the code does at least work!

Sunday, 20 January 2019

C++: Undefined behaviour from realloc and the Clang Optimizer

I was wandering around in some system code and found a strange behaviour between two modes in a piece of code. Where the code switches from double to triple buffer mode there's a problem: we've already got two buffers of everything and we just want to allocate another, but the underlying structure for sets of buffers wants to own them all... So the code went from:

struct SetOfBuffers
{
Buffer* one;
Buffer* two;
};

To:

struct SetOfBuffers
{
Buffer* one;
Buffer* two;
Buffer* three;
};

Creating the third buffer is fine:

SetOfBuffers::three = malloc(X);

But the first two are a realloc, to reuse the existing buffers:

SetOfBuffers::one = realloc(OldSet::one, X);
SetOfBuffers::two = realloc(OldSet::two, X);

The problem?  I'd start to modify the values in the new set of buffers: the third buffer would be used and presented, then the first buffer would be changed and presented... then the second buffer changed and presented, and the information was wrong (I oversimplify massively here).

Anyway, I was remotely SSH'd into my server for this, so I went to Visual Studio, same code... Worked fine... So I go into my local VM and it's fine too, so I went back to the server and compiled manually and suddenly it's fine too.... WTF.

I literally spent an hour looking at this, the problem?  Well, it appears to be a bug in Clang, the reason the problem disappeared was my Makefile contains a $CC constant for the compiler to use and it was "clang" when I built by hand I used "g++".  Worse still, if I switched to a clang debug build the code worked fine, so this was something about my compilation process not a bug in the code per se.

So, perplexed I went in search of an answer.  And it appeared to be something about the clang optimizer, about which I found this talk from CppCon 2016.

Where there's this example:

#include <cstdlib>
#include <cstdio>
int main ()
{
int* p = (int*)malloc(4); // The original buffer above
int* q = (int*)realloc(p, 4); // The new pointer to the same old buffer
// Assign a value through each pointer
*p = 1;
*q = 2;
if ( p == q )
{
printf("%d %d\n", *p, *q);
}
}

What do you expect this code to display?... Well, I expect it to print "2 2".  And it does on VC and G++ and even clang without the optimizer...





But optimize the build and it's wrong:


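Now, this is undefined behaviour: once realloc has been called the old pointer p must not be used again, even when it compares equal to q, so the optimizer is free to assume the two never refer to the same memory. It is very scary, not least as this was identified a while back (the talk alone is from 2016) and g++ happens to produce the output you'd expect... Eeeek.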
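One way to stay clear of this, sketched under the assumption that the buffers are plain int arrays: treat the old pointer as dead the moment realloc succeeds and only ever use the pointer it returns. ResizeBuffer is a hypothetical helper, not code from the project in question.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: resizes `buffer` to hold `count` ints.
// On success the caller's pointer is overwritten with the pointer
// realloc returned, so the stale value cannot be used by accident.
// On failure the original buffer is left untouched and still valid.
bool ResizeBuffer(int*& buffer, const std::size_t count)
{
    if (count == 0)
    {
        return false; // zero-size realloc is best avoided
    }

    void* resized = std::realloc(buffer, count * sizeof(int));
    if (resized == nullptr)
    {
        return false;
    }

    buffer = static_cast<int*>(resized);
    return true;
}
```

Because there is only ever one live pointer, there is no second name for the optimizer to reason about.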

Saturday, 21 July 2018

C++ : Coding Standards by Example #1 (with Boost Beast)

Today I've spent some time doing a slightly different video. Rather than just play about in code, whilst I played about I made a video of the time I had... Here it is...


We cover:


  • C++
  • Coding Standards
  • Structured Programming
  • Functional Programming
  • Encapsulation
  • Object Orientated Programming
  • Building the Boost Libraries (1.67.0)
  • Visual Studio 2017
  • Static Linking

And generally spend time seeing how I turn gobble-de-gook code into something usable and maintainable against my own Coding Standards.

Get the code yourself here: https://github.com/Xelous/boostBeast

Thursday, 19 July 2018

C++ : Copy Constructor versus Ignorance

I've spent a bit of time reviewing someone else's code recently and I've come to an impasse with them, so they have a lot of code which will take some standard container, and the code doesn't just initialise the local copy from the passed in reference... No it'll iterate over the list of elements adding them to the class version.

I have picked fault with this, it's not RAII, it looks ugly and if you're threading you can create your class instance and the member is empty or in a partially filled state before the loop within the constructor is finished... I highlight this in red below...

My solution?  Just initialise the member from the reference - see the green highlight in the code below.

My results from the timing below?



These times are "microseconds" so tiny... But with just constructing from the existing reference we always get a lower time, quicker code...


Running this test 30,000 times, trying it in different orders and with maps of up to 1000 elements, I had a rough average speed increase of 60% by using the copy constructor of std::map rather than reallocating new memory for each pre-existing element.

I wanted therefore to understand the reasoning behind the original code, was there some reason to perform a clone (alloc and assign new instance) operation for each pair in the map being passed in?  Looking at the code there was no apparent reason, so I spoke to the developer, asking why they had performed the construction in this manner... Their reply...

"That's how you initialise a container"

You can load up a container in this manner, but you already have the contents of the map, you're just copying it... "Why don't you use the copy constructor?"... I asked...

"I didn't write one"

Hmm, "You don't, the compiler generates it for you, std::map has its own copy constructor".  Use the copy constructor folks, trust me.


#include <map>
#include <chrono>
#include <string>
#include <iostream>


using Mapping = std::map<int, int>;

class A
{
private:
Mapping m_TheMapping;

public:

A() = delete;
A(const A&) = delete;
void operator=(const A&) = delete;

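// Copy construction (the "green" version): the member is
// initialised directly from the reference.
A(const Mapping& p_Mapping)
:
m_TheMapping(p_Mapping)
{
}

// Element-by-element construction (the "red" version): the member
// starts empty and is filled inside the constructor body.
A(const Mapping& p_Mapping, const bool& p_Other)
:
m_TheMapping()
{
for (auto i : p_Mapping)
{
m_TheMapping.emplace(i);
}
}

// Renamed from Mapping(): reusing the name of the Mapping type alias
// inside the class is rejected by some compilers (GCC, for one).
inline Mapping& GetMapping()
{
return m_TheMapping;
}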
};


int main()
{
Mapping l_m
{
{ 0, 0 },
{ 1, 1 },
{ 2, 2 },
{ 3, 3 }
};

auto l_time(std::chrono::high_resolution_clock::now());
// Copy Construction
A l_map(l_m);
auto l_timeB(std::chrono::high_resolution_clock::now());

auto l_Dur(l_timeB - l_time);
std::cout << std::chrono::duration_cast<std::chrono::microseconds>(l_Dur).count() << std::endl;


auto l_time2(std::chrono::high_resolution_clock::now());
// Clone Construction
A l_map2(l_m, true);
auto l_time2B(std::chrono::high_resolution_clock::now());

auto l_DurB(l_time2B - l_time2);
std::cout << std::chrono::duration_cast<std::chrono::microseconds>(l_DurB).count() << std::endl;
}
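A single run over four elements is very noisy, so the repeated runs mentioned earlier could be gathered with something like the sketch below. This is not my original harness; the function names are hypothetical, and it uses std::chrono::steady_clock rather than high_resolution_clock as a deliberate choice for interval timing.

```cpp
#include <chrono>
#include <cstddef>
#include <map>

using IntMap = std::map<int, int>;

// Copy-constructs `source` `runs` times and returns the total time in
// microseconds. `copied` accumulates the element counts so the copies
// are observably used and cannot simply be optimized away.
long long TimeCopyConstruct(const IntMap& source, const int runs, std::size_t& copied)
{
    const auto start = std::chrono::steady_clock::now();
    for (int run = 0; run < runs; ++run)
    {
        const IntMap copy(source);
        copied += copy.size();
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}

// Same measurement, but filling an empty map one element at a time.
long long TimeElementLoop(const IntMap& source, const int runs, std::size_t& copied)
{
    const auto start = std::chrono::steady_clock::now();
    for (int run = 0; run < runs; ++run)
    {
        IntMap copy;
        for (const auto& entry : source)
        {
            copy.emplace(entry);
        }
        copied += copy.size();
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

Dividing each total by the run count gives a per-construction average that is far steadier than one timed call.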

Monday, 26 March 2018

C++ : Pass-By-Reference Or Die

Before today's Post, I'm on a mission folks, to get 1000 subs on YouTube.  If only 5% of viewers here subscribed we'd have met this target in one month...



I've just had a group code review of one of my personal projects, and been rather surprised by the vitriol levelled at one of my practices.... Pass by Reference.

The reviewer, one of a group of peers, has had major issues with the project's (my personal) insistence on passing by reference wherever possible. In C++ this takes the form of an additional ampersand on parameter definitions; maybe this was the chap's problem, he has to type an ampersand?

So his problem?  Well, without the actual code we'll simplify and use the Compiler Explorer (from Godbolt.org) and we'll take up their basic square function example, it starts up thus:

Giving the assembler:


On the right, and this chap had taken time to prepare a whole slide show of functions, usually simple, and present them at this code review, showing this kind of thing.  His point... Well the very same C++ but with a pass by reference:


Turns up more lines of assembler:


He's got me right, right, I'm taking more time, I'm slowing everything down, by my not taking a copy of everything and using less memory I'm slowing things down....

This is where the sort of power play turned. I allowed him to present everything, I never interjected, never spoke, I allowed him to speak to the whole group.  We've hired a venue for this, we're meeting live for the first time.  This has to be good.... A couple of the chaps who can already see the fault in the complainer's logic were smirking, but we let him finish.

Triumphant, he has won the day, he will now carry the torch of coding standard gods...  WRONG.

I pulled over the presentation laptop, opened godbolt.org myself... Added the ampersand to the "num" and let it produce the above assembler... The chap was smirking completely from ear to ear, he knew he had me...

And then I typed three characters....

-O2

Yes, I told the compiler to optimize, and this happened...


Remarkably small code wouldn't you say?  I still haven't spoken, but I turn the laptop back to the presenter and just sit there.

There's a noticeable snigger from those in the know, older, wiser heads than my own I hasten to add.  But this young chap is now looking from me to the screen to the overhead projection and back with a mix of fury and complete puzzlement, he'd checked everything, he's dotted every j and crossed every t, he had me down pat, he wanted to usurp me.

Except, he's never ever been willing to listen, to learn or to experiment. "Code runs, that'll do" is very much his style (and Kyle if you're reading, yes I'm talking about you) but getting code to run is not enough, understanding the code you've written is often only just enough, but getting it to run everywhere, the same way, that's an art.  Debug, Release, Optimised, Unoptimised, automatically profiled, link database and continue, they're all subtly different.  Just listing one thing out, the only thing you've looked at, because it backs up your point of view, is not enough; you have to look around and see the holistic picture.

And optimised without a pass-by-reference?


Spookily similar code in this case, but oftentimes pass-by-reference is preferred, and using const-correctness is preferred because it communicates a meaning.

For instance in the "square" function above, how does the caller know that the parameter "num" is not altered in value?  How does the caller know it returns the new value only?  It could be returning an error status code and the parameter altered in value!  You don't know, but making the parameter const and a reference you start to communicate more firmly the intent of your code.
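The Compiler Explorer screenshots are not reproduced here, so as a rough sketch these are the two kinds of signature being compared. The body follows Compiler Explorer's basic square example; the name squareByRef is my own for illustration. The const reference version states in the signature that the caller's value cannot be modified.

```cpp
// Pass by value: the function works on its own copy of num.
int square(int num)
{
    return num * num;
}

// Pass by const reference: no copy is taken, and the const tells
// the caller that num will not be altered by the call.
int squareByRef(const int& num)
{
    return num * num;
}
```

Both return the same result; the difference is in what the signature promises to the caller.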

Thursday, 8 March 2018

Sex, Secret or God... Passwords

In the 1990s it was common to have to tell folks not to use "popular" passwords, like "sex", or "god", believe it or not even "password" and "secret".  Since then times have moved on, folks have become very adept at using other characters in their passwords...

Unfortunately, this is (very seriously) what one of our IT bods here has just found on a machine:


Props to the user for mixing in some numbers, a word and a symbol, however... We can all see the flaw in their storing the password.


(Thanks to our IT Manager for letting me use his picture - it is a lovely left hand, I wonder if he does hand modelling?)

Wednesday, 31 January 2018

C++ boost::replace_all Slowed Me

I've just had myself into a complete frazzle, code yesterday was fast, rendering 120fps, the same code today was struggling to pass 11fps.  And it had me pulling my hair out, of which I have little left to spare.

The problem?

I had been processing strings for specific patterns and only replacing them, so a variable in my engine might have the value "SCORE" and it'd replace the players score value directly into the std::string instance for display.

I however decided I wanted to compound all these reserved words to allow any value to contain any replaced value and also contain formatting, so I could do something like "You have scored %SCORE% today" and it'd just place the number in place.

I turned to boost for this, boost::replace_all, to be specific, and I had about 45 pattern matches which would try to replace any instance of the string in place.

However, this function does not look ahead to check whether the pattern is present in the source string; it's in fact very slow.

So code:

const std::string l_Pattern("%SCORE%");
std::string l_Source("You have scored %SCORE% today");
.
.
.
boost::replace_all(
    l_Source,
    l_Pattern,
    Player::Instance().ScoreAsString());

Would result in very slow performance. My solution is to not perform the replace blindly... search for the pattern first:

if ( l_Source.find(l_Pattern) != std::string::npos )
{
    boost::replace_all(
        l_Source,
        l_Pattern,
        Player::Instance().ScoreAsString());
}
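For anyone without boost to hand, the same guard-then-replace idea can be sketched with std::string alone. This is a stand-in written for illustration, not boost's implementation, and ReplaceAll is a hypothetical name.

```cpp
#include <cstddef>
#include <string>

// Replaces every occurrence of `pattern` in `source` with `value`.
// The initial find acts as the cheap guard: when the pattern is
// absent the loop body never runs and the string is left untouched.
void ReplaceAll(std::string& source, const std::string& pattern, const std::string& value)
{
    if (pattern.empty())
    {
        return; // an empty pattern would match everywhere
    }

    std::size_t position = source.find(pattern);
    while (position != std::string::npos)
    {
        source.replace(position, pattern.size(), value);
        // Resume after the inserted text so it is never rescanned.
        position = source.find(pattern, position + value.size());
    }
}
```

With around 45 patterns per string, most calls take the early path where find returns npos and nothing else happens.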

This latter code runs so much more quickly, I'm far and away back ahead of the speed curve, but I have this lovely dynamic placing of the variables into my rendering controls, and any control can receive any value just by my tweaking the loaded display script, so neat....

Anyway, I hope that helps.  If you want to help me, pop over to my YouTube channel and hit that subscribe button.

Friday, 5 January 2018

Bad Code...

You know when you open someone's code and you find this inside a class function...

unsigned short currentMessageOffset = this->CurrentMessageOffset;

That you're in for a rough ride.

First of all, WHY assign the member value on the right to a local?  Nowhere else in the class is this value edited or read, so there's no need to assign it to anything.

And then the naming, this naming of something local with the same name, and it is the same name, despite the capital leading letter, it's the same name, just makes this utterly useless.

All this before even mentioning that the code is written as an assignment rather than a direct initialisation, so annoying.

Sunday, 31 December 2017

Using Flash Drives in ZFS Mirror

This post comes from an idea I had to allow me to easily carry a ZFS mirror away from a site and back again. We didn't need much space, only 5 GB, but it had to be mirrored in triplicate: one copy to stay locally, one going into a fire safe on site and the third to be carried by the IT manager off-site each evening.

The trouble?  A near zero budget, so for a little over £45 we have a 14GB ZFS mirrored pool, across three 16 GB USB Flash drives and one three port USB 3.0 hub.

It was perfect for the task at hand, extremely portable, and cheap, I thought the same approach may help anyone trying to get to learn a little more about ZFS, a student or even someone using a laptop as a small office server - as the laptop literally has its own battery back-up system built in!

It's not the fastest solution, it's in fact extremely slow, but as an entry step it's perfect.

See the full video below; the commands I list here are the ones in use throughout...



Commands:

Listing Disks by ID...

ls /dev/disk/by-id

Listing Disks to a file for use in a file script as you see me using...

ls /dev/disk/by-id -1 > disks.txt

------------------

To install ZFS on Debian/Ubuntu linux:

sudo apt-get install zfsutils-linux

------------------

To remove & purge ZFS from your system:

sudo apt-get purge zfsutils-linux

(and you will be left with "/etc/zfs/zpool.cache" to remove or back up yourself).

------------------

Command to create the pool...

sudo zpool create <Name> mirror <DiskId1> <DiskId2> etc...

The name we had here was "tank", if you already have data on these disks you will need to add "-f" to force this change through.

------------------

Command to make a file executable - like our sh script:

sudo chmod +x <filename>

------------------

Zpool Commands:

sudo zpool status

sudo zpool import <name>

sudo zpool scrub <name>

sudo zpool clear <name>


You will want to "import" if you completely remove ZFS or move one of your sticks to a new machine etc, simply insert the disk and import the pool by name.

Scrub will be used whenever you return a disk to the pool, remember the point here is to allow you to replicate the data across the three sticks and be able to remove one or two to safe keeping, be that an overnight fire safe, or taking a physical copy with oneself.

Clear is used to remove any errors such as the Pool becoming locked out for writing - which it may if a drive, or all drives are removed - you simply clear the current problem with any pool.


Summary:  Remember this is NOT the optimum way to run ZFS, this is actually extremely slow, you are replicating each write over your USB, you can only cache so much in the RAM, but it is not a performance piece, this is about ensuring one replicates data for safe keeping. A small office or your dorm room server setup could be completely provided by a laptop in this manner: it has its own battery backup, it is quiet (if you get the right machine) and really this is a very cheap way to play with ZFS before you move onto other bigger hardware options.  Plus, I find the best way to learn about technology is to break it, even a little, and so constantly breaking down your pools by pulling USB sticks out of them is an excellent opener to recovering your pools.  Play about first, don't put anything critical on there until you're really happy with the results.

For an excellent post covering creating ZFS pools, check out Programster's post here: http://blog.programster.org/zfs-create-disk-pools

And for the official ZFS documentation you can check things out with Oracle here: https://docs.oracle.com/cd/E26505_01/html/E37384/toc.html


Oh, and Happy New Year... I guess I made it to 100 posts this year...

Wednesday, 27 December 2017

C/C++ Stop Misusing inline.... PLEASE!

This is a plea, from the bottom of my rotten black heart, please... Please... PLEASE stop misusing the inline directive in your C and C++.

Now, I can't blame you for this, I remember back in the 90's being actually taught (at degree level) "use inline to make a function faster", and this old lie still bites today.

inline does not make your function faster, it simply asks the compiler to insert "inline" another copy of the same code wherever you call it, so this code:

#include <iostream>

inline void Hello()
{
    std::cout << "Hello";
}

int main ()
{
    Hello();
    Hello();
    Hello();
}

Turns into the effective output code of:

int main ()
{
    std::cout << "Hello";
    std::cout << "Hello";
    std::cout << "Hello";
}

What does this mean in practice?  Well, you save yourself a CALL into the function, and the position on the stack holding the return address, and the RET from the function which pops the address off the stack and returns to the caller.

This is WHY people were told to use inline to make things faster in the 90's. I was taught this when I had a system with around 254K of working RAM for the programs I was writing; saving that space on an 8K stack was important in complex systems, especially if you were nesting loops of calls.

However, today, on a modern processor, even modern embedded processors, DO NOT DO THIS!

You're no longer saving anything, you're in fact making your code bigger and slower, as suddenly your program expands in size and you are having to fetch more and more from the slower RAM rather than the program instructions fitting into the faster cache layers.

As you get cache misses you fetch more, and the processor stalls while it waits, literally halting your program in its tracks as it suddenly has to go load the N'th of possibly thousands of repeated stanzas of code.

Don't do this, don't lumber yourself, let the compiler handle its own optimizations, they're pretty good at it!

Now some of you will be saying "yeah, no shit Xel, what's your point?"... My point is I recently had around 4000 lines of code handed to me, a huge long listing, and around 40% of it was a series of functions.  This whole thing could compile down to around 62K.... But when compiled it was just over 113K... This was too big to fit into the memory of the micro-controller it was for.

The developer had been working merrily over the yule tide, happy and satisfied their code would work, they went to work this morning and instead of running the code on the IDE within an emulator, they actually ran it on the metal.

It crashed, and they couldn't figure out why, the size was why.

And then they couldn't work out why the code was so big... It is tiny code.

They came, cap in hand, to myself - and I took no small satisfaction in rolling my eyes and telling them to remove the "inline" from EVERY function... "But it'll run so slowly" they decried... "REMOVE THE INLINE".

Of course it works, they have the system fitting into the micro-controller RAM, the stack is working a lot harder, their code is a lot smaller, and they are now in possession of a more balanced opinion on "inline".

* EDIT *

One person, yes hello Hank, asked me "why": why was this not a problem on the emulator, but a problem on the bare metal? Well, the bare metal was using a different compiler than the pseudo compiler for the Windows based IDE. The Windows based IDE was actually running the code through a compiler which ignored "inline", and so produced code a little like this:

(Image Courtesy "CompilerExplorer")

You can see that even though "int square(int)" is marked "inline" it contains the push to the stack and the "pop ret" pairing, and making it a call from main results in two function calls to the same assembler.

The bare metal compiler did not, an undocumented difference I might add.
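The difference between the two compilers makes sense once you remember that, in standard C++, inline is only a request as far as call-site expansion goes. A small sketch, with hypothetical function names, of the two spellings:

```cpp
// Plain function: with optimization enabled the compiler is still
// free to expand calls to this in place if it judges that worthwhile.
int SquarePlain(const int value)
{
    return value * value;
}

// Marked inline: a hint for call-site expansion that a compiler may
// follow or ignore, plus permission for this definition to appear
// in several translation units (for example, in a header).
inline int SquareHinted(const int value)
{
    return value * value;
}
```

Both behave identically to the caller; whether either one is actually expanded at the call site is the compiler's decision, which is why the same source came out at two very different sizes.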

Saturday, 16 December 2017

C++ : The unrandom random number...

I've been working in some C++, with boost to be precise. The machine I'm working towards finally has a processor with SSE3 in it, and so I've been to revisit the GUID generation code; boost specifies a couple of defines you can set up before including the uuids header to help...

#include <iostream>
#ifndef BOOST_UUID_USE_SSE3
#define BOOST_UUID_USE_SSE3
#endif
#include <boost/uuid/uuid.hpp>
#include <boost/uuid/uuid_io.hpp>
#include <boost/uuid/uuid_generators.hpp>
#include <boost/lexical_cast.hpp>

const std::string GetGuid();

int main ()
{
for (unsigned int i(0);
i < 100000;
++i)
{
std::cout << GetGuid() << "\r\n";
}
}


const std::string GetGuid()
{
boost::uuids::uuid l_guid =
boost::uuids::random_generator()();
return boost::lexical_cast<std::string>(l_guid);
}

This code looks fairly innocuous, "GetGuid" is the key part, you may argue that you're always setting the random number generator up, each and every call, but the output is fairly simple, this is only a test....




However, if we look carefully, there is always one column the same, when running on the screen this is very obvious...


Generating hundreds of thousands, taking minutes and minutes hasn't changed that one character.

Why this is isn't clear to me, I need to do some more digging.  I'm going to hazard a guess it's that we construct and release the number generator each pass; we should perhaps instantiate one and keep it, so the sequence of randomness is preserved.
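One possible explanation, offered as an assumption since the screenshots are not reproduced here: random (version 4) UUIDs as laid out in RFC 4122 always carry the version digit 4 in the same text position, however good the generator is, and the digit that starts the fourth group is restricted to 8, 9, a or b. The sketch below uses only the standard library rather than boost, and MakeVersion4Uuid is a hypothetical helper written to show where those fixed bits land.

```cpp
#include <cstdint>
#include <cstdio>
#include <random>
#include <string>

// Hypothetical helper: builds a version 4 UUID string from 128 random
// bits, then stamps in the fixed version and variant bits.
std::string MakeVersion4Uuid(std::mt19937_64& engine)
{
    std::uint64_t high = engine();
    std::uint64_t low = engine();

    // Version: the top nibble of the third group is always 4.
    high = (high & 0xFFFFFFFFFFFF0FFFULL) | 0x0000000000004000ULL;
    // Variant: the top two bits of the fourth group are always 1 then 0.
    low = (low & 0x3FFFFFFFFFFFFFFFULL) | 0x8000000000000000ULL;

    char text[37] = {};
    std::snprintf(text, sizeof(text), "%08x-%04x-%04x-%04x-%012llx",
        static_cast<unsigned int>(high >> 32),
        static_cast<unsigned int>((high >> 16) & 0xFFFFU),
        static_cast<unsigned int>(high & 0xFFFFU),
        static_cast<unsigned int>(low >> 48),
        static_cast<unsigned long long>(low & 0xFFFFFFFFFFFFULL));
    return std::string(text);
}
```

If the unchanging column in the output is the first character after the second hyphen, that is the version digit and not a fault in the randomness. Keeping one generator alive between calls is still worthwhile for speed.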

Any suggestions?  Hit the comments below!




P.S. Yes, I know I've not used RAII with the l_guid assignment there, but I'm in a hurry and only just noticed.