Friday, 15 February 2019

Coding Standards - About Scopes (again)

Consistency is good, consistency is light, consistency is really what you need to think about when creating code.  No matter what format, language, platform or project you're on, do things the same way; even on your first pass, do them your way consistently before you apply any locally demanded view of the code....

What do I mean by view?  Well, I'm talking about coding standards, but the non-functional side of things, hence the view.  You can look at code from any angle you choose; indeed there are multiple ways to achieve the same solution within code.  The difficulty is to get things out there, put them down, then sort the wheat from the chaff.

I've done this in a couple of videos, where I move pieces of code from their own coding layout to my own.  I have labelled these "coding standards", but in retrospect they're just the coding view.

One of my big points is the notation for the scope of variables; I love to use scoped notation.  I was brought into programming when Hungarian notation was not only taught to you academically as required, but demanded by every project you worked on (basically because you had no code-completion tools, so knowing the type of a named variable wherever you used it was really very useful).  But that was twenty-plus years ago; now we have code completion, and I strongly feel your tools should be made to work for you.

Therefore, when I complete a call or variable name, I want that to communicate something to me, and the most useful thing is the scope; the code completion tells me the type and the name, but scoping isn't really covered by the tooling (please post below any suggestions for Visual Studio, Atom, CodeBlocks or Visual Studio Code which perform scope notation or checking on the fly).

When I complete my typing and pause I want to see all the m_ members together, all the c_ constants, all the s_ statics, and many coding standards documents define these three.  However, that just doesn't go far enough: what happens when you have locals named 'angle', 'view' and 'Gregorian'?  They're at very different points within the code-completion window which pops up; you have to take your hands off the keyboard to begin scrolling through them, or worse still start typing letters in order to guess the name (for instance we may start to type a 'c' to see if 'Gregorian' actually comes up as 'calendar').

My solution?  The solution I've been pushing time after time, post after post, video after video?  Scope more things.

We can add "local" as "l_" and parameters as "p_"; we can compound them, so a static constant may be "sc_", and we can tailor this in whichever manner we feel fits best, but be consistent.
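
As a rough sketch of how that reads in practice (the names here are invented purely for illustration):

#include <string>

const int c_MaxRetries(3);                  // c_ : constant
static int s_InstanceCount(0);              // s_ : static
static const int sc_DefaultPort(8080);      // sc_ : static constant (compounded)

class Connection
{
public:
    bool Open(const std::string& p_Host)    // p_ : parameter
    {
        bool l_result(false);               // l_ : local
        l_result = !p_Host.empty() && (m_Socket >= 0);
        return l_result;
    }

private:
    int m_Socket = -1;                      // m_ : member
};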

Let me be utterly clear, this is not Hungarian notation of old, we are not adding "i" to integers or "l" to longs, we are not defining the type within the name, we are simply defining the scope and helping the tools to help us by grouping the various scopes of variables together.

I find it incredibly hard to read documents which go halfway: defining members as "m_" (or just a leading m) is extremely common, ditto for constants and even statics; it's only with parameters that people get a little squeamish.  "p_" is sometimes reserved for pointer, but that's a style edging back towards Hungarian, notifying the reader of the type and use rather than the scope, therefore I push "p_" as parameter.  But I can also see arguments for "i_" and "o_" for parameters, which tell you whether it's an input or an output of the function.  If one wanted to be a pedant you could go so far as to stress this with "pi_" and "po_".

This isn't to say I'm for scoping everything; for example, in returning from a function:

int foo();

I will not define an "r_" as the return scope; I'm not interested in the semantics to that extent, and I don't feel code like this:

int foo()
{
    int r_result(0);

    // ... some code to set r_result

    return r_result;
}

is communicating anything useful, so in my code you will more often find:

int foo()
{
    int l_result(0);

    // ... some code to set l_result

    return l_result;
}

As the result remains a local until returned.

Which brings me back to my second bugbear of the month, one which I've already blogged about: returning early.  I'll let you remind yourselves of my opinions.




And just a note to the world at large: if you're coding "m_" for members but ALSO adding Hungarian notation, "m_iInteger", stop, stop it now... and yes, that includes when you're doing pointers, "int* m_pIntegerList"... argh, I want to pull my own arm off just to have something to throw at the screen when I see this one; it's a hybrid as annoying as the Alien in Alien Resurrection.


What's my solution?... Well the completely sane idea I use is "m_IntegerListPtr"... Yes, it's a pointer... I tell myself it's one, I'm not strange at all... Alright, maybe a little.


Sunday, 10 February 2019

The Lost Router

It's gone... Just gone... Puft... Up and gone... I remember it in the box, I remember packing it... Puft, gone, where is it?... Have you seen it?....

LOST


LinkSys WRT AC1900

Responds with the SSID "XelNet"

Likes to sit on the side serving internet pages to the house and the occasional belly rub....


It's just lost... I've come to this weekend's job of sorting out my network situation: I've finally gotten an aftermarket ADSL modem to replace the BT Home Hub, and I've got a powerline network adapter to carry the network upstairs to the wired office (rather than running wires through ceilings just yet).

I've gotten the hub for the garage and the hub for the office sorted... And all the smaller ethernet cables crimped to the perfect length to stop wires trailing all over under my desk and behind the TV.

But, then I go to find my router and it's just missing... I remember boxing it back in its own box on moving day, I remember doing that.... I remember it distinctly.

I think it went into my car, not the wife's... My car is now clearly empty, I emptied it... So I've been through EVERY box in the garage three times, been through the loft twice, I've been through every nook and cranny I can find...

My thoughts are that maybe, perhaps, it's at the in-laws' house; I've asked them to look and they just said "no"... But I think it's there by their PC and they don't realise the box which looks like theirs is mine.

So frustrating!


--------

Update: the wife came to go through EVERY box in the garage with me... Guess what... No, no... She didn't come up smug (I knew you'd think she would)... The thing is still LOST... GONE LOST VANEEEEEEEESHED.... Gah.

I think I have to go look at the in-laws.

Thursday, 31 January 2019

C++ Coding Standards: Don't Return Early

I've just had one of those mini-conversations which has me scratching my head: I asked a coworker whether "return early" should be part of the coding standard or not, as it was being discussed by other parties.

His response was an emphatic, slightly arrogant and touchily obtuse, "I have no time for anyone arguing for not returning early from a function you've discovered you should not be in".

I, however, don't take such a black-and-white view; I see the pros and cons of both approaches.  Importantly, not returning early from a function can be, and is, part of driving process-flow conversations, and it aids in picking out structure which can be encapsulated away.  The result being that instead of instantly exiting and potentially leaving spaghetti code, you can see a flow, see where branching is occurring and deal with it...

But, that said, it must be understood that "returning early" is often seen as "faster"; as the chap argued, "I see no point being in a function you know you should already have left", so I took to Compiler Explorer... In a very trivial example:
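
Something along these lines (a sketch only; the exact body is unimportant, only the shape of an early return followed by the active code):

int Process(const int p_Value)
{
    int l_result(0);

    if ( p_Value <= 0 )
    {
        return l_result;
    }

    l_result = p_Value * 2;

    return l_result;
}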

 

Here we see a decision being made and either a return or the active code; easy to read, simple, trivial... But don't dismiss it too early: is it the neatest code?

Well, we could go for fewer lines of code, which generates near-identical assembly, thus:
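
Again, only as a sketch:

int Process(const int p_Value)
{
    int l_result(0);

    if ( !(p_Value <= 0) )
    {
        l_result = p_Value * 2;
    }

    return l_result;
}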


This is certainly fewer lines of C++; however, inverting the test is a little messy and could easily be overlooked in our code or in review.  Better might therefore be expanding the test to its true opposite:
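
And as a sketch of the true opposite:

int Process(const int p_Value)
{
    int l_result(0);

    if ( p_Value > 0 )
    {
        l_result = p_Value * 2;
    }

    return l_result;
}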


One could argue the code is a little neater to read without the inverting; critically though, it has made no difference to the assembly output, it's identical.

And it is identical in all three cases.

You could argue therefore that "returning early has no bearing on the functionality of the code", but that's too simplistic, because "not returning early" also has no bearing on the functionality of the code.  The test, the action and the operation have been worked out by the magic of the compiler to be the same in each case.

So with equivalent operation and code generation we need to think about what returning from the function early did affect: well, it made the first version longer (yes, this is partly an artefact of our brace style, but it was longer).  You could also see that there were two things to read and take in; the test was separate from the code, and importantly for me the test and the actual functional code were at the same horizontal indentation.  Being at that same alignment means your eye doesn't register that a decision has been made.

Putting the test and the action of that test into the inner bracing communicates clearly that a decision has been made and our code has branched.

And when that structure is apparent you can think about what not returning early has done: it has literally braced around the stanza of code to run after the decision, and that packet of code could be spun off into its own function, so you start to think about encapsulation.  Of course you can think about the same thing after a return, but in my opinion having the content of the braces to work within is of benefit, and returning early most certainly does not afford any speed benefit.

Let's look at a more convoluted, but no less trivial, example: a function to load a file in its entirety and return it as a known-sized buffer... We'll start with a few common types:

#include <memory>
#include <cstdint>
#include <string>
#include <cstdio>
#include <optional>
#include <sys/stat.h>
#include <sys/types.h>
using byte = unsigned char;
using DataBlock = std::shared_ptr<byte>;
using Buffer = std::pair<uint32_t, DataBlock>;
using LoadResult = std::optional<Buffer>;

And we'll assume there's a function to get the file size, with stat...

const uint32_t GetFileSize(const std::string& p_Path)
{
    uint32_t l_result(0);

    struct stat l_FileStat;
    if ( stat(p_Path.c_str(), &l_FileStat) == 0)
    {
        l_result = l_FileStat.st_size;
    }

    return l_result;
}

Now, this function is return-path safe because I define the result as zero and always get to that one return; I could have written it thus:

const uint32_t GetFileSizeOther(const std::string& p_Path)
{    
    struct stat l_FileStat;
    if ( stat(p_Path.c_str(), &l_FileStat) != 0)
    {
        return 0;
    }

    return l_FileStat.st_size;
}

But I don't see the benefit; both versions generate a value which is returned, except in the latter you have two code points defining it.  If anything I would argue you lose the ability to debug "l_result" that the first version gives you; you can inspect it before, during and after operating upon it... Whereas in the latter you don't know the return value until the return itself.  And again, in both cases the assembly produced is identical.

So, the load function, how could it be written with returning as soon as you see a problem?... Well, how about this?

LoadResult LoadFileReturning(const std::string& p_Path)
{
    LoadResult l_result;

    if ( p_Path.empty() )
    {
        return l_result;
    }
     
    auto l_FileSize(GetFileSize(p_Path));
    if ( l_FileSize == 0 )
    {                     
        return l_result;
    }

    FILE* l_file (fopen(p_Path.c_str(), "r+"));
    if ( !l_file )
    {
        return l_result;
    }

    // NB: allocate l_FileSize bytes; make_shared<byte>(n) would only allocate a single byte
    Buffer l_temp { l_FileSize, DataBlock(new byte[l_FileSize], std::default_delete<byte[]>()) };
    if ( !l_temp.second )
    {
        fclose(l_file);
        return l_result;
    }

    auto l_ReadBytes(
        fread(
            l_temp.second.get(),
            1,
            l_FileSize,
            l_file));

    if ( l_ReadBytes != l_FileSize )
    {
        fclose(l_file);
        return l_result;
    }

    l_result.emplace(l_temp);

    fclose(l_file); 

    return l_result;
}

We have six different places (count them; the orange highlights) where we return the result, as we test the parameter, check the file size and then open and read the file, all before we get to the meat of the function, which is to set up the return content after a successful read.  We have three points (the red highlights) where we must remember, and maintain the need, to close the file upon a problem.  This duplication of effort, and the dispersal of what can be critical operations (like remembering to close a file) throughout your code flow, is a problem down the line.

I know only too well that I've forgotten and missed things like this; reducing the possible points of failure in your code is important, and my personal preference not to return from a function early is one such methodology.

Besides the duplicated points of failure, I also found the code not to be communicating well; it's 44 lines of code.  Better communication comes from code like this:

LoadResult LoadFile(const std::string& p_Path)
{
    LoadResult l_result;

    if ( !p_Path.empty() )
    {
        auto l_FileSize(GetFileSize(p_Path));
        if ( l_FileSize > 0 )
        {                        
            FILE* l_file (fopen(p_Path.c_str(), "r+"));
            if ( l_file )
            {
                // NB: allocate l_FileSize bytes; make_shared<byte>(n) would only allocate a single byte
                Buffer l_temp { l_FileSize, DataBlock(new byte[l_FileSize], std::default_delete<byte[]>()) };
                if ( l_temp.second )
                {
                    auto l_ReadBytes(
                        fread(
                            l_temp.second.get(),
                            1,
                            l_FileSize,
                            l_file));

                    if ( l_ReadBytes == l_FileSize )
                    {
                        l_result.emplace(l_temp);
                    }
                }                

                fclose(l_file);
            }
        }
    }

    return l_result;
}

This time we have 33 lines of code; we can see the stanzas of code indenting into the functionality, and at each point the same decisions take us back out to the single return.  Once we've successfully opened the file there is one place (for all following decisions) where it's closed, and ultimately we can identify the success conditions easily.

I've heard this described as forward-positive code: you look for success and always cope with failure, whilst the former only copes with failure as it appears.

I prefer the latter.  Ultimately it comes down to personal choice; some folks argue indenting like this is bad, but I've yet to hear why when the compiled object code is the same, you are communicating so much more fluently, and you have fewer points of possible human error in the code to deal with and maintain.

From the latter we could pick out reused code, and we can target logging or performance metrics more directly on specific stanzas within their local scopes, instead of everything being against the left-hand indent line.
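
For example, the innermost read stanza above is already itching to become its own little function; a sketch, reusing the types from earlier in this post:

bool ReadWholeFile(FILE* p_File, Buffer& p_Buffer)
{
    bool l_result(false);

    if ( p_File && p_Buffer.second )
    {
        auto l_ReadBytes(
            fread(
                p_Buffer.second.get(),
                1,
                p_Buffer.first,
                p_File));

        l_result = ( l_ReadBytes == p_Buffer.first );
    }

    return l_result;
}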

Now, I will happily tell you it hasn't been a comfortable ride to my thoughts on this topic.  I started programming on a tiny screen (40x20 characters, Atari ST low-res GEM), and when I moved to DOS, with its 80-character widths, I felt spoiled.  Now we have tracts of huge screen space, so arguing you need to stay on the left is moot; you don't, so use your screen, make your code communicate its meaning, use the indents.

And yes, before you ask, I was doing things this way long before I started to use Python.

Sunday, 20 January 2019

C++: Undefined behaviour from realloc and the Clang Optimizer

I was wandering around in some system code and found a strange behaviour between two modes in a piece of code.  Where the code switches from double- to triple-buffer mode there's a problem: we've already got two buffers of everything and we just want to allocate another, but the underlying structure for sets of buffers wants to own them all... So the code went from:

struct SetOfBuffers
{
    Buffer* one;
    Buffer* two;
};

To:

struct SetOfBuffers
{
    Buffer* one;
    Buffer* two;
    Buffer* three;
};

Creating the third buffer is fine:

SetOfBuffers::three = malloc(X);

But the first two are a realloc, to reuse the existing buffers:

SetOfBuffers::one = realloc(OldSet::one, X);
SetOfBuffers::two = realloc(OldSet::two, X);

The problem?  I'd start to modify the values in the new set of buffers, with the third buffer being used and presented.  Then the first buffer would be changed and presented... then the second buffer changed and presented, and the information was wrong (I over-simplify massively here).

Anyway, I was remotely SSH'd into my server for this, so I went to Visual Studio, same code... Worked fine... So I go into my local VM and it's fine too, so I went back to the server and compiled manually and suddenly it's fine too.... WTF.

I literally spent an hour looking at this.  The problem?  Well, it appeared to be a bug in Clang; the reason the problem disappeared was that my Makefile contains a $CC variable for the compiler to use, and it was "clang", whereas when I built by hand I used "g++".  Worse still, if I switched to a clang debug build the code worked fine, so this was something about my compilation process, not a bug in the code per se.

So, perplexed I went in search of an answer.  And it appeared to be something about the clang optimizer, about which I found this talk from CppCon 2016.

Where there's this example:

#include <cstdlib>
#include <cstdio>

int main()
{
    int* p = (int*)malloc(4);     // The original buffer above
    int* q = (int*)realloc(p, 4); // The new pointer to the same old buffer

    // Assign a value through each pointer
    *p = 1;
    *q = 2;

    if ( p == q )
    {
        printf("%d %d\n", *p, *q);
    }
}

What do you expect this code to display?... Well, I expect it to print "2 2".  And it does on VC and G++ and even clang without the optimizer...





But you optimize the compile and it's wrong: the same code built with clang and the optimizer enabled prints "1 2", even though the two pointers compare equal.


Now, this is undefined behaviour: once the realloc succeeds the old pointer is invalid, so reading and writing through p afterwards is what gives the optimizer licence to do this, and it's very scary.  Not least as this was identified a while back (the talk alone is from 2016) and g++ simply chooses not to exploit it... Eeeek.
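
For what it's worth, the way to stay on the right side of the standard is to treat the old pointer as dead the moment realloc succeeds; a rough sketch of the pattern (not from the talk, purely illustrative):

#include <cstdlib>
#include <cstdio>

int main()
{
    int* p = (int*)malloc(sizeof(int));
    if ( !p )
    {
        return 1;
    }

    int* q = (int*)realloc(p, sizeof(int));
    if ( !q )
    {
        free(p);    // realloc failed; the old block is still ours to free
        return 1;
    }

    p = q;          // from here on only the pointer realloc handed back is used
    *p = 2;
    printf("%d\n", *p);

    free(p);
    return 0;
}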

Friday, 18 January 2019

C++: The Dilemma of using Exceptions

As a C++ developer I've recently been in the midst of the Exceptions debate.  Exceptions as a feature of the C++ language have long divided the community; naively, I've often put this down to people coming to the language from elsewhere (C perhaps), where exceptions were not the norm.

This was a mistake, for in the projects I've previously worked upon the ramifications of an error resulting in a crash have often totally outweighed the overhead of an exception, making the decision to employ an exception exceptionally easy (*bum bum*).  Exceptions were the norm, still are, and should be for you too (according to the C++ Core Guidelines https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#Ri-except).

The difficulty, the dilemma if you will, comes not from the feature but from the cost.  If your application domain demands high performance you cannot risk using exceptions; the argument then becomes whether, in not using them to avoid that performance overhead, you are actually cutting off your nose to spite your face... What are the downstream costs if an unexpected, untested-for error occurs and some strange behaviour ensues, or worse, propagates up and out of your application to crash?!

In the modern era of social media such feedback is near instantaneous, can you afford the egg on face?

And so you spin yourself around and dig yourself, and your problems, further into the ground.

Voices from industry do little to alleviate the problem; Google, for instance (https://google.github.io/styleguide/cppguide.html#Exceptions), almost seem to advocate using exceptions if you can build them in from the ground up, standing the cost of retraining developers and changing your ways of working (e.g. embracing RAII, of which there's a sketch a little further down).

They even go so far as to state "the benefits of exceptions outweigh the costs", but then add the caveat "in new projects".  How old or new your project is should not define your error-handling mechanisms, surely?  Plus, from experience, I've always found a code base is as old or new as the minds working upon it.  If you have an engineer firmly rooted in C without exceptions, they will continue to churn code in that style through your compiler (which will happily accept it); not because it's a better or a worse way of working, but quite often simply under the inertia of that developer's experience, because they've no time to stop and take stock.

Even given that time, which the project I've just begun working upon has, there's weight behind the past: the old way of working, what worked before, and if that was non-exception-handling code, people stick with it.  Tell those selfsame people "you really should use these, oh, and you have to retrain your minds, plus it'll take longer to compile and more time to run" and they're going to stare at you like you're mental.
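
Just to be clear about what "embrace RAII" buys you in this context, here is a rough sketch of my own (purely illustrative, nothing to do with Google's guide): wrap the resource so that stack unwinding cleans it up, and a throw anywhere downstream can no longer leak it.

#include <cstdio>
#include <stdexcept>
#include <string>

// A minimal RAII wrapper around a C FILE*; once this exists, an exception
// thrown anywhere after the open cannot leak the handle, because stack
// unwinding runs the destructor.
class File
{
public:
    explicit File(const std::string& p_Path)
        : m_Handle(fopen(p_Path.c_str(), "rb"))
    {
        if ( !m_Handle )
        {
            throw std::runtime_error("Failed to open " + p_Path);
        }
    }

    ~File()
    {
        fclose(m_Handle);
    }

    File(const File&) = delete;
    File& operator=(const File&) = delete;

    FILE* Get() const { return m_Handle; }

private:
    FILE* m_Handle;
};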


The question I'm left with is therefore not whether to use exceptions or not (well, I am left with that problem, and this post is no help); instead we're left with the question of when the benefits of exceptions (their ease of use, their simplicity to write and visualise) will outweigh the retraining, compile-time and runtime costs... I don't think that's anytime soon; it should be, after all these years, but it simply isn't.


Tuesday, 15 January 2019

Cross Compiling Clarity

I have an important distinction to make clear... Cross Compiling is using a machine with a different processor type than the target machine for your program.

So, you have a PC with an Intel Core 2 Duo in it, and you use a compiler to output an ARM executable, that is cross compiling.
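
For example, on a typical Linux box with an ARM cross toolchain installed (the package names below are the usual Debian/Ubuntu ones, so treat them as an assumption):

# native compile, targeting the machine you're sat at
g++ -o hello hello.cpp

# cross compile, using the ARM toolchain from the g++-arm-linux-gnueabihf package
arm-linux-gnueabihf-g++ -o hello-arm hello.cpp

# 'file' reports hello-arm as an ARM executable, despite being built on x86
file hello-arm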

If you're on an Intel Core Duo toting 15" MacBook Pro (from 2006) and you use that to produce the same ARM executable, that's still cross compiling.  BUT if you use that MacBook to generate a Windows executable it's not cross compiling: the processor is the same in both targets, you're not crossing over...

Some folks argue this differently: because you're crossing between the Windows and Apple OSs you're cross compiling.  In my book, and by the definition given elsewhere, you're not; in this case you are cross-platform (the platform being the OS), but you're not cross compiling because the processors are the same family.

That's all, I just wanted this out there, argue as you all see fit.


Sunday, 30 December 2018

Programmer Between Jobs...

Announcement time, I'm unemployed... Yes, as of the 28th December 2018 I have not been gainfully employed.  But fear not, I am starting my new job on Wednesday 2nd January.

I've covered this in a nattering brief on my YouTube, which I now link, but stick with this post for more information.


I've never really spoken directly about the job I've had and how it's evolved over the years I was with it; I was in that role for 14 years.  That's a very long time, especially in today's market, but as you all know I keep myself busy and informed on these very pages.

I did work in what's euphemistically called "the pay to play industry"; this is the gambling industry to the rest of the world, but saying that, I didn't write the actual games.  I was a systems engineer, so I wrote software which drove the money-accepting and paying hardware, and especially software which worked to store the current state of a machine, avoid frauds, and count, collate and communicate metrics to central servers or offline (USB) uploads.

And I did all this according to a set of quite strict legal guidelines and requirements, so we're not talking about the much media-covered "crack cocaine of gambling", the FOBT; we're talking at most about £1-stake machines, family game machines, question-and-answer style stuff.  The more cuddly end of the market, if you're asking me.

I am leaving that industry; all that knowledge of BACTA standards and other stuff has been slowly ebbing out of my pores and leaving me, as I actually resigned in early October and in late October (when it became public knowledge in the office that I was leaving) they placed me on gardening leave.  We'll gloss over that, as some folks got their righteous fury out and pointed it in my direction; needless to say, being on gardening leave this long has let the move happen seamlessly, but it will be a shock to return to early-morning rising to get into Nottingham City Centre... Plus, no more daytime telly... Oh thank god, no more daytime TV!

So, where am I going?  Well, I'm going to a company called The Multiplayer Guys; this is an actual game company, a computer game, console game crew, with close ties to an actual game house, Super Punk Games.

Now, I can't share anything about what I'm going to be working on, but I'm carrying my C++, Windows and Linux experience high on my shoulders.  I'm eager to learn more about the client-side structure of modern game engines and leverage my server-side and networking experience against the high-speed, low-latency demands of a modern game.

So, yes, I know what game I'm going to be working on, that's the bit I can't share with you, maybe over time, but right now, nada, nothing, no comment... Move along, move along.



P.S.  I got an e-mail at 16:42 that my Laptop shipped, from China (despite being told the Lenovo reseller was in the Republic of Ireland) by UPS Global... I'm wondering if it'll be here within the week now, I really am.