The Megalomaniac Bore: practices

Showing posts with label practices. Show all posts

Friday, 15 February 2019

Coding Standards - About Scopes (again)

Consistency is good, consistency is light, consistency is really what you need think about when creating code, no matter what format, language, platform or project you're on do things the same way, and even on your first pass do them your way consistently before you apply any locally demanded view of the code....

What do I mean by view? Well, I'm talking about coding standards, but the non-functional side of things, hence the view. You can look at code from any angle you choose, indeed there are multiple ways to achieve the same solution within code, the difficulty is to get things out there, put them down, then sort the wheat from the chaff.

I've done this in a couple of videos, where I move pieces of code from their own coding layout to my own, I have labelled these "coding standards", but in retrospect they've just the coding view.

One of my big points is the notation for the scope of variables, I love to use scoped notation. Having been brought into programming when Hungarian notation was not only taught to you academically as required, but demanded by all projects you worked on (basically as you had no code completion tools, so knowing the type of a named variable whenever you utilised it was really very useful) but that was twenty plus years ago, now we have code completion I strongly feel your tools should be made to work for you.

Therefore, when I complete a call or variable name, I want that to communicate something to me, and the most useful is to check the scope, the code completion tells me the type and the name, but scoping isn't really fully covered in the tooling (please post below any suggestions for Visual Studio, Atom, CodeBlocks or Visual Studio Code which perform scope notation or checking on the fly).

When I complete my typing and pause I want to see all the m_ members together, all the c_ constants all the s_ statics and so many coding standards documents define these three. However, it's just not far enough, what happens when you have locals named 'angle', 'view' and 'Gregorian'? They're at very different points within the code completion window which pops up, you have to remove your hands from the keyboard to begin scrolling through them; or worse still start typing letters in order to guess the name (for instance we may start to type a c to see if 'Gregorian' comes up actually as 'calendar').

My solution? The solution I've been pushing time after time, post after post, video after video? Scope more things.

We can add "local" as "l_", parameters as "p_", we can compound them, so a static constant maybe "sc_" and we can tailor this in whichever manner we feel fits best, but be consistent.

Let me be utterly clear, this is not Hungarian notation of old, we are not adding "i" to integers or "l" to longs, we are not defining the type within the name, we are simply defining the scope and helping the tools to help us by grouping the various scopes of variables together.

I find it incredibly hard to read documents which go half way, defining members as "m_" (or just a leading m) is extremely common, ditto for constants and even statics, it's only with parameters people get a little squeamish. "p_" is sometimes reserved for pointer, but that's a style edging back to Hungarian you're notifying the reader of the type and use rather than the scope, therefore I push "p_" as a parameter. But I can also see arguments for "i_" and "o_" for parameters, which tell you it's an input or output to the function. If one wanted to be a pedant you could go so far as to stress this with "pi_" and "pi_".

This isn't to say I'm for everything, for example in returning from a function:

int foo();

I will not define an "r_" as the return scope, I'm not interested in the semantics to that extent, and I don't feel code like this:

int foo()

{

int r_result(0);

... Some code to set r_result

return r_result;

}

Is communicating anything useful, so often in my code you will find:

int foo()

{

int l_result(0);

... Some code to set l_result

return l_result;

}

As the result remains a local until returned.

Which brings me back to my second bug bear of the month, one which I've already blogged about returning early. I'll let you remind yourselves of my opinions.

And just a note to the world at large, if you're coding "m_" for members, but ALSO add Hungarian notations "m_iInteger" stop, stop it now... and yes that includes when you're doing pointers "int* m_pIntegerList"... argh, I want to pull my own arm off just to have something to throw at the screen when I see this one, it's a hybrid as annoying as the Alien in Alien Resurrection.

What's my solution?... Well the completely sane idea I use is "m_IntegerListPtr"... Yes, it's a pointer... I tell myself it's one, I'm not strange at all... Alright, maybe a little.

Thursday, 31 January 2019

C++ Coding Standards: Don't Return Early

I've just had one of those mini-conversations which has be scratching my head, I asked a coworker about "return early" being part of the coding standard or not, as it was being discussed by other parties.

His response was an emphatic, slightly arrogant and touchily obtuse, "I have no time for anyone arguing for not returning early from a function you've discovered you should not be in".

I however don't take such a black and white view, I see the pro's and con's of both approaches, importantly not returning from a function can, and is part of, driving process flow conversations and it aids in picking out structure which can be encapsulated away. The result being that instead of instantly exiting and potentially leaving spaghetti code you can see a flow, see where branching is occurring and deal with it...

But, that said, it must be understood "returning early" is often seen as "faster", as the chap argued "I see no point being in a function you know you should already have left", so I took to compiler explorer... In a very trivial example:

Here we see a decision being made and either a return or the active code, easy to read, simple, trivial... But don't dismiss it too early, is it the neatest code?

Well, we could go for less lines of code which generates near identical assembly thus:

This is certainly less lines of C++ however, inverting the test is a little messy and could easily be overlooked in our code or in review, better might therefore be expanding the test to a true opposite:

One could argue the code is a little neater to read without the inverting, critically though it has made no difference to the assembly output, it's identical.

And it is identical in all three cases.

You could argue therefore that "returning early has no bearing on the functionality of the code", but that's too simplistic, because "not returning early" also has no bearing on the functionality of the code. The test and action and operation has been worked out by the magic of the compiler for us to be the same.

So with equivalent operation and generation we need think about what returning from the function early did affect, well it made the code on the left longer, yes this is an affectation of our coding with braces, but it was longer. You could also see that there were two things to read and take in, the test was separate to the code and importantly for me the test and the actual functional code were on the same indentation horizontal alignment. Being on that same alignment makes your eye not think a decision has been made.

Putting the test and the action of that test into the inner bracing communicates clearly that a decision has been made and our code has branched.

And when that structure is apparent you can think about what not returning early has done, well it's literally braced around the stanza of code to run after the decision, that packet of code could be spun off into it's own function do you start to think about encapsulation. Of course you can think about the same thing after the return, but in my opinion having the content of the braces to work within is of benefit and most certainly does not afford any speed benefits.

Lets look at a more convoluted, but no less trivial example, a function to load a file in its entirety and return it as a known sized buffer... we'll start with a few common types:

#include <memory>

#include <cstdint>

#include <string>

#include <cstdio>

#include <optional>

#include <sys/stat.h>

#include <sys/types.h>

using byte = unsigned char;

using DataBlock = std::shared_ptr<byte>;

using Buffer = std::pair<uint32_t, DataBlock>;

using LoadResult = std::optional<Buffer>;

And we'll assume there's a function to get the file size, with stat...

const uint32_t GetFileSize(const std::string& p_Path)

{

uint32_t l_result(0);

struct stat l_FileStat;

if ( stat(p_Path.c_str(), &l_FileStat) == 0)

{

l_result = l_FileStat.st_size;

}

return l_result;

}

Now, this file is return path safe because I define the result as zero and always get to that return, I could have written it thus:

const uint32_t GetFileSizeOther(const std::string& p_Path)

{

struct stat l_FileStat;

if ( stat(p_Path.c_str(), &l_FileStat) != 0)

{

return 0;

}

return l_FileStat.st_size;

}

But I don't see the benefit, both returns generate an lvalue which is returned, except in the latter you have two code points to define the value, if anything I would argue you loose the ability to debug "l_result" from the first version, you can debug it before, during and after operating upon it... Where as the latter, you don't know the return value until the return, which results in the allocate and return. And again in both cases the assembly produced is identical.

So, the load function, how could it be written with returning as soon as you see a problem?... Well, how about this?

LoadResult LoadFileReturning(const std::string& p_Path)

{

LoadResult l_result;

if ( p_Path.empty() )

{

return l_result;

}

auto l_FileSize(GetFileSize(p_Path));

if ( l_FileSize == 0 )

{

return l_result;

}

FILE* l_file (fopen(p_Path.c_str(), "r+"));

if ( !l_file )

{

return l_result;

}

Buffer l_temp { l_FileSize, std::make_shared<byte>(l_FileSize) };

if ( !l_temp.second )

{

fclose(l_file);

return l_result;

}

auto l_ReadBytes(

fread(

l_temp.second.get(),

l_FileSize,

l_file));

if ( l_ReadBytes != l_FileSize )

{

fclose(l_file);

return l_result;

}

l_result.emplace(l_temp);

fclose(l_file);

return l_result;

}

We have six, count them (orange highlights) different places where we return the result, as we test the parameters, check the file size and then open and read the file all before we get to the meat of the function which is to setup the return content after a successful read. We have three points where we must remember and maintain to close the file upon a problem (red highlights). This duplication of effort and dispersal of what could be critical operations (like remembering to close a file) throughout your code flow is a problem down the line.

I very much know I've forgotten and missed things like this, reducing the possible points of failure for your code is important and my personal preference to not return from a function early is one such methodology.

Besides the duplicated points of failure I also found the code to not be communicating well, its 44 lines of code, better communication comes from code thus:

LoadResult LoadFile(const std::string& p_Path)

{

LoadResult l_result;

if ( !p_Path.empty() )

{

auto l_FileSize(GetFileSize(p_Path));

if ( l_FileSize > 0 )

{

FILE* l_file (fopen(p_Path.c_str(), "r+"));

if ( l_file )

{

Buffer l_temp { l_FileSize, std::make_shared<byte>(l_FileSize) };

if ( l_temp.second )

{

auto l_ReadBytes(

fread(

l_temp.second.get(),

l_FileSize,

l_file));

if ( l_ReadBytes == l_FileSize )

{

l_result.emplace(l_temp);

}

fclose(l_file);

}

return l_result;

}

This time we have 33 lines of code, we can see the stanzas of code indenting into the functionality, at each point we have the same decisions taking us back out to always return. When we've had a successful open of the file we have one (for all following decisions) place where it's closed and ultimately we can identify the success conditions easily.

I've heard this described as forward positive code, you look for success and always cope with failure, whilst the former is only coping with failure as it appears.

I prefer the latter, ultimately it comes down to a personal choice, some folks argue indenting like this is bad, I've yet to hear why if the compiled object code is the same, you are communicating so much more fluently and have less points of possible human error in the code to deal with and maintain.

From the latter we could pick out reused code and we can target logging or performance metrics more directly on specific stanzas within their local scopes. Instead of everything being against the left hand indent line.

Now, I will happily tell you it hasn't been a comfortable ride to my thoughts on this topic, I started programming on a tiny screen (40x20 characters - Atari ST Low Res GEM and when I moved to DOS having 80 character widths I felt spoiled. Now we have tracts of huge screen space, arguing you need to stay on the left is moot, you don't, use your screen, make your code communicate its meaning, use the indents.

And yes, before you ask, I was doing things this way long before I started to use Python.

Wednesday, 24 August 2016

Story Time : Fantastic Rack Mount Mistakes #1

This is the first in a little series, which I'm afraid will have no nice pictures, but trust me, these are all true and if not first hand, at least second hand accounts from trusted sources...

You may or may not have seen a server room in your time, if not then they're basically rooms, generally locked, with at least air conditioning, generally a fire extinguisher nearby and all your company server assets inside.

When I started out in business the server room was a large cupboard with an air conditioner and the 386 based NCR mini-computer inside. It also had set of aluminium shelves, but it had no rack mounting, everything stood on the floor.

This was true for me, for a few years, the IBM AS400's we moved onto were huge machines on the floor, aside from the colour you felt like you were walking into the set of "Flight of the Navigator"

However, I soon moved over to Compaq Proliant machines and these were rack mounted... I walked into the server room, and took a look at the rack, only to be greeted by the UPS battery pack being at the top of the rack... Yes, the batteries, full of acid, were at the top...

I came out and went to speak to the nice chap in charge, pointing out, as gently as I could, that the batteries may leak, or if there was a fire could be ruptured, and it was not unknown for batteries to get hot swell & burst...

He stared at me, blinking, "What's wrong with that, they're in a case".

And he was damn right, they were... a Plastic case, this particular battery pack had no metal sheeting, no casing other made of metal, just metal corner brackets, it was essentially plastic...And full of acid.

In story number 2, you will hear about the server room fire I'd dealt with the previous year. But this chap didn't seem to register a problem with having the battery pack at the top of the rack.

I did however lodge my complaint in the service book, and went about my work. Months passed, and I didn't often go to this site, we moved all the other company assets to rack mounted ethernet switches, a few smaller rack mounted Pentium II based machines and all new desktops.

When one day, a user came to my office door and said they could not get onto the central server, checking my desktop serial terminal sure enough nothing. Checking a physical VT100 nothing... Oh no.

I call the other site, no answer...

I get a secretary to start calling them every five minutes and I start to check back through our logs, it seems from our end the link was active at 6:03am about two hours before business would ramp up during the day, and the last items were a set of data from the design department at a Derby based location.

So I call Derby, asking "Did your upload complete at 6am?" a little bit of silence and I finally get the right chap on the line, he'd come in very early to get a jump on the day and to the best of my recall he said "No, it suddenly stopped and I could not get back on".

I cut him off as the secretary suddenly pages me (yes, rocking the pager) and I pick up my internal phone to hear I've got the guy from the server site on the line, he's coughing... "What happened?" is my instant question.

"Halon was active when I arrived"...

This is pretty bad, the server room was protected by an automatic halon fire extinguisher, if it detects smoke or heat there is a 10 second alert, the door closes and halon is released...

Remember the lab scene in Terminator 2?...

A lot like that!... Staff in the room, or entering, need to use a breathing apparatus, which we'd all been trained on.

This guy was short of breath because he'd been in the halon room.

He says that when he arrived (at 7am) the room was in close down, the building fire alarm was going and a fire engine was already present... The security staff had let them in and they'd used breathing kit to enter the server room, to find.... And this was the best he could describe:

"A smouldering brown mess of plastic encasing the top of the Proliant server, stuck in all the front vents and solid into the quick release drive catches, all the network wires above are only copper and the battery pack is black"

I dive in a company car and drive up there, I arrive as the fire brigade are happy it's electrically isolated, that the halon is clear, we've got a guy from the fire suppression system coming to strut his stuff, but in the mean time the company is offline.

So, I set up a 10Mbit ethernet cable from the main input multiplexer, set up a backup system on an AST dual core machine (literally, it was a PC with two 486 CPU sockets) and 16mb of ram... power!

This was to be the tide over machine, we cycled the nightly backup onto that and at around noon, people were working, if extremely slowly. That machine was our stop gap for the next 48 hours.

So, now to the server room, initially the most obvious problem (because the rack had a glass door closed) was the ceiling and floor... The ceiling tiles were seemingly not flame proof... DOH.... They'd melted at least and there were droplets of melted whatever they were all over the place, blown around by the AC.

The AC itself had valiantly kept fanning the flames, so it had black soot marks neatly blown all down the wall.

And the floor directly around the cabinet base was charred, they were fibre carpet tiles on concrete, but had smouldered...

Our fire plan had fired, and the halon tanks were empty, both had fired, despite perhaps this only needing one... The vent out had worked and we'd used up one of our internal breathing kits, so two spares, one inside one outside.

The room is also electrically dead, nothing is on inside. We had redundant feeds in from sites, one inside the server room and the one outside down the corridor which the AST machine was now connected to.

We're using torches, so I opened the front door of the server rack, and it's just black, the glass is coated in black, the once cream coloured Compaqs are black, the silver rack arms are black, going up the patch panel above is black but you can see the wires look a lot thinner, reaching up they're charred, the plastic has sloughed off of them and later when I have a step ladder I see it's mostly pooled onto the top of the first computer, but it's also seeped into and set within the vents on top and spilled over into the front of the machine.

Above this is the crispy remains of the UPS, there is a hole about 2 by 3 inches around 8 inches from the left of the case bottom, and I can see into the UPS... It's a charred vision of hell.

We take this UPS out of the rack, as it is the ONLY thing in there that had been put in by the local chap, the ONLY things not on the service list, the ONLY thing I had had an issue with.... And we pop the lid, one battery had gone, catastrophically, it had melted acid out, the batteries lay on their side and so the corrosive acid had eaten the rubber off the outside of the wires inside, shorted and an electrical fire had ensued.

The battery acid had then, xenomorph style, eaten out of the bottom of the case and dripped onto the ethernet patch cables, they'd shorted, got hot, dripped more, the acid had also cut through the plastic on the power cables, more electrical shorts... And the gates of hell had opened swallowing the nice working machine.

The clean up was epic, we lifted the servers out to the cleaner area of the IT room, we cleaned the outside and had a pair of Compaq chaps come and clean them... I think they were called Mark and Jeff, and they did an excellent job.

I ordered new cable and thousands of RJ45 connectors and a crimper, setting about rewiring the room, in a better manner.

And I set the guy in charge of the site, Mr "It'll do" to taking out all the racking, jet washing it, putting it back in, redoing the power lines with a guy from the engineering department and I also had him remove all the carpet and the ceiling tiles, and find a supplier for a proper fireproof ceiling.

I also checked with the fire extinguisher guy, and he reckons that we were lucky the room was an hour outside of working hours, because the roof was just ceiling tiles and should never had been fitted with this strange fire system... Despite his being from the same company!... Basically most all the halon had exited via the ceiling tiles!

Two twenty hour days after all this I had one clean Compaq server, one clean rack and enough ethernet connections to put one patch panel and the server back online, this let our users stop moaning about how slow things were.

The other Compaq, the one closest the mess, was going off to be re-cased, the damage was so bad, but the unit itself worked.

We also now had a new fire safe, as the original had proven to not be smoke proof!

And I had a new certification on the walls being fire proof.

It took three months to recover from this, my end report to the owners, was that it'd cost about £8,500. Not a bad deal considering the whole rebuild of the whole room would have been £40K+

I'd instigated better carpet, better ceiling, better procedure, and I had the brand new branded, supported, service contracted UPS in the bottom of the rack.

The final wash down, was the UPS which had failed, was known to be bad, three weeks before it'd gone up, the site manager had had it out and replaced two of the four battery cells within with a pair he'd bought cheap, he'd done this himself, I was not even aware, and he'd had the bottle to charge the company twice for the batteries, pocketing the difference.

Embezzlement, bad attitude, bad procedures and simply his considering me below him, resulted in his dismissal soon after.

He was the first person I enjoyed firing, he was the first person I'd actually had to actively fire from a role for incompetence, and he was the last person I let dictate to me where kit in a rack sat.

All for the placement of a UPS in a rack...

(Paul, Andy, Max if you recognise this story, you know you were a big part of that clear up, but you were in London, Leicester and Morocco... I was up to my arm pits in that mess!)