Showing posts with label lessons. Show all posts

Friday, 4 September 2020

Great Rack Mount Mistakes #7

Today's story in the annals of problems in IT comes from a guest editor... Mr B.... And Mr B (no relation to anyone in other stories given monogram names) works as the sysadmin and developer for the whole set of systems at his employer; unfortunately this means "it's all his fault".

So what went wrong?  Well, overnight the site had a power cut and though they have a nice server, they don't have a power backup, so that server went off.

The server is essentially a Java host, specifically hosting Tomcat, and it reaches out to connect to a set of third-party endpoints via a RESTful API.

You'd think no big deal: start up, get running and keep running. Except that the third party don't force a disconnect upon a new edition of their interface API. If you're connected to version 1.0 then they will happily leave you connected to version 1.0, even if they release interim updates, add new calls and, as is quite what got Mr B today, remove a call or two.

Your session ending, and then I presume all sessions on that old version ending, would free their servers to de-allocate the old version.  But to force users to migrate upwards in the chain, their published API declaration (so think the endpoint here, in whatever flavour you wish) changes, such that you re-download it upon re-connection and that's your new flavour-of-the-month API.

The problem?  It didn't work.

So Mr B had to set about debugging this on the fly, in a live environment, which was down.  And he went through the three stages of technological grief....

1) Denial:    "This is completely illogical, my code brings down their interface, which is the only thing we connect to, it must be right, they can't mismatch them, so this must be my side or the gods are against me".

2) Investigation:  "Read the logs, make a change, nothing seems to work, the gods are definitely snickering behind that cloud of steam now".

3) Realisation:   "If it's not me, and it's not the system here, it must be their side, the huge multi-billion international must have published their API spec with a mistake or mismatch.... click.... YOU FUCKERS!"

What was the actual problem?  Well, the third party's published API spec was actually wrong: the downloaded specification still contained several calls which had been removed. When the services Mr B had written came up they checked each endpoint, found several calls defined which did not respond, and so his software, correctly, reported that the endpoint was offline.  They were; they didn't exist anymore.
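To make that failure concrete, here is a minimal sketch of the kind of start-up check described above. It is not Mr B's code (his runs on Java under Tomcat); the call names, the `probe` function and the spec contents are all made up for illustration.

```python
from typing import Callable, Dict, List


def check_spec(spec_calls: List[str], probe: Callable[[str], bool]) -> Dict[str, bool]:
    """Probe every call named in the spec and record whether it responded."""
    return {name: probe(name) for name in spec_calls}


def endpoint_online(results: Dict[str, bool]) -> bool:
    """The endpoint only counts as online if every declared call responded."""
    return all(results.values())


# Hypothetical data: the published spec still lists a call the provider removed.
published_spec = ["getQuote", "placeOrder", "legacyLookup"]
live_calls = {"getQuote", "placeOrder"}

# Stand-in for a real network probe: a call "responds" only if it still exists.
results = check_spec(published_spec, lambda name: name in live_calls)
missing = sorted(name for name, ok in results.items() if not ok)

print("online:", endpoint_online(results))
print("declared but not responding:", missing)
```

With a strict check like this, one stale entry in the spec is enough to mark the whole endpoint as down, which matches what the logs were telling him.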

His fix was to literally tell his stuff to ignore the multi-billion dollar international service provider's API spec and to "download" a copy which he hosted locally, with his own edits to it.
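The shape of that workaround might look something like the sketch below. Again this is only an illustration under assumptions: the spec is reduced to a plain list of call names, and `patch_spec` and the call names are hypothetical, not anything from the provider or from Mr B's system.

```python
from typing import Iterable, List


def patch_spec(published_calls: List[str], removed_calls: Iterable[str]) -> List[str]:
    """Return a local copy of the spec with the known-dead calls edited out."""
    removed = set(removed_calls)
    return [call for call in published_calls if call not in removed]


# Hypothetical data: what the provider publishes versus what is known to be gone.
published = ["getQuote", "placeOrder", "legacyLookup"]
known_removed = ["legacyLookup"]

# The locally hosted copy is what the services load instead of the remote spec.
local_spec = patch_spec(published, known_removed)
print(local_spec)
```

The trade-off is that the local copy now has to be kept in step by hand: when the provider next changes their API, nothing will re-download it for you.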

Now, he's a tiny fish in a huge pond here; even if he reports this mismatch, said multi-billion dollar international isn't going to hear him, and by the time they do it may be several months down the line, and other folks may have spotted this problem.  He may be listened to, but he essentially doubts his voice would be heard.

The problem of course being how to abate this issue in the future?  How to avoid this stress?  For at one point he did say "the company is done for", because literally everything was offline, all their services were down.... And of course everyone will blame the little guy doing all the IT, they won't think that the multi-billion behemoth entity could possibly publish a wonky API spec, most of those shouting at Mr B with mouths frothing wouldn't even know what he meant when he explained this to them...

The fact that he's identified this issue, resolved it, and everything is back up within two hours won't be remembered, the glass will remain half-empty, and so it'll only be remembered that on the 3rd September 2020 Mr B's IT suite went offline.

Wednesday, 8 November 2017

Software Development - All Areas Stagnation

It has been said by far bigger and better minds than mine that if you sit still, if you don't continue to learn about new things and innovate, you will stagnate.  This has been a huge problem looming within the business I work in: certain things have worked since the industry sector was conceived, and even though more than half a century has passed, it has largely passed the internals of this industry by.

That is until very recently, where market competition has sprung up, the market base itself has reduced and so pressure is on... Nowhere is this more apparent in my industry than on the software, the front-line of pushing product to customers.

The trouble however seems to be that many people have stagnated; they've stuck with the safe option, the tools which work off the shelf. I am of course talking about Windows. The entire tool chain that is used by 99% of the company is all Windows based, and I am the man on the spot waving the Linux flag.

But just a few days ago, the Windows world had to come to my desk and see their future. I had to show technically minded folks around the code of the new system, introduce them to my imposed coding standard and update them from Microsoft-specific Visual C++ thinking to thinking about platform-independent Standard C++ code... I had my work cut out for me.  I prepared the cleanest desktop environment I could (i3 on Ubuntu).


I didn't want to startle them, so the editor/environment is Visual Studio Code from Microsoft... They started to look at the system, its structure, how the code related to the design and the diagrams they already had, and we started to follow the process flow diagrams.

It was a success, certainly no-one burst into tears, they saw the kinship between this code on Linux and the systems they'd worked with for decades on Windows.

But then, the senior software manager leaned down, peering at the screen, and he said some fateful words...

"I've never seen that before".

Is he talking about some piece of C++14 or C++17, the lambdas, the autos, the shared_ptr... What technical bolt has he not screwed his nut around?


"That's very good, you can see the whole code layout.  I've never seen that before, who did you say wrote this tool?... Really Microsoft, I've never ever seen that before".

This chap uses Sublime, I've seen him using Sublime... Which does exactly the same thing....


What is the lesson to be learned? When we're talking about stagnation in software we are not only talking about the language, but also the tools, and then not only the IDE, the whole environment.

Certainly I was introducing Windows users to Linux, and even then on an unusual minimalist window manager, but still the lack of connection between a tool I've seen people already using and what it was capable of demonstrated that tools are not being leveraged to their full potential... Certainly learn your new languages, learn your language updates, but keep your tools and environment up to spec too...