Friday, 5 June 2020

Worst Idea in Development Ever

I'm casting back a long long way, over 15 years to be imprecise as I can, and talk about the worse idea in software development ever.... To perform a group code review once a week.

But before I can do that I need to talk about the good way to do this, code reviews should be non-intrusive for the majority of team members, they should be able to dip in and out, they should fit it into pieces of their day which are spare or otherwise dead time, and it should be use as a tool to not only review the code but introduce familiarity to other people's work & progress to everyone else.

So what was wrong 15 years ago?  Well, the important point about the good way of reviewing code was to actually be able to see the code.... When this new review process was introduced all those years ago... the code was on a machine, whilst we were all in a meeting room and the programmer stood up and verbally described the code to us.

Yes, verbally describing what took all WEEK to do, neither conveyed to us what was being done not shows us how, we were unable to help debug or sense check the logic, and it gave zero oversight into how the code was built, was it efficient for example (we'll come back to that topic in a moment).

And of course the planned one hour meeting was excruciating to be in... and only one developer got to speak in the first session.

There were 6 staff programmers and we were augmented with 6 contract programmers, the "lead" if you could call her that programmer went first... and her presentation was on the hardware to database connection.

It used a Marathon database, which I've just googled and humorously I can't find any references to that product.  And the "lead" had been using it at her prior role on Java based programming, here in C we were a little perplexed by her approach.

For she presented that from the high-speed buttons and lights on the hardware, which had to flash fast and had to be responsive to touch, she had written a system which took the impulses and inserted the values into a "Buttons" database table.  And then another on a loop which polled "SELECT * FROM LIGHTS;" and pushed the "on" or "off" states of the resulting list (from left to right) to the hardware.

Of course was immediately obvious to all but the lead, the manager and her office buddy, would not fly and the three different members of the team, all infinitely more familiar with the hardware and the C language than this lead, pointed out that it would be slow and unresponsive.   That we were on a single core machine so you couldn't just ignore that the database driver would wonder off and do it's own thing, as well as be constantly interrupted by the processing of our system, and the system itself was to be host to third party software which we could not ourselves fully control, so they might run with a tight loop starving our system and especially starving (what was then) a low priority database driver service.

All in all it sounded a REALLY bad idea.

But, she'd already done the work.  The manager had already signed off on her idea in a series of closed doors conferences with her, and her lone office buddy made it clear he was on side.

It was 3 staffers verses 3 "senior" staffers.

It was fact, this was going to happen.


Which brings us to an obvious fault in this "Code Review" no-one could actually get anything changed, we were really there to be told of the fabulous innovation from behind the green curtain.  We could not divert, advise nor simply appreciate the good, bad or ugly in this system.

So it was revision one (of twelve) was issued and immediately a real piece of content (a game) was played on the system... This game already worked on the old system we were replacing, so its behaviour was well established.  And immediately all the lights started to behave badly, they would he updating visually, we could see, from left to right... rather than all coming on together and going off together.

This was puzzling the micro-controller sending the light signals was a 32bit register, you just set the binary representation, and the micro almost immediately lit or extinguished the bulb... Why was it showing this left to right behaviour?

The three of us whom had descented said "are you going through that lights query as a list and applying it left to right?".... "No" came the reply, but she soon disappeared and came back with version 2.

Which STILL turned the lights on left to right, but subtly so, we could see it and the experienced play tested protested it was, but to the uninitiated it was just about passable.  So it went into test proper.

And they reported the same, but opposite feel to the buttons, that they lagged from right to left, and lag they did.

To her credit the "lead" didn't dismiss them and went to actually play the machine, yes she'd not played it until now... and immediately version 3 was issued.

Again, as with the lights, the lag was still there, the left hand most button was the most laggy, and that would make sense if there was a loop which was working through their being updated from right to left for the buttons and left to right for the lights.

But she insisted that there was no loop.

And we had no oversight on the code, even though we had "reviewed" it.

The next morning we walked into find her already present, version 4 had gone down to test.  The tester had put a different game on, which did a different series of light patterns and reported "It doesn't look right at all" (except he'd not been as polite) spurring the new version.

One of the more vocal of our group went to the manager and said "this going through a database is the madness, no matter what she's actually doing in the code, it can't be fast enough to feel like the old direct to hardware system, we need to remove this layer and just go direct".... but the manager didn't want to upset the "lead" so persisted with the database thing.

Release 5, 6 and 7 went down later that day, then about lunch a machine was hoisted up the fire escape stairs and into the office wheeled all the way and plonked next to the lead, she was to debug this properly.

Release 8, 9 ad 10 went only to this machine next to her and the tester was duly dragged up and into the room to sit, painfully, next to her explaining how it should plan and indeed he insisted the old system be wheeled up and plonked right next to the new borked on.

The two boxes side by side it was immediately obvious the new database driven system was really Really REALLY slow compared to the old one.

"It's the database" voices said.


"It's the loop" others called out.



"There is no loop, and the database is fast!" She persisted.


None of us from the C side bought this, and still could not see the code (code revision control is a topic for another day).

Finally, it was Friday, version 10 had remained for all of Thursday, but Friday came and something had to be done.  The engineer responsible for the board, the actual micro-controller was brought in, and he immediately ran his test routine on the board showing the lagging and we saw it being solid behave correctly and work, despite the lass having decried her board was faulty.

The hardware engineer had a view none of us got, he could see the code, he sat beside her and with us all gathered around he blinked, looked at each of us and then looked at the code, then turned to the "lead" and said "you know you can just sent the unsigned 32 bit integer directly to that register, all in one go?"

She looked at him and asked what he meant.

"Well if you want to turn it all on, send zero"... if you want to turn them all on send 4,294,967,295‬"

She blinked and looked at the code and then back at him, "I don't follow".

The hardware engineer looked at all our eager faces, three of us knew what was coming.


"Well, you don't need to loop through each boolean and bit shift them one at a time into your register sending them one at a time, you can still loop and bit shift them all in if you like, but you can sent the command once, set the lights to the value in the register once".


Three of us left the room, the manager looked dumbfounded and a little thunder struck, her office pal was notably quiet.

Version 11 came out about an hour later and the hardware engineer went to leave, popping his head into the office of the 3 of us on the C side... "Why wasn't this picked up in review?"




* Throughout you've seen me put "lead" in air quotes, this is intentional, the lass was not a lead, not as we'd describe today, she was just the loudest voice in the split office, she was furthest from the manager and had this knack of making you feel like she had ALWAYS been there and always been in charge.

I strongly believed she had been at the company for years when I started.

It wasn't until years later that I discovered she had only been in her role two weeks longer than me, two weeks in which she had moved from direct hardware peek & poke to via database.


* Version 12 : Yes you guessed it, version 12 was to remove the database from the hardware interactions completely, an abstract layer was built holding a representation of the hardware in memory and that representation was twiddled from the database side and it's representation imposed on the hardware by directly poking.

No comments:

Post a Comment