Saturday, 24 April 2021
Bad Files and Smart Cards in a Project from Long Ago
I need to anonymize this code, so we'll be doing it in a pseudo C# style. So one of the last tasks I had at my prior employer was to inherit the entire code base for a project I had been bitting and bobbing in for years; I'd seen this project start, release (many times), mutate and ultimately age.
As I took control it needed replacing, which is a whole other story involving C++ and dragging people kicking and screaming into touch.
This product though was like your grandad: it sat quietly on its own sucking a Werther's Original, waiting for a war film or Columbo to come on the telly.
The difficulty was the fault rate: between 9 and 14% of machines were off in the morning, and if a pack of updates was ever sent out (for content) then that rose to around 46%... Imagine the calls there, the service manager and his oppo having to field a 46% fault rate because of your update. Indeed, on one occasion I remember driving to a customer's site and physically handing them a good update DVD rather than leaving them to wait.
So what was so bad? Well, it all came down to.... Let's look at a piece of code that is seared in my memory:
FileStream file = new FileStream("C:\\SomeFile.txt", FileMode.Open, FileAccess.Read, FileShare.None);
byte[] buffer = new byte[file.Length];
int bytesRead = file.Read(buffer, 0, (int)file.Length);
file.Close();
// Do something with buffer to give us a new buffer
int newDataLength = 64;
byte[] newBuffer = new byte[buffer.Length + newDataLength];
file = new FileStream("C:\\SomeFile.txt", FileMode.OpenOrCreate, FileAccess.Write, FileShare.None);
file.Write(newBuffer, 0, newBuffer.Length);
file.Close();

This is part of an update sequence: the existing file would be opened, the new update delta calculated, and the intent was to append it onto the end of the file. This was fine for years, it worked, it got shipped. It went wrong about five years later... can you see how?
A hint is that this was a 32-bit machine.
Did you spot it?.... it's line 2...
"file.Length" returns a long, but then all the following file operations work on int. The file started to go wrong after it was two gigabytes in size, because the range of int being 2,147,483,647 if we divide by 1024 three times we get kilobytes, then megabytes, then gigabytes and we see this is roughly 1.99 gigabytes.
But then think about that: this is a 2 GIGABYTE file being loaded into a buffer in RAM!?!?!?
It just makes a pure RAM copy of itself, then opens the file and starts to write over the original from zero to the end.
YEAH, so it's overwriting the whole original file.
It's so wrong in so many ways: the massive buffer, the overwriting of existing data already safe on disk, the fact that this all took time too... This operation happened at a reconcile phase and it was all asynchronous, so whilst this portion of the system was doing this mental tossing about, another part of the system had changed the screen... to say "Please Power off or Reboot".
So people did, they literally pulled the power. So they lost their 2 gigabytes+ of data, and just when these files were getting large they were nuking them by pulling the power too!
The solution is simple: open the file for append, or just seek to the end and add the new data on.
int newDataLength = 64;
byte[] buffer = new byte[newDataLength];
// Get the new data into the buffer
FileStream file = new FileStream("C:\\SomeFile.txt", FileMode.OpenOrCreate, FileAccess.Write, FileShare.None);
file.Seek(file.Length, SeekOrigin.Begin);
file.Write(buffer, 0, buffer.Length);
file.Close();

This was only part of the problem: the functions using the data from this file took it as a whole byte array, so there was literally no way to chunk the file. I can't go into the details, but I had to break that up and start to stream the data through that system, which then let me add the resulting new delta array (which was always smaller than 2MB) to the end of the file.
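I can't show the real streaming code, but the shape of it was roughly this (a sketch with invented helper names, assuming the usual System.IO namespace): read the existing file through a modest buffer instead of holding the lot in RAM, then append only the small delta at the end.

const int chunkSize = 64 * 1024;                 // a sane working buffer instead of "the whole file"
byte[] chunk = new byte[chunkSize];

using (FileStream source = new FileStream("C:\\SomeFile.txt", FileMode.Open, FileAccess.Read, FileShare.None))
{
    int read;
    while ((read = source.Read(chunk, 0, chunk.Length)) > 0)
    {
        ProcessChunk(chunk, read);               // hypothetical: feed the delta calculation a piece at a time
    }
}

byte[] delta = GetDelta();                       // hypothetical: the new data, always well under 2MB
using (FileStream target = new FileStream("C:\\SomeFile.txt", FileMode.Append, FileAccess.Write, FileShare.None))
{
    target.Write(delta, 0, delta.Length);        // append only the new bytes; the data already on disk is never rewritten
}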
That was only one part of the system which kept me awake. Another good one, used a lot, was a pattern for overwriting small files, mostly the JSON files which controlled the settings. The users would often turn these machines off by simply pulling the power out of the back.
Whenever it was saving a file it would basically be doing:
File.WriteAllBytes(thePath, allTheBytes).
Yep, it'd just write over the file.
My fix? Simple: when opening the file at a time when we didn't expect the users to just pull the power - or at least when it was less common - make a file backup with "File.Copy(source, dest)", and these destination files were numbered 1, 2, 3... with the count configurable. So on sites where we knew they had a high fault rate we could stack up 5 or 7 backups of these files, but machines with better hardware, or SSDs, only needed 3.
I don't even think the service manager knew about this "fix".
But armed with these backups we could then leave the original code alone (which was quite convoluted and I didn't want to fix, to be honest). On the next load, if the opening failed, I'd have it nuke the backup it had just taken, then use the last best aged backup. And if there were now more backups than we should have, we'd delete the oldest.
Settings didn't change very often, but this did let us solve this issue.
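In the same pseudo C# as the rest of this post, the shape of that backup scheme was roughly this (a sketch with invented names, not the shipped code, and it skips the "nuke the copy we just took" part of the recovery):

int maxBackups = 3; // configurable per site: 5 or 7 for the flaky hardware, 3 for the better machines

void BackupSettingsFile(string path)
{
    // Shuffle the existing numbered copies up one slot so the oldest falls off the end.
    for (int i = maxBackups - 1; i >= 1; i--)
    {
        if (File.Exists(path + "." + i))
            File.Copy(path + "." + i, path + "." + (i + 1), true);
    }
    File.Copy(path, path + ".1", true);               // the newest backup is always ".1"
}

byte[] LoadSettingsFile(string path)
{
    // Try the live file first, then fall back through the backups, newest to oldest.
    for (int i = 0; i <= maxBackups; i++)
    {
        string candidate = (i == 0) ? path : path + "." + i;
        try
        {
            if (File.Exists(candidate))
                return File.ReadAllBytes(candidate);  // stand-in for the real (convoluted) load and validate
        }
        catch (IOException)
        {
            // Unreadable copy: carry on to the next oldest one.
        }
    }
    throw new FileNotFoundException("No usable settings file or backup", path);
}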
The final worst piece of this system was the licensing system, which used a USB connected smart card reader, and a custom decrementing secure card format to license the machine time. This was fine for years, it used a nice Gemalto reader and cards, and all was fine in testing.
The machine tested the card whilst in operation once every five minutes, so no big deal. When in service mode it checked the card every 10 seconds to update the license level display, but the service mode was never intended to be left for more than a few minutes.... So what happened?
Yeah, a customer opened a machine and left it open for a week.... And their machine went out of operation. When we got this particular machine back I just opened the door, took the card out and pointed to the literally charred, burned back of the smart card chip... It was a white plastic card, and the back was deformed and light brown... I did chuckle; it sucked for the customer, but we never worked out why they had the door open in service mode for so long; they weren't meant to.
But worse than that isolated incident was a new tranche of machines released in 2015, which suddenly all had faults: there were machines out of order, machines not allowing play, machines rebooting... Nothing seemed to clear them, and some were reporting "Out of Licensing", despite people having paid for brand new cards.
They were issued a new card... The old cards came back and were reworked... so once-working sites randomly got either a new card or a reconditioned card from any other random site.
New machines had a new brand of card reader, old machines had the Gemalto. New cards were all this new brand of card, and the old cards were the white Gemalto ones... this mix just went on... and soon we had a rising fault rate.
The diagnostic view was at first a little mixed: sometimes a new reader was fine, sometimes a new reader was bad... all customers reported "my new card"; they had no idea that the brand had changed under the hood... and in fact nor did I.
You see, to save a few pence per card (12p per card to be precise) they hadn't gone with the grand 34p Gemalto cards, they'd gone with 22p Chinese copies... Inferior copies as it turned out: they had around 1/8th the life span, so over time ALL these "new" cards failed.
But then, in the Gemalto reader they were all fine... So the new reader?... Oh, that was ALSO a cheap Chinese knock-off, and these things had strange problems. I suspected they were sometimes putting the full 5V from the USB bus through the cards (rated at 3V), killing them. And I was proven right.
This unholy quartet of product caused havoc, but I eventually found that new readers could kill either new or old cards, so they had to be recalled... Then new cards could die randomly even in old reliable readers, so they had to be recalled too. Which meant we slowly struggled to find old readers and old cards.

All of this was a purchasing foul-up; unfortunately managers saw it as an engineering problem, and so one had to code around poor hardware.

The first thing we did was add two toggles. One was for "old card", which I could detect from the card chip type being read on reader access. This slowed the reading of the card down... from every 5 minutes to every 30 minutes, so we risked giving customers longer before an unlicensed machine went out of action, but it was accepted in exchange for a much longer read life for the card cell.

Then we deferred the first read of the card: on boot up we literally leave the USB device completely alone, let Windows start and everything settle on the desktop driven system, and after 5 minutes we'd start our licensing check. It was accepted that a user could technically receive 4m59s of unlicensed use and then reboot to get more time, but that would be a little impractical in this usage scenario.

Doing these two things we could just about use the new readers... But the new cards were just so utterly terrible that we did eventually have to buy better cards. I never heard if there was a refund on the originals, but I can assure you my time alone cost more than the £120 they saved going with these cheap cards.
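In pseudo C# again, those two toggles boiled down to something like this (a sketch with invented names and a plain background thread, not the production licensing code):

TimeSpan startupGrace = TimeSpan.FromMinutes(5);   // leave the USB reader completely alone after boot
TimeSpan slowPeriod   = TimeSpan.FromMinutes(30);  // the "old card" toggle: spare the card cell
TimeSpan normalPeriod = TimeSpan.FromMinutes(5);   // the original check interval

Thread licensing = new Thread(() =>
{
    Thread.Sleep(startupGrace);                    // deferred first read: let Windows and the desktop settle

    while (true)
    {
        bool oldCard = IsOldCardChipType();        // hypothetical: chip type read on reader access
        CheckLicenceOnCard();                      // hypothetical: the decrementing licence check
        Thread.Sleep(oldCard ? slowPeriod : normalPeriod);
    }
});
licensing.IsBackground = true;
licensing.Start();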
Tuesday, 13 February 2018
Windows : Com File Effect
I'll never claim to know everything about Linux nor Windows; I know a lot more about the former than the latter, mainly as Windows is a closed source product. As such it doesn't get my attention as much, unless something is going wrong.... Today I have found an effect I can't explain (at least not easily) in Windows; clearly it's some sort of reserved file name or type effect.... Maybe you can explain this to me.
So, open any folder, right click and go to create a new text file...
Once you have the file...
Change the name to "COM" and a port number, so "COM1" in my case.... It can be any COM port name, even if you don't have that port installed or active on your machine....
Once you complete the name, or change focus of the text entry (by say taking a screen shot - not actually committing the new file to disk) you get this strange message...
"The Specified device name is invalid"....
What device? I'm trying to create a filename. I was immediately a little puzzled; I've used Windows since Windows 2.0 and I've never run into this issue.
Googling around, I found a chap on social.technet.microsoft.com saying not to use a whole plethora of names for files or folders as they're reserved; I had never heard of this.
The official Microsoft advice likewise says not to use these values, and it kindled a memory in me from programming in DOS 6 back in the mid 90s: in Pascal one could do something like "writeln(PRN, 'some message');" and see the message print out on the printer.
I find it amazingly odd that these names are still reserved in 2018; they're proper nouns, they're things you may want to name files, and they have a FULL path, which makes them very distinct from the COM port device. It's odd, trust me.
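For what it's worth, the documented reserved set is small enough to check for yourself; here's a little sketch in the same pseudo C# as usual (mine, nothing official, assuming System.IO and System.Linq):

string[] reserved =
{
    "CON", "PRN", "AUX", "NUL",
    "COM1", "COM2", "COM3", "COM4", "COM5", "COM6", "COM7", "COM8", "COM9",
    "LPT1", "LPT2", "LPT3", "LPT4", "LPT5", "LPT6", "LPT7", "LPT8", "LPT9"
};

bool IsReservedName(string fileName)
{
    // The reservation applies to the base name regardless of extension,
    // so "COM1.txt" is refused just like "COM1".
    string baseName = Path.GetFileNameWithoutExtension(fileName);
    return reserved.Any(r => string.Equals(r, baseName, StringComparison.OrdinalIgnoreCase));
}

Console.WriteLine(IsReservedName("COM1"));         // True
Console.WriteLine(IsReservedName("COM1.txt"));     // True - the extension doesn't save you
Console.WriteLine(IsReservedName("Column1.txt"));  // False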
But the oddest thing is: what does Windows do, what piece of code gets picked up and made to show that message box upon creation of the file? And I've got to wonder how exploitable that may be for local malware wanting to open lots of message boxes and annoy the user.
Labels:
COM,
explorer,
File System,
files,
LPT,
Microsoft,
old,
out dated,
ports,
PRN,
programming,
reserved,
serial,
silly,
the specified device name is invalid,
Windows,
windows 10,
windows 7
Thursday, 12 May 2016
Failed Windows versus Linux Rant
I'll be honest, I was coming to write to you all today to give Windows another well-deserved roasting. Unfortunately, I found the same issue in Linux. Well, when I say issue, perhaps this is just me expecting something more logical from a system.
Let's take a look at the folder I'm talking about:
It's any folder you like; within it, I've created a folder called ".example", another without the leading period called "example", and then a file which includes the word "example" within its file name.
When I go to search this directory, with the Explorer search bar within the window presented, I expect it to work by using exactly what I type...
So, if I search for "txt", only the "example.txt" file should come up, and it does...
But if I search for "example", I'm getting three results, as they contain the word, therefore I aim to be more specific and search for ".example"... The result of this last search:
Yup, I still get all three results. Clearly only one item matches ".example", and this annoys me.
And I jumped to Linux to take screen shots of it not working this way, but I found ls behaves the same way too... To be more specific I had to use "find":
This annoys me; I want the search to be specific. Unless I add wildcards like "?" or "*", I want what I type to be what it searches for, and if they have to put dumbed down searches in there, make it obvious that what you're getting is not an exact match.
The reason?... Imagine you have 300,000 files and you select 2,49 of them based on an exact filename search, give it a cursory glance, assume it to be correct, select all and delete?!?!?!
Yes, you end up deleting a load of stuff you didn't intend.
And I know, I know, I assumed what it was doing, and I know assumption is the mother of all fuck up, but still....
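If I were writing that search myself, "exact unless wildcarded" is easy enough to express; a quick sketch (mine, nothing to do with how Explorer actually searches, assuming System.IO and System.Linq):

string folder = @"C:\SomeFolder";   // assumption: wherever the three "example" items live
string term = ".example";

var exactMatches = Directory.EnumerateFileSystemEntries(folder)
    .Where(entry => string.Equals(Path.GetFileName(entry), term, StringComparison.OrdinalIgnoreCase));

foreach (string match in exactMatches)
    Console.WriteLine(match);       // just the ".example" folder, not "example" or "example.txt"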
Even more annoying however, whilst setting up this example on Windows, I did find something minor to moan about:
You can't create a folder which starts with a period in Explorer...
But you can with DOS/Command Prompt!
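And for what it's worth, the block is purely an Explorer thing; code doesn't care either (a throwaway check of my own):

Directory.CreateDirectory(".example");            // works fine from .NET, just as "md .example" does from the command prompt
Console.WriteLine(Directory.Exists(".example"));  // True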