
Monday, 2 February 2015

General Update

I've been a busy boy this weekend: I've ordered some new network cable to bring Cat 6 Ethernet directly to the upstairs bedrooms, so I have that to fit out, and I've organised all my finances after the house move.

I've retired my two Dell PowerEdge servers - just because they were so loud - and I've taken a whole bunch of pictures of them ready for them to go on eBay later this week. They'll be sold as a batch lot, as I basically use one as parts for the other, but they're not shabby; the heavily used one is ironically one of the cleanest machines I have.

Their replacement is a much more suitable Pentium 4 PC - my old World of Warcraft & Eve-Online machine, in fact. It's a massive downgrade in power - we're going from dual 64-bit Xeons in a server chassis to a 32-bit 2001-era Pentium 4 - but it's all about being able to leave the PC running in the front room, where it'll be part of my internet provision of services.

I've also been servicing and modding my Saitek X45 joystick. I've got a friend of a friend sorting out a set of new springs for me, I've looked carefully at modding the handle, and I've also had the thing in pieces to discover what's inside.  I may be putting together a whole post on this alone, as there is not much good information about this solid old stick on the interwebs.  Mine is 12 years old, so it's lasted a long time with no intervention.  It's played me through MS Combat Flight Simulator, the original IL2, IL2 Forgotten Battles and IL2 1946.

It started to get a little funky in War Thunder last month; I noticed I could not keep my 109s from pulling oddly.  I thought it was a trimming issue, but it was actually slack in the spring.

Anyway, as I say, more about all that later.

In Elite Dangerous, I've been out and about in the universe.  I've made my pilgrimage to Sol - I arrived there last night and docked at Abraham Lincoln station just as Australia was coming into view below.  I'm going to take a better look tonight.  On the way, though I have a fuel scoop, I did decide to stop off and dock at a station, just to update my save at that point.  The commodity-reading software worked a treat, even on a system name it had never been trained on before.  Going forward I need to improve the capturing sequence, as it's a little clunky even for me - and I created it - but data capture is working out really well now.  Capturing and reading the commodities, with my myriad of little updates and string swaps, is working extremely well.

Aside from changing the capturing to be more, shall we say, slick, I'm also going to be adding threaded operation to the OCR, so that I can leverage the multi-core features of machines to speed up the reading process.  There is a notable delay at the moment between the images being captured and the OCR kicking in.  Yes, part of this is writing the images as bitmaps to disk, but the major delay is the serial nature of going down the list - I may as well go down two different parts of the list at the same time!
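
Here's a minimal sketch of what that threaded pass might look like - RunOcrOnImage is a hypothetical stand-in for my existing per-image OCR call, not the actual function name:

#include <functional>
#include <iostream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for the existing per-image OCR call.
void RunOcrOnImage(const std::string& p_Filename)
{
  std::cout << "OCR: " << p_Filename << std::endl;
}

// Work through one span of the captured image list.
void OcrWorker(const std::vector<std::string>& p_Files,
  std::size_t p_Start,
  std::size_t p_End)
{
  for (std::size_t i = p_Start; i < p_End; ++i)
  {
    RunOcrOnImage(p_Files[i]);
  }
}

// Split the list in half and walk both halves at the same time.
void OcrAllImages(const std::vector<std::string>& p_Files)
{
  std::size_t l_mid = p_Files.size() / 2;
  std::thread l_front(OcrWorker, std::cref(p_Files), static_cast<std::size_t>(0), l_mid);
  std::thread l_back(OcrWorker, std::cref(p_Files), l_mid, p_Files.size());
  l_front.join();
  l_back.join();
}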

Apart from selling the Dell servers, I'm also going to be selling a really nice 1024x768 CRT monitor, my last CRT.  I'm only selling it because it's being retired along with the servers - it was their heads-up display - but it's one of those crystal-clear displays that's perfect for coding with.

I'm also going to be selling the Samsung CLP-320 laser printer, because... well, despite being only 4 years old it's not survived.  It still works, but the printing fades in and out; I've replaced the toners (at a hell of a cost), so I can only conclude it needs some tinkering or servicing, which I don't want to do.  So it'll go out the door for spares & repairs on eBay.

If I get a decent price for all these bits of stuff I'm selling, I may be looking to get myself the £160 X55 Rhino flight stick, but this is perhaps a pipe-dream with the financial situation as it is.

Monday, 26 January 2015

Elite Dangerous - Trading Tools - OCR - Progress Pt7

Help Sponsor me and this project now! Click here for details:


Over the weekend I've been working on data collection: I've been to a few more systems and stations, I've been checking the algorithm for finding the best route, and I've been giving myself some feedback...

The first two things I immediately wanted were for the program to remember the last folder used, and, as I captured data, to be able to refresh the loaded information without having to close & reopen the tool.  Basic stuff.

So, loading the data looks like this now:


Let's say we've got courier missions from LHS 1914 to FK5 2550.  We select the start system, so LHS 1914:


This instantly populates the Stations and total list of commodities in columns 2 and 3 there.  We can select the specific station we're docked in as well:


This reduces the number of commodities.  Now we could just select FK5 2550 from the destination system list, to see what would bring us the most profit running into that system:


The answer is then given in the top-right box, and that result will stay there until we change the source or destination system.  We can run the same search without selecting a station from the second column, but then the search starts from all stations in the source system on the left.
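
Under the hood that top-right answer is really just a brute-force scan over the captured records.  Here's a minimal sketch of the idea - the CommodityRecord shape is my own illustration for this post, not the tool's actual classes:

#include <limits>
#include <string>
#include <vector>

// Illustrative record shape - one commodity line captured at one station.
struct CommodityRecord
{
  std::string System;
  std::string Station;
  std::string Commodity;
  int BuyPrice;   // what the station charges us to buy
  int SellPrice;  // what the station pays us when we sell
};

// Best single-commodity profit from p_Source to p_Destination,
// found by a naive scan over every pair of matching records.
std::string BestProfit(const std::vector<CommodityRecord>& p_Data,
  const std::string& p_Source,
  const std::string& p_Destination)
{
  int l_best = std::numeric_limits<int>::min();
  std::string l_bestName = "(none)";
  for (const auto& l_from : p_Data)
  {
    if (l_from.System != p_Source || l_from.BuyPrice <= 0) continue;
    for (const auto& l_to : p_Data)
    {
      if (l_to.System != p_Destination || l_to.Commodity != l_from.Commodity) continue;
      int l_profit = l_to.SellPrice - l_from.BuyPrice;
      if (l_profit > l_best)
      {
        l_best = l_profit;
        l_bestName = l_from.Commodity;
      }
    }
  }
  return l_bestName;
}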

This search alone wasn't enough, however, because I needed to be able to quickly find a supply of certain commodities - hence the new controls at the bottom right.

Let's say we want to find Beryllium for a specific mission goal; we can select that commodity by name from the drop-down:


The results are the sources of Beryllium from the database, in best-price (for purchase) order.  Obviously, if you've selected a commodity you've never seen before then you get no results.  Unfortunately, in the current context there is no distance or connectivity information in the database, so we don't know if these sources are close or not; that is still a player decision*.
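
That supplier search is the same data scanned a different way; a quick sketch, again using the illustrative CommodityRecord shape from the snippet above:

#include <algorithm>
#include <string>
#include <vector>

// List every station selling the named commodity, cheapest purchase first.
std::vector<CommodityRecord> FindSuppliers(
  const std::vector<CommodityRecord>& p_Data,
  const std::string& p_Commodity)
{
  std::vector<CommodityRecord> l_hits;
  for (const auto& l_record : p_Data)
  {
    if (l_record.Commodity == p_Commodity && l_record.BuyPrice > 0)
    {
      l_hits.push_back(l_record);
    }
  }
  std::sort(l_hits.begin(), l_hits.end(),
    [](const CommodityRecord& p_a, const CommodityRecord& p_b)
    {
      return p_a.BuyPrice < p_b.BuyPrice;
    });
  return l_hits;
}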

The other feature added there is the "Demand For" search.  This simply lists those commodities in the database which are wanted somewhere, with the results given in best-price (for sale) order.  This lets you quickly marry up whatever profit (if any) there may be to be had; I find it most useful when the upper, algorithm-driven box comes up with an illegal selection.



* However, I do envisage changing the data collection so that, as data is captured from system to system, we build a basic web of which system connected to which.  This basic webbing could then be used to add "how many jumps" a commodity is from the selected source or destination systems.  However, to add all this functionality I need your help!
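
For what it's worth, once those connections are recorded the "how many jumps" question is only a breadth-first search over the web; a sketch, with the adjacency map being my own guess at how the web might be stored:

#include <map>
#include <queue>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Each system maps to the systems we've seen it connect to while capturing.
typedef std::map<std::string, std::vector<std::string>> SystemWeb;

// Breadth-first search: number of jumps between two systems,
// or -1 if the captured web has never connected them.
int JumpsBetween(const SystemWeb& p_Web,
  const std::string& p_From,
  const std::string& p_To)
{
  std::queue<std::pair<std::string, int>> l_open;
  std::set<std::string> l_seen;
  l_open.push(std::make_pair(p_From, 0));
  l_seen.insert(p_From);
  while (!l_open.empty())
  {
    std::pair<std::string, int> l_current = l_open.front();
    l_open.pop();
    if (l_current.first == p_To) return l_current.second;
    SystemWeb::const_iterator l_it = p_Web.find(l_current.first);
    if (l_it == p_Web.end()) continue;
    for (const auto& l_next : l_it->second)
    {
      if (l_seen.insert(l_next).second)
      {
        l_open.push(std::make_pair(l_next, l_current.second + 1));
      }
    }
  }
  return -1;
}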

Wednesday, 21 January 2015

Elite Dangerous - Trading Tools - OCR - Progress Pt6

Update: You can now support this project at Patreon!


Armed with some of my data, I've had two stabs at making use of it, so here's a first peek at the tool I've thrown together to let me use the information.  One simply selects the start system and then the destination system, and the best commodity (for profit) is calculated.

You can also refine the start point by station, or by a specific commodity, before you select the destination, to narrow the start point and be given the best-profit item.

The next feature I'm going to add is a box to enter the name of a commodity and find where it's sold as a list of system & station names, just as a helper for what's already in the game's galaxy map.

Sunday, 18 January 2015

Elite Dangerous - Trading Tools - OCR - Progress Pt5

Update: You can now support this project at Patreon!


So, last night I had my first long run and tuning-up session, capturing the system name, station name and commodities as I played the game... As you can see, I have quite a few captured commodities now:


I'm now going to throw together a separate reader, to load the data set and let me do comparisons or round-trip plotting of best prices.

But I thought I'd explain how this application works, as I've posted it into the streams of a few Twitch chaps playing Elite, and everyone seems to have their own preconceptions...

1) This is a standalone application; it runs in Windows completely separately to Elite.

2) It captures screen images; it DOES NOT open the process's memory.

3) The areas of the screens captured are filtered, and then passed to an optical character recognition (OCR) system.

4) The OCR system I'm using is Tesseract.

5) The image processing is triggered by keys: in game you just play, then press F1 whilst in the navigation menu, and the program will take a shot and try to work out the system name.

6) Press F2 whilst in the commodities screen, and the station name is captured.

7) Press F3 to toggle on capturing of the area the commodities are listed in; it takes pictures until you press F3 again.  So you slowly scroll down the list, then press F3, and the program passes the captured images on for processing.

8) The data is written out as XML and contains the System/Station name, then the Commodity Name, the buy and sell prices, and then the supply & demand levels.

I have had to do some manual fixing of things, such as correcting where the OCR spells the stranger mineral names wrong.
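
For reference, each station's capture ends up shaped roughly like the sketch below - the element and attribute names here are my illustration, not necessarily the exact schema the tool writes (and a real writer would also escape special XML characters):

#include <fstream>
#include <string>
#include <vector>

// Illustrative in-memory shape of one commodity row.
struct CommodityRow
{
  std::string Name;
  int Buy;
  int Sell;
  std::string Supply;
  std::string Demand;
};

// Write one station's capture out as a simple XML document.
void SaveStationXml(const std::string& p_Path,
  const std::string& p_System,
  const std::string& p_Station,
  const std::vector<CommodityRow>& p_Rows)
{
  std::ofstream l_out(p_Path);
  l_out << "<Station system=\"" << p_System
        << "\" name=\"" << p_Station << "\">\n";
  for (const auto& l_row : p_Rows)
  {
    l_out << "  <Commodity name=\"" << l_row.Name << "\""
          << " buy=\"" << l_row.Buy << "\""
          << " sell=\"" << l_row.Sell << "\""
          << " supply=\"" << l_row.Supply << "\""
          << " demand=\"" << l_row.Demand << "\"/>\n";
  }
  l_out << "</Station>\n";
}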

My plan, once I have the data coming in, is to have it upload to a server, which I can then view from anywhere and use to search the commodities.

I would, however, like some help completing the application; specifically I require testers.  If you are interested, please contact me in the comments below.

Finally, lots of people ask "Why?" - why make this tool, why not use one of the others out there... Well, I have nothing against the others; I looked at a couple, including EliteOCR, and they showed that Tesseract could get the job done.  But none of those tools were mine.  I've played with and used a lot of tools for games, and I'm a professional programmer, so I figured I should make a tool which is exactly what I want.

Will I ever make this publicly available?... Perhaps, but I've had little to no interest thus far.

Saturday, 17 January 2015

Elite Dangerous - Trading Tools - OCR - Progress Pt4

Update: You can now support this project at Patreon!



Just a quick update: since this morning I've been tidying the code, fixing things up to make it work with the actual client, and here's the first basic run...


This is a looping console application, so we press F1 when looking at the navigation console to grab and read the system name, and when docked with the commodities window open we press F2 to grab the station name.

Pressing F11 gives the status of these two strings, and you can press Y/N to confirm the selected names before going any further.
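
The key handling itself is nothing exotic - a rough sketch of the sort of polling loop involved, using the Win32 GetAsyncKeyState call; the Capture/Report calls in the comments are hypothetical placeholders for the real work, and a proper loop would also debounce the keys:

#include <windows.h>
#include <iostream>

int main()
{
  bool l_running = true;
  while (l_running)
  {
    // GetAsyncKeyState works even when another window (the game) has focus.
    if (GetAsyncKeyState(VK_F1) & 0x8000)
    {
      std::cout << "Capturing system name..." << std::endl;
      // CaptureSystemName();  // hypothetical - grab & OCR the nav panel
    }
    else if (GetAsyncKeyState(VK_F2) & 0x8000)
    {
      std::cout << "Capturing station name..." << std::endl;
      // CaptureStationName(); // hypothetical - grab & OCR the header
    }
    else if (GetAsyncKeyState(VK_F11) & 0x8000)
    {
      std::cout << "Status..." << std::endl;
      // ReportStatus();       // hypothetical - show both captured strings
    }
    else if (GetAsyncKeyState(VK_ESCAPE) & 0x8000)
    {
      l_running = false;
    }
    Sleep(50);  // don't spin the CPU flat out
  }
  return 0;
}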

This is the important header information, captured live from the client.  The client can be any resolution, windowed or not; you don't need to do anything special for this, just look at the right page in the client before pressing F1 or F2 respectively.  You keep the focus in your client (so you can be full screen) and my OCR application works seamlessly in the background.

Now, the OCR is not quick, nor is it perfect, but here we can see it working, and for my needs this is just the ticket.

Next, the actual commodities, which I'm going to save as XML files... Then I need to throw up a reader to take those XML files in and make a database of searchable commodities.  The main search being: you select a commodity you see on screen in large numbers, and the tool lists all the buying stations in order of buying price, optimising that trade for you.

The second search I'd like is to give it, say, 4 systems which I can see in the client are linked & in jump range even when fully laden, and have the tool work out the best trades (the top 3 maybe, in case there are few of the items) to take between them in sequence.

Elite Dangerous - Trading Tools - OCR - Progress Pt3

Update: You can now support this project at Patreon!


Having played with my own OCR, I've decided to use Tesseract; however, I'm doing it in an unusual manner.  Because this is all really a learning exercise for me, and to keep the build small, I've put the Tesseract API calls into their own EXE, not a DLL - an EXE I can call, giving it an image filename and an output text path.

The output is XML; it tells me the result status and the likely text, and handles errors with the image or training data for me, without crashing or causing issues in my main application.

This clear division of effort lets me just use the Tesseract API example (lepttest) code to do my OCR, whilst I get the skimming of the commodities actually working.
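
Calling the wrapper from the main tool is then just a case of launching the EXE and reading the output file back in.  A rough sketch - "TessWrapper.exe" is a made-up name standing in for whatever the wrapper executable is actually called:

#include <cstdlib>
#include <fstream>
#include <sstream>
#include <string>

// Hand one image to the separate OCR executable and read back its XML report.
std::string OcrImage(const std::string& p_ImagePath,
  const std::string& p_OutputPath)
{
  std::ostringstream l_command;
  l_command << "TessWrapper.exe \"" << p_ImagePath << "\" \""
            << p_OutputPath << "\"";

  int l_result = std::system(l_command.str().c_str());
  if (l_result != 0)
  {
    return "";  // the wrapper failed; the caller can just skip this image
  }

  // Pull the whole result file back in; parsing the XML is the caller's job.
  std::ifstream l_in(p_OutputPath);
  std::stringstream l_buffer;
  l_buffer << l_in.rdbuf();
  return l_buffer.str();
}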

Then I can revisit the OCR later, or maybe play with my own....

Here's a screen shot of the application "reading" from the screen...


As you can see, this is taking a screen capture of the commodities window area live from the client, and it's just updating any commodities it sees in the list...

My capturer also works at any resolution, so you can run the game full screen or windowed at any size; there's no need to be at a specific resolution, as some other tools require of the Elite client.

I am, however, struggling with the OCR.  I think I need to retrain, or better train, Tesseract for the font used in Elite, but that's a separate problem from my actual tool program, as I've split the OCR out!

Thursday, 8 January 2015

Programming - Elite Dangerous Tools - OCR

Update: You can now support this project at Patreon!


Update: Check January 2015 for a series of updates and information about this project - actually working and operating.

With our now-stitched image, I've done some research into OCR.  Online, one can post the image below to various sites and get an OCR reading out of it... Here's the image, and a screenshot of what Notepad contains once I open the resulting text...



Yeah, not very pretty is it... Clearly the colours in the shot affect things.

So I put the image through a new custom filter, taking the bright orange pixels and making them white and everything else black; this two-tone image looks like this...

And its OCR output looks far more useful...


Now I need my own OCR... Hmmm... Let's have some research time.
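
For reference, that orange-to-white filter is only a per-pixel threshold.  Here's a minimal sketch using the same CImg "Bitmap" typedef as the greyscale post; the threshold numbers are guesses to tune against your own screenshots, not the values I actually settled on:

#include "CImg.h"

typedef cimg_library::CImg<unsigned char> Bitmap;

// Two-tone filter: anything close to the Elite UI's bright orange becomes
// white, everything else becomes black.
void OrangeToWhite(Bitmap& p_Image)
{
  for (int y = 0; y < p_Image.height(); ++y)
  {
    for (int x = 0; x < p_Image.width(); ++x)
    {
      unsigned char l_r = p_Image(x, y, 0, 0);
      unsigned char l_g = p_Image(x, y, 0, 1);
      unsigned char l_b = p_Image(x, y, 0, 2);

      bool l_isOrange = (l_r > 180) && (l_g > 80) && (l_g < 200) && (l_b < 100);
      unsigned char l_out = l_isOrange ? 255 : 0;

      p_Image(x, y, 0, 0) = l_out;
      p_Image(x, y, 0, 1) = l_out;
      p_Image(x, y, 0, 2) = l_out;
    }
  }
}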

Wednesday, 7 January 2015

Programming - Elite Dangerous Tools - Market Data

Update: You can now support this project at Patreon!

I've been working on the image capturing & stitching, especially monitoring the keyboard from my program whilst in the game.  So now I can capture a region of the game client window on command.
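
I won't claim this is exactly how my capturer does it, but the classic Win32 route for grabbing a region of another window looks roughly like this - getting the HWND of the Elite client and saving the HBITMAP to disk are separate steps:

#include <windows.h>

// Grab a rectangular region of a window into an off-screen bitmap in memory.
HBITMAP CaptureRegion(HWND p_Window, int p_X, int p_Y, int p_Width, int p_Height)
{
  HDC l_windowDC = GetDC(p_Window);
  HDC l_memoryDC = CreateCompatibleDC(l_windowDC);
  HBITMAP l_bitmap = CreateCompatibleBitmap(l_windowDC, p_Width, p_Height);
  HGDIOBJ l_old = SelectObject(l_memoryDC, l_bitmap);

  // Copy the requested region of the client into our off-screen bitmap.
  BitBlt(l_memoryDC, 0, 0, p_Width, p_Height, l_windowDC, p_X, p_Y, SRCCOPY);

  SelectObject(l_memoryDC, l_old);
  DeleteDC(l_memoryDC);
  ReleaseDC(p_Window, l_windowDC);
  return l_bitmap;  // caller frees this with DeleteObject when done
}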

Once I've captured the different parts of the Market Data (Commodities) screen.... I can stitch them together.

Let's take a look at the pieces and then the result...






Now, each piece is manually captured upon a key press, and then once I know I have enough pieces I manually kick off the stitch... This is the result.


I can now at least store this as the market data for that station; however, it's not perfect.  I do have to ensure I don't highlight anything, and I scroll down manually and take each image.
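
The stitch step itself is simple as long as every piece was captured at the same width; a sketch using CImg's append along the y axis (error handling left out):

#include <string>
#include <vector>
#include "CImg.h"

typedef cimg_library::CImg<unsigned char> Bitmap;

// Stitch the captured pieces top-to-bottom into one tall image.
void StitchPieces(const std::vector<std::string>& p_PieceFiles,
  const std::string& p_OutputFile)
{
  if (p_PieceFiles.empty()) return;

  Bitmap l_result(p_PieceFiles.front().c_str());
  for (std::size_t i = 1; i < p_PieceFiles.size(); ++i)
  {
    Bitmap l_piece(p_PieceFiles[i].c_str());
    l_result.append(l_piece, 'y');  // glue this piece below the previous ones
  }
  l_result.save(p_OutputFile.c_str());
}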

The next step is going to be to name this resultant image for the source station, date & time it, and then process it further - which takes us back to the greyscaling and Sobel processing - before finally passing it into OpenCV, or whatever I end up with, to optically read the data from the image.

Here are some other pages captured, which have been put together from various stations...





Friday, 12 December 2014

Programming - Elite Dangerous Tools - Sobel Edge Detection (with Full Code)

Update: You can now support this project at Patreon!

Forearmed with our greyscale code, we can now look at the Sobel mask to give us a nice crisp delimitation between the background and the text in our screenshots... Let's say we capture the top half of the market prices screen upon landing... We can take the screenshot, convert it to greyscale and then run the Sobel mask over the pixels to give a crisper rendering of the text for our later OCR work...

There are many explanations of how Sobel operators work; however, they are very mathematical... Simply put, we take each pixel's value (except the ones on the edge), and to it and each of its neighbours we apply a mask... Let's draw some simple pictures...

Below we see the grid of pixels in our image...
Each pixel is made up of a value for Red, Green and Blue, but because the image is greyscale all three of these values are the same number, so we can just take one of them...
Next, in code, we need two masks; these are 3x3 grids of numbers whose values are applied to each pixel (except the edge pixels) in our image.












So, from 1 until 1 less than the width, and from 1 until 1 less than the height... Sliding the middle of the mask over the target pixel, we apply the mask to each pixel value around and including it.

We calculate and sum these values, writing the result into the same pixel location in the output image...
Saving the resulting image, we see the edges of the shapes within it highlighted...


Find the full code here and the previous step code for greyscale here.
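
And for reference, a minimal sketch of the Sobel pass over an already-greyscaled CImg image, using the standard horizontal and vertical masks described above (not necessarily line-for-line the linked code):

#include <cmath>
#include "CImg.h"

typedef cimg_library::CImg<unsigned char> Bitmap;

// Minimal Sobel pass over a greyscale image; returns a new image of the
// same size with the edge magnitude written into all three channels.
Bitmap Sobel(const Bitmap& p_Grey)
{
  const int l_gx[3][3] = { { -1, 0, 1 }, { -2, 0, 2 }, { -1, 0, 1 } };
  const int l_gy[3][3] = { { -1, -2, -1 }, { 0, 0, 0 }, { 1, 2, 1 } };

  Bitmap l_result(p_Grey.width(), p_Grey.height(), 1, 3, static_cast<unsigned char>(0));

  for (int y = 1; y < p_Grey.height() - 1; ++y)
  {
    for (int x = 1; x < p_Grey.width() - 1; ++x)
    {
      int l_sumX = 0;
      int l_sumY = 0;
      for (int j = -1; j <= 1; ++j)
      {
        for (int i = -1; i <= 1; ++i)
        {
          int l_pixel = p_Grey(x + i, y + j, 0, 0);
          l_sumX += l_pixel * l_gx[j + 1][i + 1];
          l_sumY += l_pixel * l_gy[j + 1][i + 1];
        }
      }
      int l_magnitude = static_cast<int>(
        std::sqrt(static_cast<double>(l_sumX * l_sumX + l_sumY * l_sumY)));
      if (l_magnitude > 255) l_magnitude = 255;

      // Write the same value into all three channels of the result.
      for (int c = 0; c < 3; ++c)
      {
        l_result(x, y, 0, c) = static_cast<unsigned char>(l_magnitude);
      }
    }
  }
  return l_result;
}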

Thursday, 11 December 2014

Programming - Elite Dangerous Tools - Greyscale (with Full Code)

Update: You can now support this project at Patreon!

One of the tasks in our tool for capturing market data and reading it optically is to take screenshots from the screen (or be provided with screenshots by the client itself), load them and convert them to greyscale.

Now, my preferred method of converting to greyscale is going to use CImg to load the image into memory.

I always load the data as "unsigned characters" (raw bytes), and I'm going to assume we've got the file on disk as a bitmap - just to save us the complexity of JPEG or PNG loading for now - so on disk we have an image which is a colour bitmap.

To load this into CImg we want to create a "cimg_library::CImg<unsigned char>" instance, so the first thing I'm going to do is typedef this as our "Bitmap" type:

typedef cimg_library::CImg<unsigned char> Bitmap;

With this type we then need a helper function, which gives us a loaded image:

Bitmap* LoadImage(const std::string& p_Filename)
{
  return new Bitmap(p_Filename.c_str());
}

Of course there are already classes called "Bitmap" in the C++ space, so for safety I'll wrap everything in the final code in a namespace of "Xelous", watch out for that later!

Now, with our image in hand, we need to understand how CImg lets us walk over the individual pixels.  Well, the CImg class provides a function called "data", which takes the x, y and z of the pixel (z being the layer, but I always use 0) and then the channel as an integer.

Our channels are:

enum ColourChannels
{
  Red,
  Green,
  Blue
};

So red is zero, green is one and blue is two.  Code to just output the RGB of each pixel therefore looks something like this:

#include <iostream>
#include <string>
#include "CImg.h"

typedef cimg_library::CImg<unsigned char> Bitmap;

Bitmap* LoadImage(const std::string& p_Filename)
{
  return new Bitmap(p_Filename.c_str());
}

enum ColourChannels
{
  Red,
  Green,
  Blue
};

int main()
{
  Bitmap* image = LoadImage ("C:\\Images\\Image1.bmp");
  if (image != nullptr)
  {
    for (int y = 0; y < image->height(); ++y)
    {
      for (int x = 0; x < image->width(); ++x)
      {
        unsigned char l_r = *image->data(x, y, 0, ColourChannels::Red);
        unsigned char l_g = *image->data(x, y, 0, ColourChannels::Green);
        unsigned char l_b = *image->data(x, y, 0, ColourChannels::Blue);
      
        int l_redValue = static_cast<int>(l_r);
        int l_greenValue = static_cast<int>(l_g);
        int l_blueValue = static_cast<int>(l_b);
      
        std::cout << "Pos (" << x << ", " << y << ") =";
        std::cout << "[" << l_redValue << ", " << l_greenValue << ", " << l_blueValue << "]" << std::endl;
      }
    }
    delete image;
  }
}

The output from this program looks something like this (assuming you have "C:\Images\Image1.bmp" present)...


As you can see, each pixel position is slowly listed out with its colours.  This is a really slow, boring program, but it opens up the inside of the bitmap to us; we could, for example, swap every pixel of one colour for another... and save the image again...

What does our original image look like?...


Let's just set every pixel to black, or zero...

int main()
{
  Bitmap* image = LoadImage("C:\\Images\\Image1.bmp");
  if (image != nullptr)
  {
    for (int y = 0; y < image->height(); ++y)
    {
      for (int x = 0; x < image->width(); ++x)
      {
        unsigned char l_r = *image->data(x, y, 0, ColourChannels::Red);
        unsigned char l_g = *image->data(x, y, 0, ColourChannels::Green);
        unsigned char l_b = *image->data(x, y, 0, ColourChannels::Blue);

        int l_redValue = static_cast<int>(l_r);
        int l_greenValue = static_cast<int>(l_g);
        int l_blueValue = static_cast<int>(l_b);

        int l_n = 0; // Pick a colour

        *image->data(x, y, 0, ColourChannels::Red) = static_cast<unsigned char>(l_n);
        *image->data(x, y, 0, ColourChannels::Green) = static_cast<unsigned char>(l_n);
        *image->data(x, y, 0, ColourChannels::Blue) = static_cast<unsigned char>(l_n);
      }
    }
    image->save("C:\\Images\\Image2.bmp");

    delete image;
  }
}

The new parts here are of course the RGB being set back into the pixel and the image then being saved.

The output is a very boring looking image...


So now, what kind of things can we do to the value of "int l_n" here to give us greyscale?

Well, there are a few different algorithms and other bloggers do a better job than I will at explaining them...


The simplest is to add the red, green and blue values together, then divide by three to give us the average channel value; assigning this back to all three colour channels creates a shade of grey...


This code, however, looks like this:

#include <cmath>  // for std::round

int main()
{
  Bitmap* image = LoadImage("C:\\Images\\Image1.bmp");
  if (image != nullptr)
  {
    for (int y = 0; y < image->height(); ++y)
    {
      for (int x = 0; x < image->width(); ++x)
      {
        unsigned char l_r = *image->data(x, y, 0, ColourChannels::Red);
        unsigned char l_g = *image->data(x, y, 0, ColourChannels::Green);
        unsigned char l_b = *image->data(x, y, 0, ColourChannels::Blue);

        int l_redValue = static_cast<int>(l_r);
        int l_greenValue = static_cast<int>(l_g);
        int l_blueValue = static_cast<int>(l_b);

        // Average
        int l_Sum = l_redValue + l_greenValue + l_blueValue;
        
        double l_Average = l_Sum / 3.0;

        int l_n = static_cast<int>(std::round(l_Average));

        *image->data(x, y, 0, ColourChannels::Red) = static_cast<unsigned char>(l_n);
        *image->data(x, y, 0, ColourChannels::Green) = static_cast<unsigned char>(l_n);
        *image->data(x, y, 0, ColourChannels::Blue) = static_cast<unsigned char>(l_n);
      }
    }
    image->save("C:\\Images\\Image2.bmp");

    delete image;
  }
}

This is the average algorithm... We could place this code into a function, taking in the three unsigned chars and giving us back the new value, then implement functions for the other greyscale algorithms as we go along... My implementation goes like this, adding the algorithms as a selection:

///<Summary>
/// The types of Greyscaling algorithms
///</Summary>
enum GreyscaleAlgorithms
{
  Average,
  Lightness,
  Luminosity
};

Our function can then look like this:

void ConvertToGreyscale(Bitmap* p_InputImage,
  const GreyscaleAlgorithms& p_Algorithm);
  
And the change inside is to how we calculate the new pixel value before assigning it back to all three channels:

double l_a = 0;
switch (p_Algorithm)
{
  case GreyscaleAlgorithms::Average:
    l_a = Average(l_r, l_g, l_b);
    break;

  case GreyscaleAlgorithms::Lightness:
    l_a = Lightness(l_r, l_g, l_b);
    break;

  case GreyscaleAlgorithms::Luminosity:
    l_a = Luminosity(l_r, l_g, l_b);
    break;
}

int l_n = DoubleToInteger(l_a);

///<Summary>
/// Average Grey
///</Summary>
double GreyscaleHelper::Average(const unsigned char& p_red,
  const unsigned char& p_green,
  const unsigned char& p_blue)
{
  int l_t = static_cast<int>(p_red)+
            static_cast<int>(p_green)+
            static_cast<int>(p_blue);
  return l_t / 3.0f;
}

///<Summary>
/// Lightness grey
///</Summary>
double GreyscaleHelper::Lightness(const unsigned char& p_red,
  const unsigned char& p_green,
  const unsigned char& p_blue)
{
  int l_max = std::max(
      static_cast<int>(p_red),
        std::max(
          static_cast<int>(p_green),
          static_cast<int>(p_blue)));
  int l_min = std::min(
      static_cast<int>(p_red),
        std::min(
          static_cast<int>(p_green),
          static_cast<int>(p_blue)));
  return (l_max + l_min) / 2.0f;
}

///<Summary>
/// Luminosity grey
///</Summary>
double GreyscaleHelper::Luminosity(const unsigned char& p_red,
  const unsigned char& p_green,
  const unsigned char& p_blue)
{
  double l_red = 0.21 * static_cast<int>(p_red);
  double l_green = 0.72 * static_cast<int>(p_green);
  double l_blue = 0.07 * static_cast<int>(p_blue);
  return (l_red + l_green + l_blue);
}

///<Summary>
/// Double rounded to integer
///</Summary>
int GreyscaleHelper::DoubleToInteger(const double& p_Value)
{
  return static_cast<int>(std::round(p_Value));
}

This probably looks like a lot of gibberish, but no, it really is greyscale code... On this page is the complete source code for this portion of my work so far, but here are some examples from my original image...

Note: I can't remember where I got the original image - the internets obviously - so if this lovely bowl of fruit is yours, let me know... I'll give credit.

Average

Lightness

Luminosity