
Saturday, 21 July 2018

C++ : Coding Standards by Example #1 (with Boost Beast)

Today I've spent some time doing something a little different: rather than just playing about in code, I recorded a video of the session as I went... Here it is...


We cover:


  • C++
  • Coding Standards
  • Structured Programming
  • Functional Programming
  • Encapsulation
  • Object-Oriented Programming
  • Building the Boost Libraries (1.67.0)
  • Visual Studio 2017
  • Static Linking

And we generally spend time seeing how I turn gobbledegook code into something usable and maintainable against my own Coding Standards.

Get the code yourself here: https://github.com/Xelous/boostBeast

Monday, 19 December 2016

Programming : Using Boost File System Directory Iterator to Find a File

PLEASE NOTE, THIS POST CONTAINS UPDATES TO PREVIOUS POSTS - BOOST 1.62.0 NO LONGER TAKES THE "MinGW" DIRECTIVE; TO USE MINGW YOU NOW ONLY NEED TO PASS THE "GCC" DIRECTIVE; see section 5.2.2 at this link...

Today I'm going to do some code. This is a code example with Code::Blocks, using the MinGW compiler on Windows, to build the Boost libraries and then use the boost.filesystem library to find a specific file from a given root: first just looking at all the files in that one directory, and then finally making it recurse down the tree of folders.

First things first, let's create a folder to work in. I'm going to call it "FindaFile"; inside this I'm going to create "ext" for external items, which is where I'll place & build Boost, and "src" for source, which is where our project file & source code will reside... Let's get started...

Next...


Extract the Boost library; I happen to be using 1.62.0, which is the current version at the time of writing... I then need a command prompt which knows the location of the MinGW compiler (i.e. we've added it to the PATH environment variable)...


We're preparing to build boost with the mingw toolset...


And then we perform the Boost build, which on this machine is going to take a long time as I only have 2 GB of RAM... Hit that donate button to help me improve some of my machines!


So the path was set, the bootstrap is performed:

bootstrap gcc

and then I start the build

b2 toolset=gcc


Once the build is complete, the first of the two folders we will be using in our Code::Blocks project is the "boost" folder within the boost_1_62_0 folder; this contains all the header files for the Boost library.  Some of the libraries are header-only, so just including this folder in your builds (on gcc with "-I /foldername/") is enough to use them; items like "lexical_cast" are perfect examples of this.


The other folder is the "/stage/lib" folder, which will contain all the built library binaries.  The filesystem library is not header-only: you must link against the "boost_filesystem" library file, and this is where it will reside!

You can leave the boost build running and open Code::Blocks now...


And create a new project, with a little bit of normal code, to check everything is working & that we have set it to use C++11 (I would like to use a newer C++ standard, but I only have this old compiler installed)....




Now, let's set the build options in the project...


I am going to set C++11, all warnings and stop on fatal errors...


Then I am going to set the compiler to look in the boost folder for the headers...


And now we tell the linker where to look for the libraries....


Our final step is to tell our program to use the filesystem library; when doing this you also need to use the system library... So let's switch to the linker settings and insert those libraries...

Now, the file name we are going to add is:

libboost_system-mgw47-mt-d-1_62.a

Let's break this down, from left to right: we are told that this is part of the boost library ("libboost"), that it is the "system" sub-library, that it was built with MinGW v4.7 ("mgw47"), that it is the multi-threaded variant ("mt"), that it is a debug build ("-d"), and that it came from Boost v1.62... You need to know this, because in the previous steps we ONLY set our search directories and settings for the "Debug" version of the project... When you switch to the clean, smaller, faster "Release" build you will need to set everything again!

We then add this library into the linker settings, within the link library section... We need to add both this file and the filesystem file...


The IDE might ask you to add the library as a relative path; this points directly at the file, and I personally do not advise it: as we've set up the "search directories", we need only enter the filename into the library list.

Now, once the boost build is completed in the background, we can use the boost file system library in our code, and check for the presence of a file....


Let's write some code to take a directory to search and a file to search for at the command line, and just output them, as a starting point....


We can go into "Project" on the menu to set the parameters for the program to some useful values... I have added a "Help" function already to point out when a mistake is made... I am going to check my code by running it without parameters, then with bad parameters, and finally with a valid folder name & filename... The targets I am using are "C:\Code", a folder I know exists, and "Program.cs", to look for all the C# program files I have in that folder...

Let's see our program output at this point...


Our first call to the boost library is going to turn the search directory string into a path, which is easier to work with.  You don't strictly have to perform this step: almost all of the boost filesystem functions will automatically convert the strings, cstrings or wstrings you pass them into boost::filesystem::path or wpath instances on the fly.  In the long run, however, it is quicker for your code to convert them into paths once, rather than have the library create and throw away path instances over and over as you make various calls.

We will also need to check that the search directory is indeed a directory to start off from...


Next we need to iterate through the directory and, for each file, check whether its name is a match... I am going to put this into a function straight away, so that whenever we meet a sub-folder we can call into the function again and automatically queue it for searching as well....


And the code for this Search Function looks like this:

void SearchDirectory(const boost::filesystem::path& p_Directory, const std::string& p_Filename)
{
    std::cout << "Searching [" << p_Directory << "]...\r\n";
    std::cout.flush();

    auto l_Iterator = boost::filesystem::directory_iterator(p_Directory);

    // A default-constructed iterator is the "end" point
    auto l_End = boost::filesystem::directory_iterator();

    for ( ; l_Iterator != l_End; ++l_Iterator)
    {
        // This is the type "boost::filesystem::directory_entry"
        auto l_DirectoryEntry = (*l_Iterator);

        // Look for subdirectories, files or errors....
        if ( boost::filesystem::is_directory(l_DirectoryEntry) )
        {
            // Recurse down into the sub tree
            SearchDirectory(l_DirectoryEntry, p_Filename);
        }
        else if ( boost::filesystem::is_regular_file(l_DirectoryEntry) )
        {
            // Regular files are NOT symlinks, or short cuts etc...
            std::cout << "Found File... ["
                << l_DirectoryEntry.path().string() << "]\r\n";

            // So we look for a match!
            if ( l_DirectoryEntry.path().filename().string() == p_Filename )
            {
                // Notice here that we call to get a "path"
                // and then a "string" from that path!
                std::cout << "!!!! Match Found ["
                    << l_DirectoryEntry.path().string() << "] !!!\r\n";
            }
        }
        else
        {
            // Unknown directory entry type
            std::cout << "Error, unknown directory entry type\r\n";
        }
    }

    std::cout.flush();
}

You will notice we receive a "directory_entry" from the "directory_iterator", then we get the "path" out of that, and finally the "filename" from that path... We can output any of them along the way, but we only compare the filename with the search pattern.

If the "directory_entry" was found to be a directory itself we simply recurse into another search.

This code now works....


This completes this little tutorial. If you found it of some use, please follow the blog; if you really, really liked it and want to help me develop more ideas, or suggest more ideas, the donate button and e-mail link are at the top right!

Good Luck!

Friday, 28 October 2016

Administrator : Using Python to Serve Files (HTTP)

The second in my mini-series of how to share storage between machines, easily, we're going to look at using Python as a Simple HTTP Server...

Linux
On Linux, with Python 2 installed (use "python --version" to check) you can simply run:

python -m SimpleHTTPServer 8080

And the current folder will be served up on all network interfaces on port 8080 (without the argument it defaults to port 8000).

This is extremely useful to let some remote machine pull files quickly off of a system, and it's a very good technique to remember when you're developing and deploying, because you can just host your "/bin/debug" or "/bin/release" directory to the remote system, and when your builds complete that remote side can pull the new files or images over.

To do the pulling on Linux, I prefer to use wget, so let's assume the above folder is "/home/xelous/share", inside it is a file "hello.txt", and the IP is 123.0.0.1; this is the wget from the remote machine:

wget http://123.0.0.1:8080/hello.txt

And voila, the file is whisked as a HTTP download across to the remote machine's current folder.

You can write scripts to pull lots of files over and then do builds; use a makefile and you can kick off builds of your code quickly as you carry on working.  This is very useful in my set up, as I have an 8-core laptop I can use to kick builds off on, whilst my local workstation carries on doing another build.  When you're producing ARM kernel builds for two different platforms at the same time, moulding this simple server and wget to your whim streamlines your development speed so, so much!

Windows
On Windows you have to have a command prompt with the path to Python set; let's assume our Python is installed in "C:\Python":

PATH=%PATH%;C:\Python

Then start the server from the "web" folder:

cd \web
python -m http.server 8080

This does exactly the same as the Linux version, except now we're hosted on Windows, and sharing the "C:\web" folder on our server.  (Note that "http.server" is the Python 3 name for the module; on Python 2 it is "SimpleHTTPServer", as above.)

Browser
You can browse straight to both of these servers and just see all the files & folders too, simply browse to: http://123.0.0.1:8080/

Why does this exist?
I had a Windows machine which was on a "secure" network, and from that machine I needed to pull a lot of files over to a Linux workstation.  I had no rights to create a network share on the Windows machine, and I didn't want to copy everything off onto USB or over the network, because I'd have been creating ghostly copies of all the files on those remote and removable storage intermediaries.

So for security and integrity I wanted to get the files as straight from A to B as possible.

The Windows machine had Python installed, so, opening a command prompt, I found the python exe in "/users/myself/AppData/Local/Programs/Python", set the Path as above, then moved to the root of the system and started the server.

On the Linux machine I had a simple Python script which asked the server for "index.html", which was just the file & folder list, and then crawled the downloaded index, calling "wget" on each file, or "mkdir" for every folder... And so I recursed down the tree...

My next post will be that very script... Because I am nice like that!

Security Lesson
To any system administrators out there... This is a loophole on ALL machines running Python; take a look if you need to stop this happening!

Friday, 1 July 2016

Software Engineering : My History with Revision Control (Issues with Git)

I'm sure most of you can tell I'm one of those developers who has been using revision control systems for a long time... So long, in fact, that I wrote a program for my Atari ST, whilst at college, which would span files across multiple floppy disks and use a basic form of LZH to compress them.

Later, when I graduated, I worked for a company using a home-brew revision control system, imaginatively called "RCS".  It basically zipped the whole folder up and posted it to their server, or unzipped it and passed it back to you; there was no way to merge changes between developers.  It was a one-task, one-worker, one-at-a-time system; almost as lacking as my floppy based solution from six years prior.

During my years at university, revision control was not a huge issue; it was NEVER mentioned, never even thought about.  Yet today we live, sometimes happily, in a world where software engineers need to use revision control: not only to ensure we keep our code safe, but to facilitate collaborative working, and to control the ever growing spread of files and the expanding scope of almost all projects beyond the control of a single person.

Now, I came to using professional grade revision control with subversion, in early 2004.  I think we were a very early adopter of Subversion in fact, and we spent a lot of time working with it.

If you've ever taken a look around my blog posts you will see Subversion is mentioned and tutorials exist for it, befitting nearly twelve years of working with it.  And, unlike the comments made by Linus Torvalds, I totally believe Subversion works, and works well.  It is not perfect, but I find it fits my ways of working pretty well.

Perhaps after twelve years my ways of working have evolved to fit Subversion and vice versa, but whatever the situation, I'm currently being forced down the route of using git a lot more.

Now, I have no issues with git repos when working with them locally; ALL my issues are with using git remotely.  Firstly, the person in the office who elected to use git started off working alone: he was creating a compiler project in C#, so he just had it all locally and used Visual Studio plug-ins to push to a local repo, and all was fine.

I've used git with local repos without problem.

All the problems come with pulling and pushing, with remotes, and controlling that access.  Git intrinsically fails to protect access to the repo easily, relying instead on the underlying operating system.  Which is fine when you have a controlled, easy to manage user base, as with a Linux server; however, with the minefield of integrating with Active Directory, domains and whatever else on Windows based infrastructure, nothing but problems come up.

The next problem I've had with git has been the handling of non-mergeable files.  We have lots of digital files: movies, sounds and plenty of graphics.  As such we've had to work around git by having people work on files one at a time, and cross reference which files they are responsible for.  With an art crew of five people, this means a flip chart or white board constantly lists the media files, with someone's initials next to each, just to help control access.

"Surely git should be able to lock these files", they constantly cry.  No; how can it?  How can a distributed control system manage locks across five or more repos which are not talking to one another?  And if you did elect one to be considered the master, how do you then transmit out to the passive clients every time you lock or release a file?  You can't; the artists would each have to remember to pull, or shout to each other to pull now!  It simply doesn't work.

And as a way of working the white board is pretty poor, but it's all we have right now.

The next problem we had was the massive amount of disk space used by the repos.  We boot our machines off of very small (128 GB) drives, then use either NAS or SAN for our main storage.  This was fine and efficient, and critically it was all well backed up on the infrastructure we use; it worked for twelve years with Subversion.  However, with git our huge files are constantly being snapshotted, and this growth in the size of the overall repo replicates files over and over and over.

In short, despite someone else, and the world at large, turning their backs on Subversion, we here in my area are strongly drifting back to it.

Trouble is, it feels as though we're swimming against the tide: despite all these slight deficiencies in git, the overall organisation, and even external projects I'm working on, are pushing git.  Torvalds himself calls people still working on Subversion "brain dead".  But has he thought about the shortcomings?  Or these case-studies we can give where Subversion is a better fit for our working style?

Above all this wrangling has been my problem expressing our situation with git to both the initiated and the uninitiated.  When talking to advocates of git, all sorts of acronyms, actions and comments are made: "use git this", "use git that".  The problem being, there are something like 130+ commands in git; that's a huge number of things you can work with.  But we can break down what we've done as "git init", "git checkout", "git add", "git commit", "git push", "git pull" and "git status" (as I've said, merging utterly failed, so I'll gloss over that right now).

Given this huge scope of possible usage, and such small exposure, it's hard to put words to why things were not a good fit with git; the initiated always seem to argue "you didn't give it a good crack of the whip".  But we don't work in an environment where we can try one thing and then another; it's an old working structure, which has evolved over time, and people are used to it.  I'm nearly 40, yet I'm the youngest guy here!  Training those around me in new ways of working is very much an uphill struggle, so, when introducing something as alien to their mindset as git, it was always a losing battle.

To express this to the uninitiated is even harder: they don't know what an RCS does, nor what we mean by centralised or distributed control; they just want to see our work kept safe, and our work released to the customer.  Gripes about git and Subversion make no inroads with them; they're just unimpressed when you explain that these solutions are both open source and have no support.  The fact that they're free has been wildly ignored, yet I could, for the price of the support contract of another system here, easily buy and operate a whole new SAN just for our needs!

Luckily for me, after struggling with this issue, I ran across Peter Lundgren's post on the same topic, of expressing what's wrong with git.  He doesn't advocate Subversion, or anything else, over git; he just lists the problems he had with git, and he crosses much of the same ground I have.