Tuesday 9 July 2024

Code Locality & Concurrent Systems

Let us talk Software Design, let us talk about code locality.  What do I mean by locality?  Well, in Software Engineering we often talk about code being self-documenting, a fabulous place to be if the code performing the work is right in front of you.  But to be honest, systems get pretty big pretty quickly.  So you're very much more likely to be making calls to APIs, or just bunches of functions, which you have to blindly trust unless you have the leisure of digging into them.

And there's usually precious little time in development which is invariably taken up with writing new code, not going over the old (unless you're very lucky - but that's a conversation for another day).

Now, these APIs can encapsulate large features, and they don't always achieve the amount of descriptive power we'd like at the call site we're working in.  I therefore advocate for a comment around that point.

And so we are immediately at odds: we wish for our code to be as accessible, as local, as it can be, but we also want the large unrelated tasks encapsulated elsewhere.  For me this is the dichotomy of code locality in a nutshell.

Where can it come unstuck?  Well, back in the day (and, to be honest, most engineers still think linearly even today) you could be forgiven for writing out your big system block diagram, connecting things with events or signals and just going about your business.  When your code called "BigFooFunction" in the "BarBarBlackBox" library you didn't pay it much mind.  Today, however, as Moore's Law runs aground and we pump ever more cores into a system instead, we have to think in concurrent terms, and it is in just such a scenario that I want to pick up thinking about Systems Design and Code Locality.

Let us perform a thought experiment.  We have a system which progresses from Eggs to Birds.  It controls the state of each entity: laid as an egg, incubated, hatched, then fed and tended until it starts to fledge and ultimately turns into a bird.  All this transmogrification of the state from egg to hatchling to fledgling to bird happens asynchronously in the background, in a big slab of system you do not have to worry about.

All you worry about is a signal coming into you saying "predator", and when this signal arrives you need to stimulate all the little birds you have into action, they all need to take flight.

for(auto& bird : flock)
{
    bird.TakeFlight();
}

This is the locality of our change to the flight property on each bird.  We are absolutely unaware of the individual state each bird can be in, and so we rely on the API and the code backing our call to "TakeFlight".

Now, let us assume the members of "flock" are all of the base type "Bird", so they all have a TakeFlight function?  Well, they might, if Bird looked something like this:

class Bird
{
    public:
        virtual void TakeFlight() = 0;
};

Then at compile time every concrete derived class would have to implement "TakeFlight".

#include <cstdint>

class Egg : public Bird
{
    private:
        uint64_t mTimeLaid;
        float mTemperature;

    public:
        void Incubate();
        void TakeFlight() override;
};

And because we know this is an Egg we know it can't fly and so we know that this override of the function will do nothing.

This is perfect code locality for that derived type, but for Bird itself it leaves us with an open question: what does it do?  Does it do anything?  Should "TakeFlight" return some code to indicate the bird took flight, and an error if not?  But an egg not flying is not an error, so it would have to return some compound result.
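
To make that compound return concrete, here is a minimal sketch of one shape it might take.  The FlightResult enum, the Adult class and the ScatterFlock helper are all my own hypothetical names for illustration, and the flock is assumed to hold its birds by pointer; this is one possible design, not the only one.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical three-way outcome: "cannot fly" is kept distinct from "failed to fly".
enum class FlightResult { TookFlight, NotApplicable, Failed };

class Bird
{
    public:
        virtual ~Bird() = default;
        virtual FlightResult TakeFlight() = 0;
};

class Egg : public Bird
{
    public:
        // An egg not flying is expected, so it is not reported as an error.
        FlightResult TakeFlight() override { return FlightResult::NotApplicable; }
};

// Hypothetical fully grown bird, assumed always able to fly.
class Adult : public Bird
{
    private:
        bool mInFlight { false };

    public:
        FlightResult TakeFlight() override
        {
            mInFlight = true;
            return FlightResult::TookFlight;
        }

        bool InFlight() const { return mInFlight; }
};

// Returns how many members of the flock actually left the ground.
int ScatterFlock(std::vector<std::unique_ptr<Bird>>& flock)
{
    int airborne = 0;
    for (auto& bird : flock)
    {
        assert(bird != nullptr);
        if (bird->TakeFlight() == FlightResult::TookFlight)
        {
            ++airborne;
        }
    }
    return airborne;
}
```

With something like this the call site can tell "nothing to do" apart from "something went wrong" without knowing which derived type it is talking to.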

Now, I am straying into API design somewhat, and they are related fields, but really here we're thinking about where the active code sits: what is its locality compared to our call site in the loop?

And for a function, you can see that we can define the behaviour and know where it lives: in each derived type's override.

Now let us change our example:

class Bird
{
    public:
        virtual void TakeFlight() = 0;  // As before
        bool mInFlight { false };
};

and

class Egg : public Bird
{
    // As above

    public:
        void TakeFlight() override {
            // Intentionally Empty
        }
};

Our egg cannot fly, or can it?  With a public member we're flying somewhat in the wind of a contrived example, but anything can now set the value of mInFlight, so our for-loop upon the predator signal can achieve its aim directly:

for(auto& bird : flock)
{
    bird.mInFlight = true;
}

And this is correct, this loop in review would pass, who can argue?  The bird took flight, it did, it is true.  And this is code locality in action, for at the location we needed it the functionality was available to mutate the value and achieve our goals, no matter what the state of the rest of the system was.

This is a very dangerous place to be.
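
As a contrast, here is a minimal sketch of the same flag kept private, so that only the type itself decides whether flight happens.  The mCanFly member and the Fledge function are hypothetical stand-ins for whatever the background system really does to mature a bird.

```cpp
#include <cassert>

class Bird
{
    private:
        bool mCanFly { false };    // Hypothetical: set once the bird has fledged
        bool mInFlight { false };

    public:
        void Fledge() { mCanFly = true; }

        // The only way to change mInFlight; returns whether the bird is now airborne.
        bool TakeFlight()
        {
            if (mCanFly)
            {
                mInFlight = true;
            }
            // Invariant: nothing that cannot fly is ever in flight.
            assert(!mInFlight || mCanFly);
            return mInFlight;
        }

        bool InFlight() const { return mInFlight; }
};
```

The loop at the call site can no longer write "true" into a bird that is still an egg; it can only ask, and the answer comes back from the code that owns the state.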

The danger grows in an asynchronous system.  Let's say this is not a trivial call; let's say the call site loops through a series of resources and starts them loading, and after that call each one is pending load, but not yet loaded:

for(auto& item : objects)
{
    item.StartLoad();
}

We can assume this code is starting the load, but now let's package this into some context; we like to have our code be self-documenting, after all.

#include <vector>

class Loader
{
    private:
        bool mEverythingReady { false };
        std::vector<Objects> mObjects;

    public:
        void Load()
        {
            for(auto& item : mObjects)
            {
                item.StartLoad();
            }

            mEverythingReady = true;
        }
};

Can you already spot the problem?  The local code here is communicating that EVERYTHING IS READY, when it is anything but.  You have simply started some other action elsewhere.  You have not checked the state, you have not deferred until ready; you have started the load and that is all you know, but your code here locally is communicating something subtly different.
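
For comparison, here is a minimal sketch of a loader whose local code only claims what it knows.  It assumes something the original does not show: that each object can be handed a completion callback.  The Object class, FinishLoad, StartLoading and Items are all hypothetical names, with FinishLoad standing in for whatever the asynchronous system calls when a load really completes.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical resource: StartLoad only records the callback, and some other
// part of the system calls FinishLoad later, possibly from another thread.
class Object
{
    private:
        std::function<void()> mOnLoaded;

    public:
        void StartLoad(std::function<void()> onLoaded) { mOnLoaded = std::move(onLoaded); }

        void FinishLoad()
        {
            if (mOnLoaded)
            {
                auto callback = std::move(mOnLoaded);
                mOnLoaded = nullptr;   // Fire at most once
                callback();
            }
        }
};

class Loader
{
    private:
        std::atomic<std::size_t> mPending;   // Loads not yet completed
        std::vector<Object> mObjects;

    public:
        explicit Loader(std::size_t count) : mPending(count), mObjects(count) {}

        // Named for what it does: it starts the loads, nothing more.
        void StartLoading()
        {
            for (auto& item : mObjects)
            {
                item.StartLoad([this] {
                    assert(mPending > 0);
                    --mPending;
                });
            }
        }

        // Readiness is derived from completed work, never set up front.
        bool EverythingReady() const { return mPending == 0; }

        std::vector<Object>& Items() { return mObjects; }
};
```

The counter is atomic because the completions may arrive from other threads.  Whether a callback, a future or a polling check suits your system better is a separate design choice; the point is only that "ready" is computed from what has actually happened.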

And in huge systems you must not fall foul of this kind of behaviour.  You need your code locally to communicate what it intends and to do as it intends, and if you spot silly public interfaces like this, do not be afraid to fix them.  The bravery to address an issue, if only to raise it with the owner, is a step in the right direction with massive software systems.
