Tuesday, 21 October 2025

On The Cleveland Bay Breed

Viewers of my YouTube channel will know I have horses: we lunge, ride, trap and carriage drive, and generally really enjoy them, keeping them and ourselves entertained.  They are pets and a massive part of our lives.

The eagle-eyed of those viewers will have noticed that over the last two years we have been training our Cleveland Bay, who arrived from the breeders Mr & Mrs Orange at Kenley Cleveland Stud.  His name is Nova.... or Kenley Supernova, and we had to join the Cleveland Bay Horse Society to own him.


He is that rich bay colour with the striking black socks on each leg and a strong black mane and tail, the absolute standard for the Cleveland Bay breed; accept no less!  He arrived with some of the biscuit fringe colours of a foal, and he has been slightly behind on his teeth changing from baby to adult, but otherwise he is a big boy, today measuring at least 16.3hh (hh = hands high).
 
Nova's character is a quirky one; for the most part he is all over humans, investigating you and the things around him... Being very interested, and unlike many horses he is mindful: he doesn't just blindly panic but instead thinks about the challenges he encounters.

Again this is a trait of his breed, to be mindful and steady-tempered, and it is one of the reasons they are so popular in the driving role today; notably for the Royal Family.  If you see a bay-coloured horse with the black mane, tail & socks pulling a carriage for the King, well, it'll likely be a Cleveland Bay.
 
When we started training Nova it was literal baby steps, head collar training, leading him around the field, past various bric-a-brac:
 

Since that video was taken I have lost a bunch of weight.... Nova has not; he's grown and grown, his shoulders coming out and strengthening right now in October 2025.  You can see here his withers are just around my shoulder height; today they're up around my ear and he is just such a much bigger animal.
 
Once the head collar training was done we moved onto a bridle, at first with a plastic bit, but quickly he had an upgrade to steel, with a soft articulation in the middle:
 

All of this training was done in-house by ourselves; we'd done much of this before with Gerty, but never with such a young horse, still growing, so it has been fascinating to see his mind growing with his skills and size.
 
The next step was to decide how best to approach him into driving.... and long reins were the best way to move forward.  With the help of a local trainer we began introducing him to new equipment:
 
 

Two sessions later and it was my turn to keep this practice up.... As you can see by the state of the grass this was in the burning hot summer of 2025.

So it was I found myself behind him rather than leading him...
 

Which is a very alien thing for a horse, to experience commands from literally out of sight as you walk in their rear blind spot.  However, the Cleveland Bay genes run strong in him and he was comfortable with this extremely quickly as his quality shone.
 
Graduating to road practice soon after, and I took him out without our trainer just this very morning...
 

All of this is not just about owning a horse, nor just owning a Cleveland Bay.  So many people claim their horse is a CB that it's hard sometimes to bite one's tongue, as a true CB is a very special animal: you have to have the rare breeds passport, Cleveland Bay Horse Society membership and all the bloodlines clearly defined to state your horse is a CB.
 
And the bloodlines are very important.  Mr & Mrs Orange, who bred Nova, are massive figures in the breeding programmes, advocates for the breed and the CBHS.  Nova was gelded without saving any semen from him; sometimes I am sad about that, but we simply could not have had a stallion at our yard.
 
So instead of spreading his genes I write this post to spread the true word about the Cleveland Bay.  Our experience with Nova has been so rich and rewarding, and we have decades to come with him.
 
He will drive out pulling a carriage, and we hope to ride him too, weight loss programmes and willpower willing!
 
He has been an absolute powerhouse of mental development; horse-wise, he is very clever.
 
And so I bring my post to a close with this one from Mrs Orange:
 


 

Saturday, 18 October 2025

On the Topic of "Smart Pointers" and the Factory Create Pattern

This is a topic I have visited a few times over the years:

2016, 2016 on SDL2, 2019, and 2023 on the control block; and I visited it professionally on the now-cancelled AAA MMO game engine with Zenimax Online.  The specific pattern within came from my original 2016 post, just with a few tweaks and expansions over the years.

It's just one of those things which comes up time and again in the C++ space, much like my crusade against returning early in performance code... cough.

And it came up again just this week: given a coding test, I immediately did not like seeing raw pointers being used, so I quickly refactored it to use a factory create pattern, except I was in a rush and didn't have the luxury of a whole game engine backing me up.

So, denuded of even my own post, I quickly used a different approach: a hidden (private) member struct in the constructor.  The constructor remains public so std::make_shared or std::make_unique can access it without hindrance from the static Create factory member function, but a regular user cannot use a flat "operator new" as they cannot access the private struct.... Sounds complex, but it's really not, and you will even find this exact pattern on C++ reference sites across the web.  Let us quickly output an example (and I'll assume you've taken a look at my prior post from 2016 above for the actual "shared pointer friendship" I would prefer).

#include <memory>

struct Thing
{
private:
    struct HiddenInternal {};    // Just an empty struct

public:
    using Ptr = std::shared_ptr<Thing>;

    static Ptr Create()
    {
        return std::make_shared<Thing>(HiddenInternal{});
    }

    explicit Thing([[maybe_unused]] HiddenInternal internal)
    {
    }
};

And so with this code we can only get a Ptr instance here from the static Create call.
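To make that gate concrete, here is a quick usage sketch; the class is restated from above so this stands alone, and the commented-out line shows the construction route the pattern forbids:

```cpp
#include <memory>

struct Thing
{
private:
    struct HiddenInternal {};    // private tag, the "key" to construction

public:
    using Ptr = std::shared_ptr<Thing>;

    static Ptr Create()
    {
        return std::make_shared<Thing>(HiddenInternal{});
    }

    explicit Thing([[maybe_unused]] HiddenInternal internal) {}
};

// Thing::Ptr good = Thing::Create();               // fine
// Thing* bad = new Thing(Thing::HiddenInternal{}); // will not compile:
//                                                  // HiddenInternal is private
```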

Now I've set the scene, what am I asking myself?  I'm asking what is the compile time and what is the runtime effect of these two patterns?

Compile-time, my suspicion is that the shared-pointer friendship stuff in my original post is more weight to carry: there's a macro in my more complex implementations, so that's pre-processor overhead, and there's expansion of the code at compile time.

But there's also this HiddenInternal in this new, simpler implementation; I believe there will even be compiler elision of the HiddenInternal as it's empty and does nothing; I just need to observe and understand this topic better.

Let's start with Compiler Explorer and my first confirmatory discovery: just as I suspected, the compiler is very smart and it will easily see a trivial class such as mine above, or one with a single member, and fold it away to nothing through elision.
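The emptiness the compiler is exploiting can even be checked directly; this is just the tag type from the example above, interrogated with standard type traits:

```cpp
#include <type_traits>

struct HiddenInternal {};    // the empty tag type from the pattern above

// An empty class still has a nominal size of one byte...
static_assert(sizeof(HiddenInternal) == 1);
// ...but it carries no state and a trivial constructor, which is what
// lets the optimiser fold the parameter away to nothing.
static_assert(std::is_empty_v<HiddenInternal>);
static_assert(std::is_trivially_constructible_v<HiddenInternal>);
```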

And this is a great thing!  It proves there is zero runtime overhead whilst we have code fully communicative of the intent.  Self-documenting code is a wonderful thing, and this is an interesting and eye-catching pattern in the code.... Such that I believe even a new set of eyes meeting it would appreciate there is something special intended, and they would take care to use the class properly, avoiding memory leaks for us along the way.

Good code health is as important as the final result; I especially take pride in code which can run for a long duration.  I've always worked on long-lived project timelines.

So HiddenInternal can result in no overhead when the class is trivial, what about when the class is complex?

The first thing I notice is that the factory creation function itself generates no code, yet we benefit from its intent at compile time and in the usage.  There is of course the control block creation for the shared pointer in this instance, and I hold that in my mind, but the pay-off in intent of the code far outweighs the control block.

The constructor itself is interesting.... 


Removing the HiddenInternal gives no difference, so the compiler has still removed the empty class, but enforces its use in the code... This is a total win-win... The cost at compile time: one elision in one translation unit, which is negligible.

The runtime overhead zero.

What about the preprocessor and friendship.... The friend keyword itself is zero cost; the wrapping of it in a macro is therefore arguably a different conversation to have.  Resolving a friendship however is not so cost-free: the friend declaration adds a symbol, typically at a cost of O(1).

The look-up from any other resolve is therefore either going to be O(1) or O(log N), where N is the number of other appearances of the name; this look-up is going to be the compile-time cost.  In practical terms this is going to be a very small duration, perhaps even below measurable in the pantheon that is a build across multiple cores.

A friendship link itself however costs nothing in code generation time, so we do not hold up the compile per se; we do however add a cost to the symbol table build.

Once something is in the symbol table it will add complexity when refactoring code, potentially triggering more rebuilds if the friendships, or the classes within them, change.

In conclusion, if I have the option to set up a friendship to the underlying control block (see 2016) I would, but equally, to keep the local complexity low, I would also consider the hidden private member class as a "trick".

Friday, 10 October 2025

XVE Game Engine: Thinking about AOE Damage Effects

I'm back on my home game engine and I need an area damage effect which can apply to all the members of a team/party.... The trouble?  How to synchronize this over a network connection.

First of all, I know for certain that the server running my game world knows the center of the effect, and it knows the radius of the effect.  And it ticks the effect application against each player whether or not their client can visually see the effect - it appears in their combat log text immediately.

So the mechanics are there.  The clients, however, all run at different rates; they're in different locations worldwide, with different latencies to the server's source of truth.... And so I now need to think about how I am going to tackle the issue of each player seeing the effect.

Simplicity first: the effect is just going to be represented by a wireframe spheroid; it'll be a smooth-shaded, sparkle-dripping effect later.

This quiets my engine-engineer smooth brain somewhat, and I now just need to synchronize the expansion from the center to the maximal radius of the effect.  This will turn up at the clients from the server at any time after the effect is registered as affecting the players on the server... This may be unfair; however, there is reason behind it in the design - the player will know the effect is going to be cast as there is pre-warning from the caster... They need to learn that and move before they see the effect, basically.... That is, they need to be smart.... I know, I know, this is a cheap answer, but I am all about cheap answers to problems today - and in my gameplay loop, seeing the caster "spit" this effect and moving before it expands into a cloud of hell... is pretty rewarding.

So I have a time point: on all clients they know when the effect begins, they know the maximal extent and they know the center of it....

What I am thinking about now is the client side only, so without any network traffic: an entity definition for the effect which the client has pre-loaded into some dictionary of effects to execute.  This contains a rate per millisecond from the start time, and the effect system has to linearly interpolate along that line.
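As a sketch of that idea (EffectDef, RadiusAt and the field names are hypothetical placeholders of mine, not engine code), the interpolation reduces to a clamped line from the start time:

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical pre-loaded client-side definition from the effect dictionary.
struct EffectDef
{
    float maxRadius;    // maximal extent, known up front
    float ratePerMs;    // expansion rate per millisecond
};

// Radius at a given number of milliseconds after the server's start time:
// a plain linear interpolation along the line, clamped at the maximum.
float RadiusAt(const EffectDef& def, std::uint64_t elapsedMs)
{
    const float linear = def.ratePerMs * static_cast<float>(elapsedMs);
    return std::min(linear, def.maxRadius);
}
```

An easing curve would then slot in as a function applied to the elapsed time before the multiply, rather than feeding it in raw.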

I am thinking about designing some form of easing curve authoring for the client effect system - and I'll likely reuse this in the animation system - whereby I can try different easing curves.


The system interpolating along the line could be more than a lerp too; again, that'll come down the line.

Replaying, reversing or otherwise manipulating such a defined effect is going to be quite interesting, as reversing the delta will allow me to contract it again.... Though timeline ordering is to be forwards chronologically only for my gameplay, that may be a feature I want to play with in future.

What's the problem then?  Yeah, it sounds like I have a handle on what I want, except.... I've assumed all the way through my thinking that the effect is fixed, known and defined up front.... The problem?

I want to scale the effect with the power of the caster, and this is not known up front.  Sure, I can just add a float and multiply by 0.5 for half power, or anywhere along that scale; heck, I could even scale the enemy's power by some easing function as well and combine the power function along that line with the rate of the effect.... So there are answers to be had.

My problem is randomness: when it's random, things look wrong; they don't feel right when playing them, and the rewarding feeling I had from the fixed-function gameplay doesn't turn up... It's in some sort of metaphysical uncanny valley which, I can't quite explain, doesn't feel right.

I will have to play about some I feel.... 

Thursday, 9 October 2025

A Thought On Poles & Swingletrees

I was intrigued recently by a video from LindyBeige about Roman War Waggons, during which he brought up that some folks argue about how a horse might have been tacked to a wagon in that era.

Whereby some etchings show only a single "pole" attaching a wagon to oxen or horses.

I looked at the image and wondered perhaps if it was a full length trace, which would be on the wagon frame, but he was totally right about the perspective and the image clearly being fanciful.

Today, therefore, I was having a think about whether swingletrees would have been in use; they certainly are in ploughs and ox-drawn implements from the medieval period; I've read and seen etchings which come from contemporary sources.

My driving position has a single articulated pole, but this is not for the driving power; that is just for direction.  The swingletrees, to the horse's traces, are the power.... so there's a very intricate interplay between all of them to keep things in balance....

And to be honest.... I reckon that, in some form, this would have existed just a few years after people started to use horses, let alone by the turn of the Common Era, millennia later.


 

Tuesday, 7 October 2025

Over the Fence.... And Far Away....

No, I'm not talking about horses this time.... Let's talk engineering: what is a fence?  A memory fence, or barrier, is used to order reads and writes around certain operations called atomics.

This ordering ensures, as a minimum, that communication of the value from the register in the processor is correct between all the cores/threads running and the cache within the chip, and it may even lead to a full flush to main memory (though that takes far longer in cycles).

These can take tens to hundreds of cycles to leap the value over the barrier.

In C++ the two most common, or certainly those I encounter most often, are fetch_add and fetch_sub, using them to control everything from the shared count on a resource to the control gating on fixed-size "high performance" containers.
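For the shared-count case, a minimal sketch of what those two calls look like in practice (the names are mine, not from any particular library):

```cpp
#include <atomic>
#include <cstdint>

std::atomic<std::uint64_t> refCount { 0 };

std::uint64_t AddRef()
{
    // fetch_add returns the value *before* the increment.
    return refCount.fetch_add(1, std::memory_order_relaxed) + 1;
}

std::uint64_t Release()
{
    // Acquire-release so the final releaser sees all prior writes
    // to the resource before it tears the resource down.
    return refCount.fetch_sub(1, std::memory_order_acq_rel) - 1;
}
```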

And therein lies the rub: these operations cost a lot of cycles just to increment or decrement a counter in many of my use cases, so why use them?

Well, that barrier within the chip, between the memory, is very expensive if we compare it with the very cheap increment or decrement of a register itself.  Just the value in the register on the chip can change in a single instruction; sure, it took others to load the register and it'll take yet more to store the result off, just as it would with the atomic; but on top of that you have no overhead in comparison with the atomic....

Until... until you try to synchronize that flat increment or decrement.  Sure, that code is going to be far faster; however, it's not thread safe, not at all, whereas the atomic already is (when ordered correctly)...

In order to protect a flat operation one therefore has to wrap a separate lock, or mutex, around it, which is far, far more costly than the atomic operation.  This difference is called the "contention cost"; the contention cost of an atomic is simply in the number of steps.  Let's look at code:

For the atomic addition, the CPU itself will execute

lock xadd [or similar]

This is itself a single instruction; it may take multiple CPU cycles to complete, but it is a single instruction.  It ensures unique ownership of the cache line (usually 64 bytes) within which this variable resides, which means if you perform operations anywhere in that cache line you will be making optimal operations.  As the CPU can perform all the atomic updates in that 64-byte block without having to fetch another, this is really useful when there are a few cores (2-8 on average) accessing an area of memory, and it even holds up when scaling out to more cores.
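The flip side of that cache-line ownership is false sharing: two hot counters that happen to land on the same line will fight over it even though they are logically independent.  A hedged sketch of keeping them apart (names mine):

```cpp
#include <atomic>
#include <cstdint>

// Two counters hammered by different threads.  Without the alignment they
// could share one 64-byte cache line, and every atomic update on one would
// steal exclusive ownership of the line from the other ("false sharing").
// alignas pushes each onto its own line.
struct Counters
{
    alignas(64) std::atomic<std::uint64_t> produced { 0 };
    alignas(64) std::atomic<std::uint64_t> consumed { 0 };
};
```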

A mutex however, has to be controlled wholly separately from the increment, so we may end up with C++ such as this:

std::mutex lock;

uint64_t count { 0 };

{
    std::lock_guard<std::mutex> lockGuard { lock };
    ++count;
}
The execution here will have to acquire the mutex in a harsh manner; internally this is an atomic.  If the pathway here is lock-free, then the atomic operation underlying the mutex is the only added cost.  However, and this is a HUGE however, if there is contention - someone else already has the lock - then this lock guard has to spin or wait... And it's the contention, another thread holding the mutex locked, which adds the cost.

So you're essentially gambling on whether the value is uncontested before the lock or not, and in both cases you take on the cost of an atomic operation; so for my codebase and its uses across sub-32-core machines, an atomic is much more efficient in almost all my use cases.

A mutex however is far more useful when protecting more than a single register: when protecting a whole block of memory, a shared physical resource (like a disk) or just a complex structure, you can only use a mutex around it.
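A minimal sketch of that multi-field case (names mine): here the mutex holds two members consistent as one unit, which no single atomic could.

```cpp
#include <cstddef>
#include <mutex>
#include <string>
#include <vector>

class NameRegistry
{
public:
    void Add(std::string name)
    {
        std::lock_guard<std::mutex> guard { m_lock };
        m_names.push_back(std::move(name));   // these two mutations stay
        ++m_count;                            // consistent under one lock
    }

    std::size_t Count() const
    {
        std::lock_guard<std::mutex> guard { m_lock };
        return m_count;
    }

private:
    mutable std::mutex       m_lock;
    std::vector<std::string> m_names;
    std::size_t              m_count { 0 };
};
```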

All this sort of eluded me earlier this evening.  I was in the middle of a technical conversation and I brought up the atomics backing a shared_ptr in C++ and immediately just sort of lost it; my memory drifted far, far away and I have to admit to waffling some.

I even forgot about weak_ptr deriving from a shared_ptr and its use to "lock" a new copy of the shared_ptr, and so observing a resource through weak pointers without holding ownership of it.
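A quick reminder-to-self sketch of that weak_ptr dance:

```cpp
#include <memory>

// weak_ptr observes without owning; lock() yields a fresh shared_ptr
// (bumping the strong count) only while the object is still alive.
bool TryUse(const std::weak_ptr<int>& watcher)
{
    if (std::shared_ptr<int> borrowed = watcher.lock())
    {
        return *borrowed == 42;   // safe: we briefly co-own the int
    }
    return false;                 // owner already gone, no dangling access
}
```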

But it came from a very specific conversation about Unreal Engine, about TSharedPtr... Not a structure I myself have used, and for the life of me I could not think why not; I just knew not to, having been told...

And of course here I sit a few hours later and I know why: it's not thread safe... TSharedPtr in Unreal is not thread safe in its default mode, and why not?  Well, because it does not protect its internal reference count with an atomic; no, it's just a flat increment and decrement of a count integer "for performance purposes".

So sure, if you're inside one Unreal system, on one thread, then yeah, you can use the TSharedPtr, but its utility is much reduced to my eye, and you would perhaps want to look at other ways to hold your resources in that thread, even in thread-local storage rather than in the engine heap.

The moment that TSharedPtr crosses a thread barrier, you're far, far away from thread safe.

So what do you use a TSharedPtr for?  The documentation says "to avoid heap allocation for a control block where possible"... Yet it contains an explicit reference count, which is a control block, and underlying it is a flat "operator new" and it uses only default deletion via the flat operator delete.... So my non-Unreal-expert brain says "Why use it at all?".

Hence when asked earlier today my memory was over the fence and far, far away.... Of course now it's all returned and... yeah, I just sat through two hours of TV mulling this over.... and had to come write something down before I went a little bit odder than usual.

Tomorrow let me regale you with my story about forgetting how to load a crash dump and the symbol files, and explain myself properly, despite having done that particular duty about a hundred thousand times in my life.  Hey ho, a technical conversation in which I fouled up; one is only human.