Tuesday, 14 July 2015

DOD-ECS: Decoupling Systems is good for Concurrency, and good for Cache

FireStorm ECS implements a kind of distributed messaging service, fed by a central Dispatch mechanism: each System provides a Message Queue that receives Messages sent from other Systems and addressed to one or more of its Components.

Just before each ECS System is Updated, any Messages queued for that System are Delivered, meaning that this is the point where they are actually Processed.
Messages are the primary way for the ECS Systems to send data to other Systems.
Since each System is updated separately, and has its own Message Queue,
the Message paradigm effectively decouples Systems from one another, which is good for Concurrency. The implications are that it's safe for multiple Threads to update different Components owned by the same System, and it's always safe to Read from Components of another System. No locking of any kind is needed, other than a 'barrier' to ensure that only one System is being updated at any moment in time.
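
To make the deliver-then-update idea concrete, here is a minimal C++ sketch. It is not FireStorm's actual code: the names `Message`, `System`, `post`, `deliver`, `update` and `step` are all hypothetical, the components are plain ints, and the central Dispatch mechanism is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical message: add `delta` to the component at index `target`
// in the receiving System.
struct Message {
    std::size_t target;
    int delta;
};

// Hypothetical System: owns its components and a private message queue.
struct System {
    std::vector<int> components;
    std::vector<Message> inbox;

    // Other Systems call this instead of writing to `components` directly.
    void post(const Message& m) { inbox.push_back(m); }

    // Deliver: apply every queued write, then clear the queue.
    void deliver() {
        for (const Message& m : inbox) components[m.target] += m.delta;
        inbox.clear();
    }

    // Update: one linear pass over the components (here it just doubles them).
    void update() {
        for (int& c : components) c *= 2;
    }

    // One frame step for this System: messages first, then the update pass.
    void step() {
        deliver();
        update();
    }
};
```

Posting a message leaves the target component untouched; the write only lands when the owning System runs `step()`, so nobody else ever writes into that System's data.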

System Updates are the primary cause of messages getting sent to other Systems.
Since each System processes its Messages before updating its Components,
we usually want to send Messages to Systems that are of lower priority (are yet to be Updated).
The reason is that Messages sent to higher-priority Systems (which have already been Updated this Frame) won't be processed until the next Frame. So the trick is to arrange your System updates rationally (which I impose through ordered system ids).
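
The effect of update order on message latency can be shown with a small C++ sketch. It assumes just two Systems updated in id order, and the names `Sys` and `run_frames` are made up for illustration, not taken from FireStorm.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Sys {
    int value = 0;           // stand-in for the System's component data
    std::vector<int> inbox;  // queued writes from other Systems
    int frame_applied = -1;  // frame in which a message was last applied
};

// Runs `frames` frames over two Systems in id order (0 first, then 1).
// During frame 0, each System posts one message to the other one.
// Returns the Systems so the caller can see when each message landed.
inline std::vector<Sys> run_frames(int frames) {
    std::vector<Sys> systems(2);
    for (int frame = 0; frame < frames; ++frame) {
        for (std::size_t id = 0; id < systems.size(); ++id) {
            Sys& self = systems[id];
            // Deliver queued messages before this System's update.
            for (int delta : self.inbox) {
                self.value += delta;
                self.frame_applied = frame;
            }
            self.inbox.clear();
            // Update: in the first frame only, post to the other System.
            if (frame == 0) systems[1 - id].inbox.push_back(10);
        }
    }
    return systems;
}
```

After `run_frames(2)`, the message sent from System 0 to System 1 (which is yet to be Updated) is applied in frame 0, while the message sent from System 1 back to System 0 waits until frame 1.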

So - why have a Message QUEUE at all? Why not just process ECS events as soon as they occur?
Messages are used to convey data from a Component in one System to a Component in another System, usually so that the receiving System can Write that data into its own Component.
They are a means of implementing 'deferred-write' thread safety!
Is concurrency the only benefit to be gained from buffering inter-System messages?

We also benefit in at least one other important way: we get a much better data access pattern,
leading to much improved cache performance, and poor cache usage is perhaps the number one performance killer in modern games.

Think about it: if a System processed ECS events as we generated them, we would be immediately touching data
owned by other Systems in a non-linear fashion, which implies lots of cache misses.
We would later go on to process that System, by which time any cache warming we might have gained has likely been lost.

Rather, if each System first deals with messages sent to its components, then updates its components,
we get the full benefit of cache-warming just before our full linear component update pass.
If we're careful, the overhead we pay for this should be a lot less than the performance gained through
the application of cache-friendly data access patterns.
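
As a rough illustration of the difference in access pattern, the C++ sketch below uses a toy model, not a measurement. It records which System's storage each access touches and counts how often consecutive accesses jump from one System to another. The names `count_switches`, `immediate_trace` and `deferred_trace` are hypothetical, and the model ignores the cost of appending to the queue itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how often consecutive accesses jump between Systems' storage.
inline int count_switches(const std::vector<int>& owners) {
    int switches = 0;
    for (std::size_t i = 1; i < owners.size(); ++i)
        if (owners[i] != owners[i - 1]) ++switches;
    return switches;
}

// Immediate: System 0 updates component i, then at once writes into
// System 1's data. System 1 runs its own update pass afterwards.
inline std::vector<int> immediate_trace(std::size_t n) {
    std::vector<int> trace;
    for (std::size_t i = 0; i < n; ++i) {
        trace.push_back(0);  // System 0 touches its own component
        trace.push_back(1);  // ...and immediately touches System 1's data
    }
    for (std::size_t i = 0; i < n; ++i) trace.push_back(1);  // System 1 update
    return trace;
}

// Deferred: System 0 updates everything and only queues its writes.
// System 1 then applies the queue and runs its update pass back to back.
inline std::vector<int> deferred_trace(std::size_t n) {
    std::vector<int> trace;
    for (std::size_t i = 0; i < n; ++i) trace.push_back(0);  // System 0 update
    for (std::size_t i = 0; i < n; ++i) trace.push_back(1);  // deliver messages
    for (std::size_t i = 0; i < n; ++i) trace.push_back(1);  // System 1 update
    return trace;
}
```

In this model, with n components the immediate version jumps between Systems 2n - 1 times, while the deferred version jumps exactly once, and the message delivery sits right next to the update pass that reuses the same data.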
