cross-posted from: https://programming.dev/post/1086370

This time on my arbitrary blog: Entity component systems.

Also, highlight.js should degrade more gracefully without JS activated than last time. Note that I can’t process syntax highlighting in my build step, because I don’t have a build step.

EDIT: improved phrasing

  • For inspiration, I recommend looking at bevy. It does a great job with efficiently querying for components in a type-safe way. From their book, this is an example of what a system looks like:

    fn greet_people(query: Query<&Name, With<Person>>) {
        for name in &query {
            println!("hello {}!", name.0);
        }
    }
    

    Name and Person are both components.

    You might think that a scripting language for ECS-based systems is pointless. And in the majority of cases you would be right.

    This comes as a surprise to me. I think scripting capabilities would be incredibly useful, especially if you can hot reload the scripts during development time. For core engine functionality, this might be less relevant, but for the gameplay itself it could be really nice.

    You may wonder why the entity (usually some sort of ID) is part of its component. It wouldn’t be too convenient if we had to query and pass around entities manually.

    This one I’m curious about. How common is the case when you need to operate with the entity directly, and is it worth the cost of duplicating the entity ID onto each component? In bevy’s case, you can query for the entity ID like you would a component using Entity, which I’ve found to be easy enough to do.
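
    A minimal sketch of that idea in plain Rust (this is not bevy’s actual API; `NameStorage` and `query_with_entity` are made-up names): the entity ID is the storage key, so components never carry a copy of it, and a system only receives the ID when it asks for it.

    ```rust
    use std::collections::BTreeMap;

    /// Hypothetical entity ID; real engines usually add a generation counter.
    #[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
    struct Entity(u32);

    #[derive(Debug, PartialEq)]
    struct Name(String);

    /// Hypothetical storage: the ID is the key, so the component does not store it.
    #[derive(Default)]
    struct NameStorage {
        names: BTreeMap<Entity, Name>,
    }

    impl NameStorage {
        fn insert(&mut self, entity: Entity, name: &str) {
            self.names.insert(entity, Name(name.to_string()));
        }

        /// Yields the entity ID next to the component, in ascending ID order.
        fn query_with_entity(&self) -> impl Iterator<Item = (Entity, &Name)> + '_ {
            self.names.iter().map(|(entity, name)| (*entity, name))
        }
    }

    /// Example system: builds one greeting per entity that has a Name.
    fn greet_people(storage: &NameStorage) -> Vec<String> {
        storage
            .query_with_entity()
            .map(|(entity, name)| format!("hello {} (entity {})!", name.0, entity.0))
            .collect()
    }
    ```

    Systems that do not care about the ID could iterate over the values only, so nothing is duplicated per component.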

    function click_handler(boundary: BoundaryComponent) {
         -- This creates a new bubble component for the entity stored in boundary.
         boundary with new BubbleComponent("Hi at {boundary.x}, {boundary.y}!");
    
         -- From here on we can use boundary with functions/methods/members
         -- registered for both components, because the type system knows.
         boundary.show_on_top();
    }
    

    Does this mean that each entity can have any number of components of a particular type in this implementation? Would each component also need its own ID to distinguish between two components of one type on the same entity?

    Another option here is that instead of creating a BubbleComponent that’s part of the same entity as the BoundaryComponent, it might make more sense to create a new entity that’s at the same position as the BoundaryComponent’s entity, possibly using some kind of relative transformation to keep the bubble at the same position as the boundary.

    The next example which seems to create two bubbles on the same entity is just as confusing to me. If I were to query for those bubbles, would I iterate over each of those bubbles in separate iterations, but all other components that are part of this query are the same? What about in situations where two or more component types have multiple instances per entity, like if one entity had two BubbleComponents and two BoundaryComponents? Would I iterate over the permutations of those entities, so (bubble 0, boundary 0), (bubble 0, boundary 1), (bubble 1, boundary 0), (bubble 1, boundary 1)?
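
    To make the question concrete, here is a rough sketch in plain Rust of what iterating over all combinations (strictly speaking, the Cartesian product) would mean for a single entity. `MultiEntity` and `pair_indices` are hypothetical names, not something from the article.

    ```rust
    #[derive(Debug, Clone, PartialEq)]
    struct BubbleComponent(String);

    #[derive(Debug, Clone, Copy, PartialEq)]
    struct BoundaryComponent {
        x: f32,
        y: f32,
    }

    /// Hypothetical entity that may hold several components of the same type.
    struct MultiEntity {
        bubbles: Vec<BubbleComponent>,
        boundaries: Vec<BoundaryComponent>,
    }

    /// Returns (bubble index, boundary index) pairs in the order a
    /// combination-style query would visit them.
    fn pair_indices(entity: &MultiEntity) -> Vec<(usize, usize)> {
        let mut pairs = Vec::new();
        for bubble in 0..entity.bubbles.len() {
            for boundary in 0..entity.boundaries.len() {
                pairs.push((bubble, boundary));
            }
        }
        pairs
    }
    ```

    With two bubbles and two boundaries this yields four iterations for one entity, and the count multiplies with every additional multi-instance component type in the query.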

    I like the ideas presented in the article otherwise. I vaguely remember TypeScript having some sort of reflection support that Angular took advantage of or something. I wonder if that can be used to create a scriptable ECS like proposed in this article?

    • I’ll split this up…

      For inspiration, I recommend looking at bevy.

      I did; it’s just that I do not consider programming in Rust scripting. Scripting is kind of a vague term, I admit, but to me, it has to fulfill roughly the following criteria:

      • fast to no compilation time
      • doesn’t need the big SDK or setup that was required to build the engine
      • can be handed over to some graphics designer, level designer or admin

      So, for example:

      • Is Unity/Godot C# scripting? If I feel generous, maybe. Probably not, though.
      • Is Bevy or Fyrox hot reloading scripting? Nah.
      • Is Scala command line “scripting” scripting? Uh-uh.
      • Is GDScript scripting? Yes, most likely.
      • Is [muh Lisp] scripting? Possibly.
      • Is Blueprint scripting? Yes.

      So, basically the only options that remain are embedded interpreters. More or less. With this in mind…

      This comes as a surprise to me. I think scripting capabilities would be incredibly useful, especially if you can hot reload the scripts during development time. For core engine functionality, this might be less relevant, but for the gameplay itself it could be really nice.

      As far as I understand, studios employ ECS for two reasons:

      • enhanced flexibility
      • enhanced performance for huge numbers of entities and components (this seems to be the case 99% of the time)

      So with the second use case in mind, an embedded interpreter seems kind of off the table. Even an embedded compiler might be off the table most of the time, although I’m not sure how tight performance requirements are. You’d practically have to implement what this thing promises, for a rather specific use case. So unless some major player puts money behind it, I don’t see that happening.

      EDIT: I should add, I’m purely speculating about performance. But breaking a cache line to load an interpreter context several times sounds kind of meh.

      • I do not consider programming in Rust scripting.

        I don’t think most people do to be honest, I was providing it as reference because it’s a strongly typed ECS. One of the other challenges with scripts is they often don’t have static type checkers since they’re intended to be executed right away. mypy and TypeScript have helped tremendously with Python and JavaScript, but they’re still extra steps that need to be executed before you can run your code.

        an embedded interpreter seems kind of off the table.

        An embedded interpreter can still be highly performant, assuming it has a decent JIT compiler. Sending data between the host and the interpreter would be a concern, but it might be possible to allow the script to share memory with the host and directly access the components (while holding locks on those components’ storages). I tried experimenting with this a while back using WASI, but unfortunately the runtimes I played with (wasmer + wasmtime) weren’t yet mature enough, in my experience, to be realistically useful, and I couldn’t figure out how to get the modules to share memory with each other.

        I know there are people playing around with scripting capabilities in bevy though, so I’m sure this will be possible at some point. The other challenge, of course, is having a scheduler that’s flexible enough to handle dynamically added/removed systems, and systems which execute runtime-specified queries.

        Edit: I should add that a large part of the performance of an ECS comes from the ability of an ECS runtime to parallelize the systems. If your interpreter can execute the systems in parallel, then you still get to keep that benefit as long as your scheduler knows which systems are safe to run in parallel.
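
        A minimal sketch of the "which systems are safe to run in parallel" check, assuming each system declares the component types it reads and writes. `SystemAccess` and `plan_batches` are hypothetical names, and real schedulers (bevy’s included) are considerably more involved.

        ```rust
        use std::collections::HashSet;

        /// Hypothetical description of which component types a system touches.
        struct SystemAccess {
            name: &'static str,
            reads: HashSet<&'static str>,
            writes: HashSet<&'static str>,
        }

        impl SystemAccess {
            fn new(name: &'static str, reads: &[&'static str], writes: &[&'static str]) -> Self {
                Self {
                    name,
                    reads: reads.iter().copied().collect(),
                    writes: writes.iter().copied().collect(),
                }
            }

            /// Two systems conflict if either one writes a component type
            /// that the other one reads or writes.
            fn conflicts_with(&self, other: &SystemAccess) -> bool {
                let self_writes_clash = self
                    .writes
                    .iter()
                    .any(|c| other.reads.contains(c) || other.writes.contains(c));
                let other_writes_clash = other.writes.iter().any(|c| self.reads.contains(c));
                self_writes_clash || other_writes_clash
            }
        }

        /// Greedily groups systems into batches; systems within one batch do not
        /// conflict with each other and could therefore run in parallel.
        fn plan_batches(systems: &[SystemAccess]) -> Vec<Vec<&'static str>> {
            let mut batches: Vec<Vec<&SystemAccess>> = Vec::new();
            for system in systems {
                // Find the first batch in which nothing conflicts with this system.
                let slot = batches
                    .iter()
                    .position(|batch| batch.iter().all(|other| !system.conflicts_with(other)));
                match slot {
                    Some(index) => batches[index].push(system),
                    None => batches.push(vec![system]),
                }
            }
            batches
                .into_iter()
                .map(|batch| batch.into_iter().map(|s| s.name).collect())
                .collect()
        }
        ```

        A scripted system could take part in the same scheme as long as its reads and writes are known before it runs, which is the hard part for runtime-specified queries.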

        • If your interpreter can execute the systems in parallel,

          Well, the way I thought of it was that the interpreter gets called in the systems in the first place, so that might become a problem.

          • I would definitely be concerned about the performance in this case, unless you can do some kind of AoT compilation of the scripts being executed (at least in production builds, development builds probably can’t do that if you need hot reloading). Unfortunately the execution time budget is really tight if you want to get frequent updates and decent frame times (although if the system in question runs parallel to the rest of your systems, for example, then it might have more time to operate with).

            That being said, with the interpreter pre-compiling those scripts, performance could still be really good. Sharing memory would become less relevant since your scripts would now operate on more than whatever’s in your component storages and could now, in theory, work with variables local to the system, assuming you’re meaning that the interpreter is called ad-hoc basically and isn’t the entirety of the system.

            This is all in theory of course, in practice you’d need to find an interpreter that supports pre-compiling (or at least pre-optimizing in some form) all the scripts you’ll want to run if you want to maximize performance.
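
            As a toy illustration of "compile once, run many times", here is a sketch in plain Rust. The postfix mini-language, `Op`, `compile`, `run` and `ScriptCache` are all invented for this example and only stand in for a real interpreter’s pre-compilation step.

            ```rust
            use std::collections::HashMap;

            /// Tiny stand-in for compiled script code: a stack program.
            #[derive(Debug, Clone, PartialEq)]
            enum Op {
                Push(f64),
                Input,
                Add,
                Mul,
            }

            /// "Compiles" a whitespace-separated postfix script such as "x 2 * 1 +".
            /// Returns None if a token is not recognized.
            fn compile(source: &str) -> Option<Vec<Op>> {
                source
                    .split_whitespace()
                    .map(|token| match token {
                        "x" => Some(Op::Input),
                        "+" => Some(Op::Add),
                        "*" => Some(Op::Mul),
                        number => number.parse::<f64>().ok().map(Op::Push),
                    })
                    .collect()
            }

            /// Runs a compiled program with `x` as its only input.
            fn run(program: &[Op], x: f64) -> Option<f64> {
                let mut stack = Vec::new();
                for op in program {
                    match op {
                        Op::Push(value) => stack.push(*value),
                        Op::Input => stack.push(x),
                        Op::Add | Op::Mul => {
                            let (b, a) = (stack.pop()?, stack.pop()?);
                            stack.push(if *op == Op::Add { a + b } else { a * b });
                        }
                    }
                }
                stack.pop()
            }

            /// Caches compiled programs so each script is compiled only once.
            #[derive(Default)]
            struct ScriptCache {
                compiled: HashMap<String, Vec<Op>>,
                compile_count: usize,
            }

            impl ScriptCache {
                fn eval(&mut self, source: &str, x: f64) -> Option<f64> {
                    if !self.compiled.contains_key(source) {
                        let program = compile(source)?;
                        self.compile_count += 1;
                        self.compiled.insert(source.to_string(), program);
                    }
                    run(self.compiled.get(source)?, x)
                }
            }
            ```

            Hot reloading would then amount to dropping a cache entry when its script changes, so the next call recompiles it.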

            • assuming you’re meaning that the interpreter is called ad-hoc basically and isn’t the entirety of the system.

              That was my assumption. Basically, that interpreter would have to run small snippets here and there, typically on an irregular basis. Thanks for the reminder, I should probably clarify that as well.

    • I like the ideas presented in the article otherwise. I vaguely remember TypeScript having some sort of reflection support that Angular took advantage of or something. I wonder if that can be used to create a scriptable ECS like proposed in this article?

      I don’t know, I’ve only seen an outdated version of Angular for a couple of hours at my job so far. But I’m sure those sweet layers of metaprogramming and DI will be bliss to debug. Not.

    • This one I’m curious about. How common is the case when you need to operate with the entity directly, and is it worth the cost of duplicating the entity ID onto each component? In bevy’s case, you can query for the entity ID like you would a component using Entity, which I’ve found to be easy enough to do.

      Maybe I worded that poorly. I’m also still not entirely sure about the terminology. This was supposed to be exactly that: an implementation detail.

    • Does this mean that each entity can have any number of components of a particular type in this implementation?

      Yes. I vaguely remembered that some ECS can apparently do that. If not, you’d probably settle for a branch or an optional type instead.

      Would each component also need its own ID to distinguish between two components of one type on the same entity?

      I don’t see why, unless you’re planning to query and manipulate them later again.

      Another option here is that instead of creating a BubbleComponent that’s part of the same entity as the BoundaryComponent, it might make more sense to create a new entity that’s at the same position as the BoundaryComponent’s entity, possibly using some kind of relative transformation to keep the bubble at the same position as the boundary.

      The boundary is supposed to BE the position. So some rendering system would have rendered the speech bubble in the middle of the boundary. Maybe I should have called the boundary area instead…

      • Yes. I vaguely remembered that some ECS can apparently do that. If not, you’d probably settle for a branch or an optional type instead.

        I don’t doubt this is possible, but I’m really curious how querying would work. On the other hand, a component which is essentially just a wrapper for BubbleComponent[] is also possible, and there querying is straightforward, since you’d just get the full list of BubbleComponents per iteration.
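
        A minimal sketch of the wrapper variant in plain Rust, assuming one `Bubbles` component per entity. `Bubbles`, `World` and `query_bubbles` are hypothetical names.

        ```rust
        use std::collections::BTreeMap;

        #[derive(Debug, Clone, PartialEq)]
        struct BubbleComponent {
            text: String,
        }

        /// Hypothetical wrapper: a single component per entity that holds
        /// any number of bubbles.
        #[derive(Debug, Default)]
        struct Bubbles(Vec<BubbleComponent>);

        /// Hypothetical world with just one storage, keyed by entity ID.
        #[derive(Default)]
        struct World {
            bubbles: BTreeMap<u32, Bubbles>,
        }

        impl World {
            /// Appends a bubble, creating the wrapper component on first use.
            fn add_bubble(&mut self, entity: u32, text: &str) {
                self.bubbles
                    .entry(entity)
                    .or_default()
                    .0
                    .push(BubbleComponent { text: text.to_string() });
            }

            /// One iteration per entity; each iteration sees the full list.
            fn query_bubbles(&self) -> impl Iterator<Item = (u32, &[BubbleComponent])> + '_ {
                self.bubbles
                    .iter()
                    .map(|(entity, bubbles)| (*entity, bubbles.0.as_slice()))
            }
        }
        ```

        The query stays "one row per entity", so the question of how to combine multiple instances with other component types does not come up.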

        The boundary is supposed to BE the position. So some rendering system would have rendered the speech bubble in the middle of the boundary. Maybe I should have called the boundary area instead…

        My idea behind using positions relative to the BoundaryComponent is along the lines of having each new “bubble entity” hold a reference to the “boundary entity”. Then you’d have a script which updates the transforms of the bubble entities to match that of the boundary entity:

        function inherit_parent_boundaries(
            child: BoundaryComponent & ParentReferenceComponent,
            boundaries: Query<BoundaryComponent>
        ) {
            -- This updates the child's boundary to match its parent's boundary
            child.boundary = boundaries.get(child.parent).boundary;
        }
        

        This would keep the bubbles as their own entities and avoid the need for a single entity to hold multiple of the same component, which I think would keep the ECS overall a lot simpler. This doesn’t account for parents of parents or anything like that, but if boundaries can be something like Query<BoundaryComponent & ParentReferenceComponent?>, you can recurse up the chain until you’ve updated all the ancestors as well, and all the leaves of the tree will eventually be updated by this system.

        function inherit_parent_boundaries(
            child: BoundaryComponent & ParentReferenceComponent,
            boundaries: Query<BoundaryComponent & ParentReferenceComponent?>
        ) {
            -- Each pass copies every ancestor's boundary one level down the chain, starting at the child.
            -- Run repeatedly, the root's boundary eventually reaches every descendant.
            for (
                var cur = child, var parent = boundaries.get(cur.parent);
                parent != null;
                cur = parent, parent = boundaries.get(cur.parent)
            ) {
                cur.boundary = parent.boundary;
            }
        }
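
        For comparison, a sketch in plain Rust of a variant that resolves the root’s boundary in a single pass instead of converging over several runs. `Node`, `root_boundary` and the `u32` entity IDs are assumptions made for this example.

        ```rust
        use std::collections::HashMap;

        #[derive(Debug, Clone, Copy, PartialEq)]
        struct Boundary {
            x: f32,
            y: f32,
            width: f32,
            height: f32,
        }

        /// Hypothetical node: a boundary plus an optional parent entity ID.
        struct Node {
            boundary: Boundary,
            parent: Option<u32>,
        }

        /// Walks up from `entity` to the root of its chain and returns the root's
        /// boundary. Returns None if an entity on the way is missing or the chain
        /// contains a cycle.
        fn root_boundary(nodes: &HashMap<u32, Node>, entity: u32) -> Option<Boundary> {
            let mut current = nodes.get(&entity)?;
            // A valid chain has at most nodes.len() members, so this bound
            // only cuts off accidental cycles.
            for _ in 0..nodes.len() {
                match current.parent {
                    Some(parent) => current = nodes.get(&parent)?,
                    None => return Some(current.boundary),
                }
            }
            None
        }

        /// Sets every entity's boundary to the boundary of its root ancestor.
        fn inherit_parent_boundaries(nodes: &mut HashMap<u32, Node>) {
            // Resolve everything first, then write, so reads never see partial updates.
            let resolved: Vec<(u32, Boundary)> = {
                let view: &HashMap<u32, Node> = nodes;
                view.keys()
                    .filter_map(|&id| root_boundary(view, id).map(|b| (id, b)))
                    .collect()
            };
            for (id, boundary) in resolved {
                if let Some(node) = nodes.get_mut(&id) {
                    node.boundary = boundary;
                }
            }
        }
        ```

        The trade-off is that every entity walks its whole ancestor chain on each run, whereas the per-level copy does constant work per entity but needs as many runs as the tree is deep.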
        
    • The next example which seems to create two bubbles on the same entity is just as confusing to me. If I were to query for those bubbles, would I iterate over each of those bubbles in separate iterations, but all other components that are part of this query are the same?

      No that would be crazy.

      No, but seriously, the find operator is supposed to take only one type and not merge the types. ECS seems close enough to relational databases, but not that close.

      • In this case, would the components be combined into a list? Basically you’d have a BubbleComponent[] attached to the entity instead of just a BubbleComponent? Maybe I’m misunderstanding what the system is, is click_handler in the post a system, and if so, do systems only declare a single component of an entity as their input? From my experience, you often want to work with multiple components of the same entity in a system, for example:

        -- Move characters according to their velocity
        function move_characters(character: BoundaryComponent) {
            with character as movement: MovementComponent {
                character.x += movement.velocity.x * delta_secs;
                character.y += movement.velocity.y * delta_secs;
            }
        }
        

        Would there be a simpler way to query these components in the system, and what if I wanted to query for both BoundaryComponent and BubbleComponent, what would that look like?

        • Maybe I’m misunderstanding what the system is, is click_handler in the post a system, and if so, do systems only declare a single component of an entity as their input?

          The way I figured it would make sense was that, in the engine/game itself, the BoundaryComponent would have an additional field for registered scripts, or that there would be an additional component just for registered scripts, to keep components lean. Not sure if that would actually work out in reality.

          Then there would be a system for clicking on boundaries that would call such a script, if available. It’s probably a poor example, but since that system doesn’t touch much else, only the boundary component gets passed into the script. That’s not supposed to be a rule, though. I probably should clarify that on the post later on…
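
          A rough sketch of that arrangement in plain Rust, assuming a separate script component and a Rust closure as a stand-in for the call into the interpreter. `ScriptComponent`, `ClickableEntity` and `click_system` are hypothetical names.

          ```rust
          #[derive(Debug, Clone, Copy, PartialEq)]
          struct BoundaryComponent {
              x: f32,
              y: f32,
              width: f32,
              height: f32,
          }

          impl BoundaryComponent {
              /// True if the point lies inside the rectangle (edges included).
              fn contains(&self, px: f32, py: f32) -> bool {
                  px >= self.x
                      && px <= self.x + self.width
                      && py >= self.y
                      && py <= self.y + self.height
              }
          }

          /// Stand-in for a script handle; a real engine would call into an
          /// embedded interpreter here instead of a Rust closure.
          type ClickScript = Box<dyn Fn(&BoundaryComponent) -> String>;

          /// Hypothetical separate component for registered scripts,
          /// which keeps BoundaryComponent itself lean.
          struct ScriptComponent {
              on_click: Option<ClickScript>,
          }

          struct ClickableEntity {
              boundary: BoundaryComponent,
              script: ScriptComponent,
          }

          /// Click system: for every boundary under the cursor, runs the
          /// registered script (if any) and collects what the scripts return.
          fn click_system(entities: &[ClickableEntity], px: f32, py: f32) -> Vec<String> {
              entities
                  .iter()
                  .filter(|e| e.boundary.contains(px, py))
                  .filter_map(|e| e.script.on_click.as_ref().map(|script| script(&e.boundary)))
                  .collect()
          }
          ```

          Only the boundary is handed to the script here because this particular system touches nothing else; a system that works on more components would pass more.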