Efficient way of matching entities to systems in an entity component system

问题

I'm working on a data-oriented entity component system where component types and system signatures are known at compile-time.

An entity is an aggregate of components. Components can be added/removed from entities at run-time.

A component is a small logic-less class.

A signature is a compile-time list of component types. An entity is said to match a signature if it contains all component types required by the signature.

A short code sample will show you how the user syntax looks and what the intended usage is:

// User-defined component types.
struct Comp0 : ecs::Component { /*...*/ };
struct Comp1 : ecs::Component { /*...*/ };
struct Comp2 : ecs::Component { /*...*/ };
struct Comp3 : ecs::Component { /*...*/ };

// User-defined system signatures.
using Sig0 = ecs::Requires<Comp0>;
using Sig1 = ecs::Requires<Comp1, Comp3>;
using Sig2 = ecs::Requires<Comp1, Comp2, Comp3>;

// Store all components in a compile-time type list.
using MyComps = ecs::ComponentList
<
    Comp0, Comp1, Comp2, Comp3
>;

// Store all signatures in a compile-time type list.
using MySigs = ecs::SignatureList
<
    Sig0, Sig1, Sig2
>;

// Final type of the entity manager.
using MyManager = ecs::Manager<MyComps, MySigs>;

void example()
{
    MyManager m;

    // Create an entity and add components to it at runtime.
    auto e0 = m.createEntity();
    m.add<Comp0>(e0);
    m.add<Comp1>(e0);
    m.add<Comp3>(e0);

    // Matches.
    assert(m.matches<Sig0>(e0));

    // Matches.
    assert(m.matches<Sig1>(e0));

    // Doesn't match. (`Comp2` missing)
    assert(!m.matches<Sig2>(e0));

    // Do something with all entities matching `Sig0`.
    m.forEntitiesMatching<Sig0>([](/*...*/){/*...*/}); 
}

I'm currently checking if entities match signatures using std::bitset operations. The performance, however, quickly degrades as soon as the number of signatures and the number of entities increase.

Pseudocode:

// m.forEntitiesMatching<Sig0>
// ...gets transformed into...

for(auto& e : entities)
    if((e.bitset & getBitset<Sig0>()) == getBitset<Sig0>())
        callUserFunction(e);

This works, but if the user calls forEntitiesMatching with the same signature multiple times, all entities will have to be matched again.

There may also be a better way of pre-caching entities in cache-friendly containers.

I've tried using some sort of cache that creates a compile-time map (implemented as std::tuple<std::vector<EntityIndex>, std::vector<EntityIndex>, ...>), where the keys are the signature types (every signature type has a unique incremental index thanks to SignatureList), and the values are vectors of entity indices.

I filled the cache tuple with something like:

// Compile-time list iterations a-la `boost::hana`.
forEveryType<SignatureList>([](auto t)
{
    using Type = decltype(t)::Type;
    for(auto entityIndex : entities)
        if(matchesSignature<Type>(e))
            std::get<idx<Type>()>(cache).emplace_back(e);
});

And cleared it after every manager update cycle.

Unfortunately it performed more slowly than then "raw" loop shown above in all of my tests. It also would have a bigger issue: what if a call to forEntitiesMatching actually removes or adds a component to an entity? The cache would have to be invalidated and recalculated for subsequent forEntitiesMatching calls.

Is there any faster way of matching entities to signatures?

A lot of things are known at compile-time (list of component types, list of signature types, ...) - is there any auxiliary data structure that could be generated at compile-time which would help with "bitset-like" matching?

回答1:

For matching, are you checking for each component type one bit at a time? You should be able to plow through the entities by checking if all components for a signature are available in one instruction against a bit mask.

For example, we might use:

const uint64_t signature = 0xD; // 0b1101

... to check for the presence of components 0, 1, and 3.

for(const auto& ent: entities)
{
     if (ent.components & signature)
          // the entity has components 0, 1, and 3.
}

That should be fast as hell if your entities are stored contiguously in an array. It doesn't matter, performance-wise, whether you are checking for 3 component types or 50 component types at a time. I don't use this type of rep for my ECS, but definitely this shouldn't take long at all even if you have a million entities. That should complete in the blink of an eye.

It's pretty much the fastest practical way achievable to see what entities provide a given set of components while storing the minimal amount of state -- only reason I don't use this rep is because my ECS revolves around a plugin architecture where people register new component types and systems at runtime through plugins and scripts so I can't effectively anticipate an upper bound on how many component types there will be. If I had a compile-time system like yours that is designed to anticipate all this stuff in advance, then definitely I think this is the way to go. Just don't check one bit at a time.

You should be able to easily process a million components several hundred times a second using the above solution. There are people achieving similar rates applying CPU image filters processing that many pixels multiplied by that many times per second, and those filters have far more work to do than a single bitwise and and a single branch per iteration.

I also wouldn't even bother caching this super cheap sequential loop unless you have some systems that only want to process, say, a dozen out of a million entities. But at that point you can potentially cache those rare case scenarios where one system barely processes any entities local to the system instead of trying to cache things centrally. Just make sure systems that are interested can find out when entities are added to the system or removed from the system so that they can invalidate their local cache.

Also you don't necessarily need fancy metaprogramming for these signatures. At the end of the day you aren't really saving anything using template metaprogramming since it can't avoid looping through entities checking for something since the list of entities is only known at runtime. There's not really theoretically anything worth optimizing away at compile-time here. You can just do like:

static const uint64_t motion_id = 1 << 0;
static const uint64_t sprite_id = 1 << 1;
static const uint64_t sound_id = 1 << 2;
static const uint64_t particle_id = 1 << 3;
...

// Signature to check for entities with motion, sprite, and 
// particle components.
static const uint64_t sig = motion_id | sprite_id | particle_id;

If you use entity-associated bits to indicate what components an entity has, I recommend setting an upper bound on the total number of component types your system can handle (ex: 64 is probably plenty, 128 is a boatload) so that you can just check for components against bitmasks like these in one go.

[...] what if a call to forEntitiesMatching actually removes or adds a component to an entity?

If, say, you have a system which is adding/removing components every frame, then I wouldn't even bother to cache in the first place. The above version should be able to loop through entities super fast.

The worst-case scenario of looping through all entities sequentially is if you have a system that, say, is only going to process 3% of those entities. If your engine design has systems like that, then it's a bit awkward, but you might just notify them when components are added/removed that they are specifically interested in at which point they can invalidate their entity cache and then recache the next time the system kicks in. Hopefully you don't have a system adding/removing components every single frame of the types that are that 3% minority of components. If you do have that worst-case scenario though, probably best not to bother with caching at all. There's no use for a cache which is just going to be discarded every frame and trying to update it in a fancy way is probably not going to net you much.

Other systems which, say, process 50% or more of the entities probably shouldn't even bother caching, since the level of indirection is probably not worth it over just plowing through all the entities sequentially and doing a dirt cheap bitwise and over each one.

回答2:

Have you considered the following solution? Each signature will have a container of entities that match that signature.

When a component is added or removed you need to update the relevant signature container.

Now the function can simply go to the signature entity container and execute the function for each entity.

回答3:

Depending on the ratio of add/remove of components against signature matching, you can try to build a kind of prefix tree storing references to entities.

The tree itself is static, only leaves which contains entities are runtime built containers.

This way when adding (or removing) a components, you just need to move the reference of the entity to the correct leaf.

When searching for entities matching a signature, you just have to take all the union of the leaves subsuming the signature and iterate over them. And since the tree is (almost) static, you don't even need to search for these leaves.

Another nice point: you bitset can be used to represent the path in the tree so moving an entity is pretty easy.

If the number of components induces an unrealistic number leaves, and not all combination of components are used, you can replace the tree with a hashtable where the bitset is the key and the value is the set of entity references.

That's more algorithmic ideas than concrete stuff, but it seems more reasonable than iterating over the set of entities.

回答4:

Another option which is a bit influenced from @Marwan Burelle idea.

Each component will hold a sorted container of entities which have that component.

When looking for entities that match a signature you need to iterate over the component container of entities.

Adding or removing is O(nlogn) since it needs to be sorted. but you only need to add/remove it to/from a single container which will also contain fewer items.

Iterating over the items is a bit heavier since it is a factor of the amount of components and the number of entities in each component. You still have an element of multiplying but the number of elements is again smaller.

I wrote down a simplified version as a POC.

Edit: My previous version had some bugs, now hopefully they are fixed.

// Example program
#include <iostream>
#include <string>
#include <set>
#include <map>
#include <vector>
#include <functional>
#include <memory>
#include <chrono>

struct ComponentBase
{
};

struct Entity
{
    Entity(std::string&& name, uint id)
        : _id(id),
        _name(name)
    {
    }

    uint _id;
    std::string _name;
    std::map<uint, std::shared_ptr<ComponentBase>> _components;
};

template <uint ID>
struct Component : public ComponentBase
{
    static const uint _id;
    static std::map<uint, Entity*> _entities;
};

template <uint ID>
std::map<uint, Entity*> Component<ID>::_entities;

template <uint ID>
const uint Component<ID>::_id = ID;

using Comp0 = Component<0>;
using Comp1 = Component<1>;
using Comp2 = Component<2>;
using Comp3 = Component<3>;

template <typename ...TComponents>
struct Enumerator 
{
};

template <typename TComponent>
struct Enumerator<TComponent>
{
    std::map<uint, Entity*>::iterator it;
    Enumerator()
    {
        it = TComponent::_entities.begin();
    }

    bool AllMoveTo(Entity& entity)
    {
        while (HasNext() && Current()->_id < entity._id)
        {
            MoveNext();
        }

        if (!Current())
            return false;
        return Current()->_id == entity._id;
    }

    bool HasNext() const
    {
        auto it_next = it;
        ++it_next;
        bool has_next = it_next != TComponent::_entities.end();
        return has_next;
    }

    void MoveNext()
    {
        ++it;
    }

    Entity* Current() const
    {
        return it != TComponent::_entities.end() ? it->second : nullptr;
    }
};

template <typename TComponent, typename ...TComponents>
struct Enumerator<TComponent, TComponents...>
{
    std::map<uint, Entity*>::iterator it;
    Enumerator<TComponents...> rest;

    Enumerator()
    {
        it = TComponent::_entities.begin();
    }

    bool AllMoveTo(Entity& entity)
    {
        if (!rest.AllMoveTo(entity))
            return false;

        while (HasNext() && Current()->_id < entity._id)
        {
            MoveNext();
        }
        if (!Current())
            return false;
        return Current()->_id == entity._id;
    }

    bool HasNext() const
    {
        auto it_next = it;
        ++it_next;
        bool has_next = it_next != TComponent::_entities.end();
        return has_next;
    }

    void MoveNext()
    {
        ++it;
    }

    Entity* Current() const
    {
        return it != TComponent::_entities.end() ? it->second : nullptr;
    }
};

template <typename ...TComponents>
struct Requires
{

};

template <typename TComponent>
struct Requires<TComponent>
{
    static void run_on_matching_entries(const std::function<void(Entity&)>& fun)
    {
        for (Enumerator<TComponent> enumerator; enumerator.Current(); enumerator.MoveNext())
        {
            if (!enumerator.AllMoveTo(*enumerator.Current()))
                continue;
            fun(*enumerator.Current());
        }
    }
};

template <typename TComponent, typename ...TComponents>
struct Requires<TComponent, TComponents...>
{
    static void run_on_matching_entries(const std::function<void(Entity&)>& fun)
    { 
        for (Enumerator<TComponent, TComponents...> enumerator; enumerator.Current(); enumerator.MoveNext())
        {
            if (!enumerator.AllMoveTo(*enumerator.Current()))
                continue;
            fun(*enumerator.Current());
        }
    }
};

using Sig0 = Requires<Comp0>;
using Sig1 = Requires<Comp1, Comp3>;
using Sig2 = Requires<Comp1, Comp2, Comp3>;

struct Manager
{
    uint _next_entity_id;

    Manager()
    {
        _next_entity_id = 0;
    }

    Entity createEntity() 
    { 
        uint id = _next_entity_id++;
        return Entity("entity " + std::to_string(id), id); 
    };

    template <typename Component>
    void add(Entity& e)
    {
        e._components[Component::_id] = std::make_shared<Component>();
        Component::_entities.emplace(e._id, &e);
    }

    template <typename Component>
    void remove(Entity& e)
    {
        e._components.erase(Component::_id);
        Component::_entities.erase(e._id);
    }

    template <typename Signature>
    void for_entities_with_signature(const std::function<void(Entity&)>& fun)
    {
        Signature::run_on_matching_entries(fun);
    }

};

int main()
{
    Manager m;
    uint item_count = 100000;
    std::vector<Entity> entities;
    for (size_t item = 0; item < item_count; ++item)
    {
        entities.push_back(m.createEntity());
    }

    for (size_t item = 0; item < item_count; ++item)
    {
        //if (rand() % 2 == 0)
            m.add<Comp0>(entities[item]);
        //if (rand() % 2 == 0)
            m.add<Comp1>(entities[item]);
        //if (rand() % 2 == 0)
            m.add<Comp2>(entities[item]);
        //if (rand() % 2 == 0)
            m.add<Comp3>(entities[item]);
    }

    size_t sig0_count = 0;
    size_t sig1_count = 0;
    size_t sig2_count = 0;

    auto start = std::chrono::system_clock::now();
    m.for_entities_with_signature<Sig0>([&](Entity& e) { ++sig0_count; });    
    m.for_entities_with_signature<Sig1>([&](Entity& e) { ++sig1_count; });
    m.for_entities_with_signature<Sig2>([&](Entity& e) { ++sig2_count; });

    auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start);
    std::cout << "first run took " << duration.count() << " milliseconds: " << sig0_count << " " << sig1_count << " " << sig2_count << std::endl;

    for (size_t item = 0; item < item_count; ++item)
    {
        if (rand() % 2 == 0)
            m.remove<Comp0>(entities[item]);
        if (rand() % 2 == 0)
            m.remove<Comp1>(entities[item]);
        if (rand() % 2 == 0)
            m.remove<Comp2>(entities[item]);
        if (rand() % 2 == 0)
            m.remove<Comp3>(entities[item]);
    }

    sig0_count = sig1_count = sig2_count = 0;

    start = std::chrono::system_clock::now();
    m.for_entities_with_signature<Sig0>([&](Entity& e) { ++sig0_count; });    
    m.for_entities_with_signature<Sig1>([&](Entity& e) { ++sig1_count; });
    m.for_entities_with_signature<Sig2>([&](Entity& e) { ++sig2_count; });

    duration = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start);
    std::cout << "second run took " << duration.count() << " milliseconds: " << sig0_count << " " << sig1_count << " " << sig2_count << std::endl;
}

回答5:

Having a sparse integer set per signature type is the theoretically best solution (in terms of time complexity).

The sparse integer set data structure allows efficient contiguous O(N) iteration over the stored integers, O(1) insertion/removal of integers, and O(1) querying for a specific integer.

The per-signature sparse integer set would store all entity IDs associated with that specific signature.

Example: Diana, an open-source C and C++ ECS library, makes use of sparse integer set to keep track of entities in systems. Every system has its own sparse integer set instance.

回答6:

Regarding the pseudo code:

for(auto& e : entities)
    for(const auto& s : signatures)
         if((e.bitset & s.bitset) == s.bitset)
             callUserFunction(e);

I am unsure why you need the inner loop.

If you have the requested signature in the function then you can get the bitset of that signature and there is no need to iterate over all of the signatures.

template <typename T>
void forEntitiesMatching(const std::function<void(Entity& e)>& fun)
{
    for(auto& e : entities)
        if((e.bitset & T::get_bitset()) == T::get_bitset())
             fun(e);
}

来源：https://stackoverflow.com/questions/32035462/efficient-way-of-matching-entities-to-systems-in-an-entity-component-system

标签

c++

performance

optimization

c++14

entity-component-system