How to write a thread-safe and efficient, lock-free memory allocator in C?

伪装坚强ぢ 2021-01-31 12:30

How to write a thread-safe and efficient, lock-free memory allocator in C? By efficient I mean:

  1. Fast allocation & deallocation

  2. Optimal memory usage

3 Answers
  • 2021-01-31 13:13

    You can use a lock-free list and a couple of buckets of differing sizes.

    So:

    #include <windows.h>
    #include <stddef.h>
    #include <stdlib.h>
    
    typedef struct
    {
        union
        {
            SLIST_ENTRY entry;  /* used while the block sits on a free list */
            void*       list;   /* while allocated: the bucket it came from */
        };
        unsigned char mem[];    /* payload handed out to the caller         */
    } mem_block;
    
    typedef struct
    {
        SLIST_HEADER root;
    } mem_block_list;
    
    #define BUCKET_COUNT        4
    #define BLOCKS_TO_ALLOCATE  16
    
    static mem_block_list Buckets[BUCKET_COUNT];
    
    void init_buckets( void )
    {
        for( int i = 0; i < BUCKET_COUNT; ++i )
        {
            InitializeSListHead( &Buckets[i].root );
            for( int j = 0; j < BLOCKS_TO_ALLOCATE; ++j )
            {
                /* bucket i holds blocks of (8 << i) bytes: 8, 16, 32, 64;
                   malloc's default alignment satisfies the SLIST alignment requirement */
                mem_block* p = (mem_block*) malloc( sizeof( mem_block ) + (0x1 << i) * 0x8 );
                InterlockedPushEntrySList( &Buckets[i].root, &p->entry );
            }
        }
    }
    
    void* balloc( size_t size )
    {
        for( int i = 0; i < BUCKET_COUNT; ++i )
        {
            if( size <= (size_t)(0x1 << i) * 0x8 )
            {
                mem_block* p = (mem_block*) InterlockedPopEntrySList( &Buckets[i].root );
                if( p == NULL )
                    continue;           /* bucket empty, try a larger one */
                p->list = &Buckets[i];  /* remember which bucket the block belongs to */
                return p->mem;
            }
        }
    
        return NULL;    /* block too large, or all fitting buckets empty */
    }
    
    void bfree( void* p )
    {
        mem_block* block = (mem_block*) ((unsigned char*)p - offsetof( mem_block, mem ));
        InterlockedPushEntrySList( &((mem_block_list*)block->list)->root, &block->entry );
    }
    

    SLIST_ENTRY, InterlockedPushEntrySList, InterlockedPopEntrySList and InitializeSListHead are the Win32 primitives for lock-free singly-linked-list operations. Use the corresponding primitives on other OSes, for example a hand-rolled CAS-based stack as sketched below.
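
    For platforms without the SList API, here is a minimal sketch of a Treiber-stack style replacement using C11 atomics. The names (lf_node, lf_stack, lf_push, lf_pop) are illustrative, not a standard API, and a bare CAS on the head pointer is exposed to the ABA problem, so production code would need tagged pointers or hazard pointers (or must never return blocks to the OS, as in the bucket scheme above).

    #include <stdatomic.h>
    #include <stddef.h>
    
    typedef struct lf_node
    {
        struct lf_node* next;
    } lf_node;
    
    typedef struct
    {
        _Atomic(lf_node*) head;
    } lf_stack;
    
    /* push: link the new node in front of the current head with a CAS loop */
    static void lf_push( lf_stack* s, lf_node* n )
    {
        lf_node* old = atomic_load_explicit( &s->head, memory_order_relaxed );
        do
        {
            n->next = old;
        } while( !atomic_compare_exchange_weak_explicit( &s->head, &old, n,
                    memory_order_release, memory_order_relaxed ) );
    }
    
    /* pop: swing the head to head->next with a CAS loop; returns NULL if empty.
       Note the ABA hazard: old->next may be stale if the node was popped and
       re-pushed concurrently. */
    static lf_node* lf_pop( lf_stack* s )
    {
        lf_node* old = atomic_load_explicit( &s->head, memory_order_acquire );
        while( old != NULL &&
               !atomic_compare_exchange_weak_explicit( &s->head, &old, old->next,
                    memory_order_acquire, memory_order_acquire ) )
        {
            /* 'old' was reloaded by the failed CAS; retry */
        }
        return old;
    }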

    Drawbacks:

    • Per-block overhead of sizeof( SLIST_ENTRY )
    • The buckets can only be filled cheaply once, at start-up; after that a bucket can run dry and you have to fall back to the OS or borrow from other buckets (borrowing from other buckets leads to fragmentation)
    • This sample is deliberately simple and would have to be extended to handle more cases
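
    For reference, a minimal sketch of how the allocator above might be driven, assuming init_buckets() is called once, from a single thread, before any worker thread starts allocating:

    int main( void )
    {
        init_buckets();             /* single-threaded setup phase */
    
        void* p = balloc( 24 );     /* fits the 32-byte bucket */
        if( p != NULL )
        {
            /* ... use the 24 bytes ... */
            bfree( p );             /* pushes the block back onto its bucket */
        }
        return 0;
    }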
  • 2021-01-31 13:17

    http://www.research.ibm.com/people/m/michael/pldi-2004.pdf

    This paper presents a completely lock-free memory allocator. It uses only widely-available operating system support and hardware atomic instructions. It offers guaranteed availability even under arbitrary thread termination and crash-failure, and it is immune to deadlock regardless of scheduling policies, and hence it can be used even in interrupt handlers and real-time applications without requiring special scheduler support. Also, by leveraging some high-level structures from Hoard, our allocator is highly scalable, limits space blowup to a constant factor, and is capable of avoiding false sharing...

  • 2021-01-31 13:31

    Depends on what you mean by efficiency. If my concern was to make things fast, then I would probably give each thread its own separate memory pool to work with, and a custom 'malloc' that took memory from that pool. Of course, if my concern was speed, I would probably avoid allocation in the first place.

    There is no one answer; you'll be balancing a range of concerns. It will be pretty much impossible to get a lock-free allocator, but you can either do the locking early and infrequently (by allocating large pools for each thread) or you can make the locks so small and tight that they must be correct.
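
    A minimal sketch of the per-thread pool idea, assuming C11 thread-local storage and a simple bump allocator: each thread carves allocations out of its own arena, so the fast path needs no synchronization at all, and the arena itself is obtained from malloc once per thread. The names (thread_pool, pool_alloc, POOL_SIZE) are illustrative; freeing individual blocks is deliberately left out, since the whole arena would be released when the thread finishes.

    #include <stdlib.h>
    #include <stddef.h>
    
    #define POOL_SIZE (1024 * 1024)         /* 1 MiB arena per thread */
    
    typedef struct
    {
        unsigned char* base;                /* start of this thread's arena */
        size_t         used;                /* bytes handed out so far      */
    } thread_pool;
    
    static _Thread_local thread_pool pool = { NULL, 0 };
    
    static void* pool_alloc( size_t size )
    {
        /* round size up to a 16-byte multiple to keep returned pointers aligned */
        size = (size + 15) & ~(size_t)15;
    
        if( pool.base == NULL )
            pool.base = malloc( POOL_SIZE );    /* lazy, once per thread */
    
        if( pool.base == NULL || pool.used + size > POOL_SIZE )
            return NULL;                        /* arena exhausted: fall back to malloc, grow, etc. */
    
        void* p = pool.base + pool.used;
        pool.used += size;
        return p;
    }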
