How do you implement a circular buffer in C?

滥情空心 2020-11-28 00:53

I have a need for a fixed-size (selectable at run-time when creating it, not compile-time) circular buffer which can hold objects of any type and it needs to be very

8 Answers
  • 2020-11-28 01:35

    First, the headline: you don't need modulo arithmetic to wrap the buffer if you hold the head and tail "pointers" in unsigned bit-fields sized to match the buffer exactly. For example, 4096 stored in a 12-bit unsigned bit-field becomes 0 all by itself, with no extra work. In my tests, eliminating the modulo arithmetic, even for powers of 2, almost exactly doubled the speed.
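
    A minimal sketch of that wrap-around, assuming a 12-bit unsigned bit-field as the index. The names INDEX_BITS, Index12 and advance() are illustrative only and do not appear in the code further down:

```c
#include <assert.h>
#include <stddef.h>

#define INDEX_BITS 12   /* 2^12 = 4096 slots; illustrative constant */

/* Hypothetical index type: an unsigned bit-field exactly as wide as the
   buffer needs, so storing 4096 into it leaves 0. */
typedef struct {
    unsigned int pos : INDEX_BITS;
} Index12;

/* Advance the index by n slots. The store back into the bit-field
   truncates the result modulo 4096, so no % or & is written out. */
static unsigned int advance(Index12 *ix, unsigned int n)
{
    assert(ix != NULL);
    ix->pos += n;
    return ix->pos;
}
```

    The wrap only lines up with the end of the buffer when the buffer holds exactly 2^INDEX_BITS elements, which is why the element count has to be a power of 2.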

    10 million iterations of filling and draining a 4096-element buffer, for any type of data element, take 52 seconds on my 3rd-gen i7 Dell XPS 8500 using Visual Studio 2010's C++ compiler with default inlining. Each iteration performs 4096 puts and 4096 gets, so servicing a single datum takes 1/8192 of one iteration's time.

    I'd recommend rewriting the test loops in main() so they no longer control the flow. The flow is, and should be, controlled by the return values indicating that the buffer is full or empty, and by the attendant break; statements. That is, the filler and drainer should be able to bang against each other without corruption or instability. At some point I hope to multi-thread this code, whereupon that behavior will be crucial.
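
    One way such return-value-driven loops could look, as a rough single-threaded sketch. IntRing, ring_put(), ring_get(), fill_until_full() and drain_until_empty() are hypothetical names for a tiny stand-in queue, not the QUEUE_DESC code below:

```c
#include <assert.h>
#include <stddef.h>

#define RING_KNT 8u   /* illustrative power-of-two capacity */

/* Hypothetical minimal int ring, only here to drive the loops below. */
typedef struct {
    int      slot[RING_KNT];
    unsigned head, tail;   /* free-running counters, masked on use */
} IntRing;

static int ring_put(IntRing *r, int v)
{
    if (r->tail - r->head == RING_KNT)
        return 0;                       /* full: nothing stored */
    r->slot[r->tail++ & (RING_KNT - 1)] = v;
    return 1;
}

static int ring_get(IntRing *r, int *v)
{
    if (r->tail == r->head)
        return 0;                       /* empty: nothing fetched */
    *v = r->slot[r->head++ & (RING_KNT - 1)];
    return 1;
}

/* No loop counter decides when to stop; only the queue's status does. */
static unsigned fill_until_full(IntRing *r)
{
    unsigned n = 0;
    assert(r != NULL);
    while (ring_put(r, (int)(n * n)))
        n++;
    return n;                           /* number of items stored */
}

static unsigned drain_until_empty(IntRing *r, long *sum)
{
    unsigned n = 0;
    int v;
    assert(r != NULL && sum != NULL);
    while (ring_get(r, &v)) {
        *sum += v;
        n++;
    }
    return n;                           /* number of items fetched */
}
```

    Because each loop stops only on the queue's own status, the filler and drainer can be called in any order and any number of times without walking off either end.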

    The QUEUE_DESC (queue descriptor) and the initialization function force every buffer in this code to have a power-of-2 element count; the above scheme will NOT work otherwise. While on the subject, note that QUEUE_DESC is not hard-coded: it uses a manifest constant (#define BITS_ELE_KNT) for its construction. (I'm assuming a power of 2 is sufficient flexibility here.)

    To make the buffer size run-time selectable, I tried different approaches (not shown here), and settled on using USHRTs for Head, Tail and EleKnt, capable of managing a FIFO buffer[USHRT]. To avoid modulo arithmetic I created a mask to bitwise-AND (&) with Head and Tail, but that mask turns out to be (EleKnt - 1), so just use that. Using USHRTs instead of bit-fields increased performance by ~15% on a quiet machine. Intel CPU cores have always been faster than their buses, so on a busy, shared machine, packing your data structures gets you loaded and executing ahead of other, competing threads. Trade-offs.
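
    A sketch of that run-time-sized variant, under the assumption that the indices are plain unsigned shorts that run freely and are masked with (element count - 1) on every access. MaskQueue and the mq_* functions are made-up names, since this version is not shown in the answer:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical run-time-sized FIFO; all names here are illustrative. */
typedef struct {
    unsigned char *buf;        /* storage: (mask + 1) * ele_bytes bytes   */
    unsigned short head;       /* elements read so far (free-running)     */
    unsigned short tail;       /* elements written so far (free-running)  */
    unsigned short mask;       /* element count - 1, count a power of two */
    unsigned short ele_bytes;  /* size of one element in bytes            */
} MaskQueue;

/* bits selects the capacity at run time: 2^bits elements, bits <= 15. */
static int mq_init(MaskQueue *q, unsigned bits, unsigned short ele_bytes)
{
    assert(bits <= 15);
    q->head = q->tail = 0;
    q->mask = (unsigned short)((1u << bits) - 1u);
    q->ele_bytes = ele_bytes;
    q->buf = calloc((size_t)q->mask + 1, ele_bytes);
    return q->buf != NULL;
}

static int mq_put(MaskQueue *q, const void *src)
{
    /* tail - head is the fill level, even after the counters wrap */
    if ((unsigned short)(q->tail - q->head) > q->mask)
        return 0;                                   /* full */
    memcpy(q->buf + (size_t)(q->tail & q->mask) * q->ele_bytes,
           src, q->ele_bytes);
    q->tail++;
    return 1;
}

static int mq_get(MaskQueue *q, void *dst)
{
    if (q->tail == q->head)
        return 0;                                   /* empty */
    memcpy(dst, q->buf + (size_t)(q->head & q->mask) * q->ele_bytes,
           q->ele_bytes);
    q->head++;
    return 1;
}
```

    Letting the counters run freely and masking only when indexing keeps "full" and "empty" distinguishable without a separate flag. That is a design choice of this sketch, not something the answer prescribes.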

    Note that the actual storage for the buffer is allocated on the heap with calloc(), and the pointer to it is the first member of the struct, so the struct and the pointer have EXACTLY the same address; i.e., no offset has to be added to the struct address, which would tie up registers.
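
    The address claim can be checked with a few lines of C. DescSketch is a cut-down, hypothetical stand-in for QUEUE_DESC that keeps only the point being illustrated:

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-in for the queue descriptor (illustrative only). */
typedef struct {
    unsigned char *pBuffer;   /* first member: sits at offset 0 */
    unsigned short Head;
    unsigned short Tail;
} DescSketch;

/* Returns 1 when the struct and its first member share one address. */
static int first_member_shares_address(const DescSketch *d)
{
    assert(d != NULL);
    return (const void *)d == (const void *)&d->pBuffer
        && offsetof(DescSketch, pBuffer) == 0;
}
```

    C guarantees that a pointer to a struct, suitably converted, points to its first member, so this holds on any conforming compiler. Note that it is the pointer member that shares the struct's address; the heap storage it points to lives elsewhere.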

    In that same vein, all of the variables attendant with servicing the buffer are physically adjacent to the buffer, bound into the same struct, so the compiler can make beautiful assembly language. You'll have to kill the inline optimization to see any assembly, because otherwise it gets crushed into oblivion.

    To support the polymorphism of any data type, I've used memcpy() instead of assignments. If you only need the flexibility to support one arbitrary data type per compile, then this code works perfectly.

    For polymorphism, you just need to know the type and its storage requirement. The DATA_DESC array of descriptors provides a way to keep track of each datum that gets put in QUEUE_DESC.pBuffer so it can be retrieved properly. I'd just allocate enough pBuffer memory to hold all of the elements of the largest data type, but keep track of how much of that storage a given datum is actually using in DATA_DESC.dBytes. The alternative is to reinvent a heap manager.

    This means QUEUE_DESC's UCHAR *pBuffer would have a parallel companion array to keep track of data type and size, while a datum's storage location in pBuffer would remain just as it is now. The new member would be something like DATA_DESC *pDataDesc, or, perhaps, DATA_DESC DataDesc[1 << BITS_ELE_KNT] (that is, 2 to the power BITS_ELE_KNT; in C, ^ is XOR, not exponentiation) if you can find a way to beat your compiler into submission with such a forward reference. calloc() is always more flexible in these situations.

    You'd still memcpy() in Q_Put() and Q_Get(), but the number of bytes actually copied would be determined by DATA_DESC.dBytes, not QUEUE_DESC.EleBytes. The elements are potentially all of different types and sizes for any given put or get.
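
    A rough sketch of how that descriptor array could work, assuming fixed-size slots as large as the biggest type and a separate fill count. PolyQueue, SlotDesc, SLOT_KNT, SLOT_BYTES and the pq_* functions are hypothetical names; only the dBytes/dType fields follow the commented-out DATA_DESC in the listing:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SLOT_KNT   8u     /* illustrative: power-of-two element count */
#define SLOT_BYTES 16u    /* illustrative: size of the largest type   */

/* Per-slot descriptor, following the commented-out DATA_DESC idea. */
typedef struct {
    unsigned int dBytes : 8;  /* bytes of the slot actually used */
    unsigned int dType  : 3;  /* caller-defined type tag (0-7)   */
} SlotDesc;

typedef struct {
    unsigned char *pBuffer;     /* SLOT_KNT slots of SLOT_BYTES each */
    SlotDesc      *pDataDesc;   /* parallel array, one per slot      */
    unsigned       Head, Tail, Count;
} PolyQueue;

static void pq_free(PolyQueue *q)
{
    free(q->pBuffer);
    free(q->pDataDesc);
    q->pBuffer = NULL;
    q->pDataDesc = NULL;
}

static int pq_init(PolyQueue *q)
{
    q->Head = q->Tail = q->Count = 0;
    q->pBuffer   = calloc(SLOT_KNT, SLOT_BYTES);
    q->pDataDesc = calloc(SLOT_KNT, sizeof *q->pDataDesc);
    if (q->pBuffer == NULL || q->pDataDesc == NULL) {
        pq_free(q);
        return 0;
    }
    return 1;
}

/* Store nBytes of *pNew and record its type and size; 0 on failure. */
static int pq_put(PolyQueue *q, const void *pNew, unsigned nBytes, unsigned type)
{
    if (q->Count == SLOT_KNT || nBytes == 0 || nBytes > SLOT_BYTES || type > 7)
        return 0;
    memcpy(q->pBuffer + (size_t)q->Tail * SLOT_BYTES, pNew, nBytes);
    q->pDataDesc[q->Tail].dBytes = nBytes;
    q->pDataDesc[q->Tail].dType  = type;
    q->Tail = (q->Tail + 1) & (SLOT_KNT - 1);
    q->Count++;
    return 1;
}

/* Copy the oldest datum into pOld (room for SLOT_BYTES) and report its
   type tag; returns the number of bytes copied, or 0 if empty. */
static unsigned pq_get(PolyQueue *q, void *pOld, unsigned *pType)
{
    unsigned n;
    assert(pOld != NULL && pType != NULL);
    if (q->Count == 0)
        return 0;
    n = q->pDataDesc[q->Head].dBytes;
    memcpy(pOld, q->pBuffer + (size_t)q->Head * SLOT_BYTES, n);
    *pType = q->pDataDesc[q->Head].dType;
    q->Head = (q->Head + 1) & (SLOT_KNT - 1);
    q->Count--;
    return n;
}
```

    The caller reads into a scratch area of SLOT_BYTES bytes and then uses the returned type tag to decide how to interpret it. Slot positions never move, so the wrap logic is the same as for a single-type queue.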

    I believe this code satisfies the speed and buffer size requirements, and can be made to satisfy the requirement for 6 different data types. I've left the many test fixtures in, in the form of printf() statements, so you can satisfy yourself (or not) that the code works properly. The random number generator demonstrates that the code works for any random head/tail combo.

    // Queue_Small.cpp : Defines the entry point for the console application.
    //
    #include "stdafx.h"
    #include <stdio.h>
    #include <time.h>
    #include <limits.h>
    #include <stdlib.h>
    #include <string.h>     // memcpy, memset (calloc comes from <stdlib.h>)
    #include <math.h>
    
    #define UCHAR unsigned char
    #define ULONG unsigned long
    #define USHRT unsigned short
    #define dbl   double
    /* Queue structure */
    #define QUEUE_FULL_FLAG 1
    #define QUEUE_EMPTY_FLAG -1
    #define QUEUE_OK 0
    //  
    #define BITS_ELE_KNT    12  //12 bits will create 4096 elements numbered 0-4095
    //
    //typedef struct    {
    //  USHRT dBytes:8;     //amount of QUEUE_DESC.EleBytes storage used by datatype
    //  USHRT dType :3; //supports 8 possible data types (0-7)
    //  USHRT dFoo  :5; //unused bits of the unsigned short host's storage
    // }    DATA_DESC;
    //  This descriptor gives a home to all the housekeeping variables
    typedef struct  {
        UCHAR   *pBuffer;   //  pointer to storage, 16 to 4096 elements
        ULONG Tail  :BITS_ELE_KNT;  //  index of next write, with range of 0-4095
        ULONG Head  :BITS_ELE_KNT;  //  index of next read, with range of 0-4095
        ULONG EleBytes  :8;     //  sizeof(element) with range of 0-255 bytes
        // some unused bits will be left over if BITS_ELE_KNT < 12
        USHRT EleKnt    :BITS_ELE_KNT +1;// 1 extra bit for # elements (1-4096)
        //USHRT Flags   :(8*sizeof(USHRT) - BITS_ELE_KNT +1);   //  flags you can use
        USHRT   IsFull  :1;     // queue is full
        USHRT   IsEmpty :1;     // queue is empty
        USHRT   Unused  :1;     // 16th bit of USHRT
    }   QUEUE_DESC;
    
    //  ---------------------------------------------------------------------------
    //  Function prototypes
    QUEUE_DESC *Q_Init(QUEUE_DESC *Q, int BitsForEleKnt, int DataTypeSz);
    int Q_Put(QUEUE_DESC *Q, UCHAR *pNew);
    int Q_Get(UCHAR *pOld, QUEUE_DESC *Q);
    //  ---------------------------------------------------------------------------
    QUEUE_DESC *Q_Init(QUEUE_DESC *Q, int BitsForEleKnt, int DataTypeSz)    {
        memset((void *)Q, 0, sizeof(QUEUE_DESC));//init flags and bit-fields to zero
        //select buffer size from powers of 2 to get the modulo
        //                arithmetic benefit of unsigned bit-fields overflowing.
        //  NOTE: Head and Tail only wrap at the end of the buffer when
        //        BitsForEleKnt == BITS_ELE_KNT
        Q->EleKnt   =   (USHRT)(1u << BitsForEleKnt);   // integer 2^BitsForEleKnt
        Q->EleBytes =   DataTypeSz; // how much storage for each element?
        //  Randomly generated head, tail a test fixture only.
        //      Demonstrates that the queue can be entered at a random point
        //      and still perform properly. Normally zero
        srand((unsigned)time(NULL));    // seed random number generator with current time
        Q->Head = Q->Tail = rand(); // test fixture: random start, truncated by the bit-field
        Q->Head = Q->Tail = 0;      // normal start; comment out to test a random start
        //  allocate queue's storage
        if(NULL == (Q->pBuffer = (UCHAR *)calloc(Q->EleKnt, Q->EleBytes)))  {
            return NULL;
        }   else    {
            return Q;
        }
    }
    //  ---------------------------------------------------------------------------
    int Q_Put(QUEUE_DESC *Q, UCHAR *pNew)
    {
        if(Q->IsFull)   {
            return QUEUE_FULL_FLAG; //  queue is full - nothing was stored
        }
        memcpy(Q->pBuffer + ((size_t)Q->Tail * Q->EleBytes), pNew, Q->EleBytes);
        Q->Tail += 1;   //  the unsigned bit-field MUST wrap around, just like modulo
        if(Q->Tail == Q->Head)  {
            Q->IsFull = 1;  //  tail caught up with head: every slot is in use
        }
        return QUEUE_OK; // No errors
    }
    //  ---------------------------------------------------------------------------
    int Q_Get(UCHAR *pOld, QUEUE_DESC *Q)
    {
        //  Head == Tail means either empty or full; IsFull tells them apart
        if(Q->Head == Q->Tail && !Q->IsFull)    {
            return QUEUE_EMPTY_FLAG; // queue empty - nothing to get
        }
        memcpy(pOld, Q->pBuffer + ((size_t)Q->Head * Q->EleBytes), Q->EleBytes);
        Q->Head += 1;   //  the bit-field MUST wrap around, just like modulo
        Q->IsFull = 0;  //  a slot was just freed
        return QUEUE_OK; // No errors
    }
    //
    //  ---------------------------------------------------------------------------
    int _tmain(int argc, _TCHAR* argv[])    {
    //  constrain buffer size to some power of 2 to force faux modulo arithmetic
        int LoopKnt = 1000000;  //  for benchmarking purposes only
        int k, i=0, Qview=0;
        clock_t start;  //  clock() returns clock_t, not time_t
        QUEUE_DESC Queue, *Q;
        if(NULL == (Q = Q_Init(&Queue, BITS_ELE_KNT, sizeof(int)))) {
            printf("\nProgram failed to initialize. Aborting.\n\n");
            return 0;
        }
    
        start = clock();
        for(k=0; k<LoopKnt; k++)    {
            //printf("\n\n Fill'er up please...\n");
            //Q->Head = Q->Tail = rand();
            for(i=1; i<= Q->EleKnt; i++)    {
                Qview = i*i;
                if(QUEUE_FULL_FLAG == Q_Put(Q, (UCHAR *)&Qview))    {
                    //printf("\nQueue is full at %i \n", i);
                    //printf("\nQueue value of %i should be %i squared", Qview, i);
                    break;
                }
                //printf("\nQueue value of %i should be %i squared", Qview, i);
            }
            //  Get data from queue until completely drained (empty)
            //
            //printf("\n\n Step into the lab, and see what's on the slab... \n");
            Qview = 0;
            for(i=1; i; i++)    {
                if(QUEUE_EMPTY_FLAG == Q_Get((UCHAR *)&Qview, Q))   {
                    //printf("\nQueue value of %i should be %i squared", Qview, i);
                    //printf("\nQueue is empty at %i", i);
                    break;
                }
                //printf("\nQueue value of %i should be %i squared", Qview, i);
            }
            //printf("\nQueue head value is %i, tail is %i\n", Q->Head, Q->Tail);
        }
        printf("\nQueue time was %5.3f to fill & drain %i element queue  %i times \n", 
                         (dbl)(clock()-start)/(dbl)CLOCKS_PER_SEC,Q->EleKnt, LoopKnt);
        printf("\nQueue head value is %i, tail is %i\n", Q->Head, Q->Tail);
        getchar();
        return 0;
    }
    
  • 2020-11-28 01:37

    The simplest solution would be to keep track of the item size and the number of items, and then create a buffer of the appropriate number of bytes:

    #include <stdlib.h>   // malloc, free
    #include <string.h>   // memcpy
    
    typedef struct circular_buffer
    {
        void *buffer;     // data buffer
        void *buffer_end; // end of data buffer
        size_t capacity;  // maximum number of items in the buffer
        size_t count;     // number of items in the buffer
        size_t sz;        // size of each item in the buffer
        void *head;       // pointer to head
        void *tail;       // pointer to tail
    } circular_buffer;
    
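    void cb_init(circular_buffer *cb, size_t capacity, size_t sz)
    {
        cb->capacity = 0;   // stays 0 if allocation fails, so every push sees "full"
        cb->count = 0;
        cb->sz = sz;
        cb->buffer = malloc(capacity * sz);
        if(cb->buffer == NULL){
            // handle error; the braces keep the next statement out of the if
            cb->buffer_end = cb->head = cb->tail = NULL;
            return;
        }
        cb->buffer_end = (char *)cb->buffer + capacity * sz;
        cb->capacity = capacity;
        cb->head = cb->buffer;
        cb->tail = cb->buffer;
    }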
    
    void cb_free(circular_buffer *cb)
    {
        free(cb->buffer);
        // clear out other fields too, just to be safe
    }
    
    void cb_push_back(circular_buffer *cb, const void *item)
    {
        if(cb->count == cb->capacity){
            // handle error: buffer is full
            return; // do not overwrite unread items
        }
        memcpy(cb->head, item, cb->sz);
        cb->head = (char*)cb->head + cb->sz;
        if(cb->head == cb->buffer_end)
            cb->head = cb->buffer;  // wrap to the start
        cb->count++;
    }
    
    void cb_pop_front(circular_buffer *cb, void *item)
    {
        if(cb->count == 0){
            // handle error: buffer is empty
            return; // nothing to copy out
        }
        memcpy(item, cb->tail, cb->sz);
        cb->tail = (char*)cb->tail + cb->sz;
        if(cb->tail == cb->buffer_end)
            cb->tail = cb->buffer;  // wrap to the start
        cb->count--;
    }
    