Split string with delimiters in C

你的背包 2020-11-21 11:56

How do I write a function to split and return an array for a string with delimiters in the C programming language?

char* str = "JAN,FEB,MAR,APR,MAY,JUN,JUL,

20 Answers
  •  粉色の甜心
    2020-11-21 12:17

    I think the following solution is ideal:

    • Doesn't destroy the source string
    • Re-entrant - i.e., you can safely call it from anywhere in one or more threads
    • Portable
    • Handles consecutive separators correctly, yielding an empty token between them
    • Fast and efficient

    Explanation of the code:

    1. Define a structure token to store the start address and length of each token
    2. Allocate enough memory for these in the worst case, which is when str is made up entirely of separators so there are strlen(str) + 1 tokens, all of them empty strings
    3. Scan str recording the address and length of every token
    4. Use this to allocate the output array of the correct size, including an extra space for a NULL sentinel value
    5. Allocate, copy, and add the tokens using the start and length information - use memcpy as it's faster than strcpy and we know the lengths
    6. Free the token address and length array
    7. Return the array of tokens

    #include <stdlib.h>
    #include <string.h>

    /* Start address and length of one token inside the source string */
    typedef struct {
        const char *start;
        size_t len;
    } token;

    char **split(const char *str, char sep)
    {
        char **array;
        size_t start = 0, stop, toks = 0, t;
        /* Worst case: every character is a separator, giving strlen(str) + 1 tokens */
        token *tokens = malloc((strlen(str) + 1) * sizeof(token));
        for (stop = 0; str[stop]; stop++) {
            if (str[stop] == sep) {
                tokens[toks].start = str + start;
                tokens[toks].len = stop - start;
                toks++;
                start = stop + 1;
            }
        }
        /* Mop up the last token */
        tokens[toks].start = str + start;
        tokens[toks].len = stop - start;
        toks++;
        array = malloc((toks + 1) * sizeof(char*));
        for (t = 0; t < toks; t++) {
            /* calloc zero-fills, so the copy is nul-terminated */
            char *copy = calloc(tokens[t].len + 1, 1);
            memcpy(copy, tokens[t].start, tokens[t].len);
            array[t] = copy;
        }
        /* Add a sentinel */
        array[t] = NULL;
        free(tokens);
        return array;
    }

    Note: malloc checking is omitted for brevity.
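
    For readers who want the omitted checks, here is a minimal sketch of one way to add them, together with a helper for the cleanup the caller has to do. It is not the code above: it uses a simpler two-pass approach (count the separators, then copy), and the names split_checked and free_tokens are hypothetical, chosen only for this illustration. It returns NULL if any allocation fails.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: frees a NULL-terminated array of tokens. */
void free_tokens(char **array)
{
    if (array == NULL)
        return;
    for (size_t i = 0; array[i] != NULL; i++)
        free(array[i]);
    free(array);
}

/* Hypothetical variant of split() that checks every allocation.
   Returns a NULL-terminated array, or NULL on allocation failure. */
char **split_checked(const char *str, char sep)
{
    /* Pass 1: the number of tokens is the number of separators plus one */
    size_t count = 1;
    for (const char *p = str; *p; p++)
        if (*p == sep)
            count++;

    char **array = malloc((count + 1) * sizeof *array);
    if (array == NULL)
        return NULL;
    /* Pre-fill with NULL so a partially built array can be freed safely */
    for (size_t i = 0; i <= count; i++)
        array[i] = NULL;

    /* Pass 2: copy each token into its own nul-terminated buffer */
    const char *start = str;
    for (size_t t = 0; t < count; t++) {
        const char *end = strchr(start, sep);
        size_t len = end ? (size_t)(end - start) : strlen(start);
        array[t] = malloc(len + 1);
        if (array[t] == NULL) {
            free_tokens(array);
            return NULL;
        }
        memcpy(array[t], start, len);
        array[t][len] = '\0';
        if (end != NULL)
            start = end + 1;
    }
    return array;
}
```

    A caller would walk the result until it reaches the NULL sentinel and then hand the whole array to free_tokens, which is the responsibility the next paragraph refers to.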

    In general, I wouldn't return an array of char * pointers from a split function like this as it places a lot of responsibility on the caller to free them correctly. An interface I prefer is to allow the caller to pass a callback function and call this for every token, as I have described here: Split a String in C.
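
    As a rough illustration of what such a callback interface might look like (this is an assumption, not the code from the linked answer), the sketch below calls a user-supplied function once per token and allocates nothing. The names token_fn, split_each and count_nonempty are hypothetical. Each token is passed as a pointer into the original string plus a length, so it is not nul-terminated.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: receives one token as (start, len)
   plus an opaque context pointer supplied by the caller. */
typedef void (*token_fn)(const char *start, size_t len, void *ctx);

/* Hypothetical: calls fn once for every token in str and
   returns the number of tokens seen. Nothing is allocated. */
size_t split_each(const char *str, char sep, token_fn fn, void *ctx)
{
    size_t count = 0;
    const char *start = str;
    for (;;) {
        const char *end = strchr(start, sep);
        size_t len = end ? (size_t)(end - start) : strlen(start);
        fn(start, len, ctx);
        count++;
        /* Stop at the end of the string (or if sep is the terminator itself) */
        if (end == NULL || *end == '\0')
            break;
        start = end + 1;
    }
    return count;
}

/* Example callback: counts the non-empty tokens via ctx. */
void count_nonempty(const char *start, size_t len, void *ctx)
{
    (void)start;
    if (len > 0)
        ++*(size_t *)ctx;
}
```

    With this shape the caller owns no memory afterwards: it passes a pointer to its own state as ctx and copies a token only if it needs to keep it.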
