C: creating array of strings from delimited source string

野的像风 2020-12-10 07:52

What would be an efficient way of converting a delimited string into an array of strings in C (not C++)? For example, I might have:

char *input = "valgrind


        
5 Answers
  • 2020-12-10 08:21

    From the strsep(3) manpage on OSX:

       char **ap, *argv[10], *inputstring;
    
       for (ap = argv; (*ap = strsep(&inputstring, " \t")) != NULL;)
               if (**ap != '\0')
                       if (++ap >= &argv[10])
                               break;
    

    Edited for arbitrary # of tokens:

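    char **ap, **argv, *inputstring;
    
    int arglen = 10;
    argv = calloc(arglen, sizeof(char*));
    for (ap = argv; (*ap = strsep(&inputstring, " \t")) != NULL;)
        if (**ap != '\0')
            if (++ap >= &argv[arglen])
            {
                /* grow by 10 slots; realloc takes a size in bytes, not a count of elements */
                arglen += 10;
                argv = realloc(argv, arglen * sizeof(char*));
                /* argv may have moved, so point ap at the first new slot */
                ap = &argv[arglen-10];
            }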
    

    Or something close to that. The above may not work, but if not it's not far off. Building a linked list would be more efficient than continually calling realloc, but that's really beside the point. The point is how best to make use of strsep.
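
    On the realloc concern: one common alternative to a linked list is to grow the array geometrically, so the number of realloc calls stays logarithmic in the token count. The following is only a sketch of that idea; the names strvec and strvec_push are made up for illustration and do not come from any library.

```c
#include <stdlib.h>

/* Hypothetical helper type: a growable, NULL-terminated array of char pointers. */
typedef struct {
    char **items;   /* items[len] is NULL once at least one pointer was pushed */
    size_t len;     /* number of stored pointers */
    size_t cap;     /* number of allocated slots */
} strvec;

/* Append p, doubling the capacity when the array is full.
   Returns 0 on success, -1 on allocation failure (the vector is unchanged). */
int strvec_push(strvec *v, char *p)
{
    if (v->len + 2 > v->cap) {                  /* room for p and the NULL */
        size_t ncap = v->cap ? v->cap * 2 : 8;
        char **tmp = realloc(v->items, ncap * sizeof *tmp);
        if (tmp == NULL)
            return -1;
        v->items = tmp;
        v->cap = ncap;
    }
    v->items[v->len++] = p;
    v->items[v->len] = NULL;                    /* keep the array NULL-terminated */
    return 0;
}
```

    Starting from a zeroed strvec (strvec v = {0};), the strsep loop body would then call strvec_push(&v, tok) for each non-empty token tok, and the caller would free(v.items) when done.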

  • 2020-12-10 08:31

    What about something like:

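    #define MAX_ARGS 16  /* assumed limit; the original snippet left MAX_ARGS undefined */
    
    /* strtok modifies its argument, so use a writable array rather than a string literal */
    char string[] = "valgrind --leak-check=yes --track-origins=yes ./a.out";
    /* calloc zeroes the array, so the list stays NULL-terminated */
    char** args = calloc(MAX_ARGS, sizeof(char*));
    
    char* curToken = strtok(string, " \t");
    
    /* stop one short of MAX_ARGS so the last slot remains NULL */
    for (int i = 0; curToken != NULL && i < MAX_ARGS - 1; ++i)
    {
      args[i] = strdup(curToken);
      curToken = strtok(NULL, " \t");
    }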
    
  • 2020-12-10 08:32

    Did you remember to malloc an extra byte for the terminating null that marks the end of each string?
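
    To make the extra byte concrete, here is a minimal sketch of copying one token out of a larger string. The helper name copy_token is hypothetical; it is not a standard function.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a heap copy of the len characters starting at
   start, or NULL if allocation fails. The caller frees the result. */
char *copy_token(const char *start, size_t len)
{
    char *copy = malloc(len + 1);   /* +1 for the terminating '\0' */
    if (copy == NULL)
        return NULL;
    memcpy(copy, start, len);
    copy[len] = '\0';               /* memcpy does not add a terminator */
    return copy;
}
```

    Allocating only len bytes here would make the write to copy[len] land one byte past the end of the buffer.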

  • 2020-12-10 08:45

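    If you have all of the input in input to begin with, then you can never have more than strlen(input)+1 tokens. If you don't allow "" as a token, then you can never have more than (strlen(input)+1)/2 tokens, because every pair of adjacent tokens needs at least one delimiter between them ("a b c" has length 5 and 3 tokens). So unless input is huge you can safely write the following, where get_token_copy and skip_token stand in for helpers you would write yourself:

    char ** myarray = malloc( ((strlen(input)+1)/2) * sizeof(char*) );
    
    int NumActualTokens = 0;
    char * pToken;
    while ((pToken = get_token_copy(input)) != NULL)
    { 
       myarray[NumActualTokens++] = pToken;   /* post-increment so slot 0 is used */
       input = skip_token(input);
    }
    
    /* shrink the array to the number of tokens actually found */
    myarray = realloc(myarray, NumActualTokens * sizeof(char*));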
    

    As a further optimization, you can keep input around and just replace spaces with \0 and put pointers into the input buffer into myarray[]. No need for a separate malloc for each token unless for some reason you need to free them individually.
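
    The in-place variant described above might look roughly like this. It is a sketch under a few assumptions: the buffer is writable, the caller keeps it alive as long as the array is in use, and empty tokens are skipped. The function name split_in_place is made up for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Split buf in place on any character in delims. Delimiters are overwritten
   with '\0' and the returned NULL-terminated array points into buf, so only
   the array itself needs to be freed. */
char **split_in_place(char *buf, const char *delims, size_t *count)
{
    /* Non-empty tokens need a delimiter between them, so there are at most
       (strlen(buf) + 1) / 2 of them; one more slot holds the final NULL. */
    size_t max = (strlen(buf) + 1) / 2;
    char **out = malloc((max + 1) * sizeof *out);
    size_t n = 0;
    char *p = buf;

    if (out == NULL)
        return NULL;
    for (;;) {
        p += strspn(p, delims);        /* skip any run of delimiters */
        if (*p == '\0')
            break;
        out[n++] = p;                  /* a token starts here */
        p += strcspn(p, delims);       /* advance to the end of the token */
        if (*p == '\0')
            break;
        *p++ = '\0';                   /* terminate the token in place */
    }
    out[n] = NULL;
    if (count != NULL)
        *count = n;
    return out;
}
```

    Because the tokens live inside the original buffer, a single free of the returned array releases everything this function allocated. The trade-off is that the input string is destroyed, so it cannot be a string literal.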

  • 2020-12-10 08:47

    The other answers may look complex to a beginner in C because the code is so compact, so here is a version aimed at beginners. It might be easier to parse the string by hand instead of using strtok, something like this:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <ctype.h>
    
    char **parseInput(const char *str, int *nLen);
    int resizeptr(char ***ptr, int nLen);
    
    int main(void){
        int maxLen = 0;
        int i = 0;
        char **ptr = NULL;
        const char *str = "valgrind --leak-check=yes --track-origins=yes ./a.out";
        ptr = parseInput(str, &maxLen);
        if (!ptr) printf("Error!\n");
        else{
            for (i = 0; i < maxLen; i++) printf("%s\n", ptr[i]);
        }
        for (i = 0; i < maxLen; i++) free(ptr[i]);
        free(ptr);
        return 0;
    }
    
    /* Splits str on whitespace. Returns a malloc'd array of malloc'd strings and
       stores the count in *Index, or returns NULL (count 0) if there are no tokens
       or an allocation fails. For brevity, tokens copied before a failure are not freed. */
    char **parseInput(const char *str, int *Index){
        char **pStr = NULL;
        const char *ptr = str;
        int charPos = 0, indx = 0;   /* charPos: length of the token being scanned */
        *Index = 0;
        for (; *ptr; ptr++){
            if (!isspace((unsigned char)*ptr)) charPos++;
            else if (charPos > 0){
                /* ptr is on the space right after a token of charPos characters */
                if (!resizeptr(&pStr, ++indx)) return NULL;
                pStr[indx-1] = malloc(charPos + 1);   /* +1 for the terminating '\0' */
                if (!pStr[indx-1]) return NULL;
                strncpy(pStr[indx-1], ptr - charPos, charPos);
                pStr[indx-1][charPos] = '\0';
                charPos = 0;
            }
        }
        if (charPos > 0){
            /* the input ended in the middle of a token; ptr is on the final '\0' */
            if (!resizeptr(&pStr, ++indx)) return NULL;
            pStr[indx-1] = malloc(charPos + 1);
            if (!pStr[indx-1]) return NULL;
            strncpy(pStr[indx-1], ptr - charPos, charPos);
            pStr[indx-1][charPos] = '\0';
        }
        *Index = indx;
        return pStr;
    }
    
    /* Grows *ptr to hold nLen pointers. Returns 1 on success; on failure returns 0
       and leaves *ptr untouched. */
    int resizeptr(char ***ptr, int nLen){
        /* realloc(NULL, n) acts like malloc(n), so the first call needs no special case;
           the size is in bytes, hence nLen * sizeof(char*) */
        char **tmp = realloc(*ptr, nLen * sizeof(char*));
        if (!tmp){
            perror("error!");
            return 0;
        }
        *ptr = tmp;
        return 1;
    }
    

    I slightly modified the code to make it easier to follow. The only string function that I used was strncpy. Sure, it is a bit long-winded, but it grows the array of strings dynamically with realloc instead of using a hard-coded MAX_ARGS, which would reserve memory for the whole pointer array even when only 3 or 4 entries are needed. That keeps the memory usage small. The parsing itself is simple: the code walks the string with a pointer and tests each character with isspace. When a space is encountered, it reallocates the array of pointers and mallocs a buffer to hold the token that just ended.

    Notice how the triple pointer is used in the resizeptr function. I thought this would serve as a good example of a simple C program: pointers, realloc, malloc, passing by reference, and the basic elements of parsing a string.

    Hope this helps, Best regards, Tom.
