Concatenating strings in C, which method is more efficient?

迷失自我 2020-11-28 19:51

I came across these two methods to concatenate strings:

Common part:

char* first = "First";
char* second = "Second";
char* both = malloc(strlen(fi         


        
10 answers
  • 2020-11-28 20:44

    Here's some madness for you, I actually went and measured it. Bloody hell, imagine that. I think I got some meaningful results.

    I used a dual core P4, running Windows, using mingw gcc 4.4, building with "gcc foo.c -o foo.exe -std=c99 -Wall -O2".

    I tested method 1 and method 2 from the original post. Initially kept the malloc outside the benchmark loop. Method 1 was 48 times faster than method 2. Bizarrely, removing -O2 from the build command made the resulting exe 30% faster (haven't investigated why yet).

    Then I added a malloc and free inside the loop. That slowed down method 1 by a factor of 4.4. Method 2 slowed down by a factor of 1.1.

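    So malloc + strlen + free DO NOT dominate the profile enough to cancel out the benefit of avoiding sprintf: with them inside the loop, method 1 is still roughly 12 times faster than method 2 (48 × 1.1 / 4.4).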

    Here's the code I used (except that the loops were implemented with < instead of !=, which broke the HTML rendering of this post):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Method 1: strcpy + strcat. Runs 48 times as many iterations as b(). */
    void a(const char *first, const char *second, char *both)
    {
        for (int i = 0; i != 1000000 * 48; i++)
        {
            strcpy(both, first);
            strcat(both, " ");
            strcat(both, second);
        }
    }

    /* Method 2: sprintf. */
    void b(const char *first, const char *second, char *both)
    {
        for (int i = 0; i != 1000000 * 1; i++)
            sprintf(both, "%s %s", first, second);
    }

    int main(void)
    {
        const char *first = "First";
        const char *second = "Second";
        /* +2: one byte for the space, one for the terminating '\0' */
        char *both = malloc(strlen(first) + strlen(second) + 2);
        if (both == NULL)
            return 1;

        // Takes 3.7 sec with optimisations, 2.7 sec WITHOUT optimisations!
        a(first, second, both);

        // Takes 3.7 sec with or without optimisations
        //b(first, second, both);

        free(both);
        return 0;
    }
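
The post does not show the variant with malloc and free inside the loop. A minimal sketch of what it might look like follows; the names a_with_alloc and time_a_with_alloc are hypothetical, and timing with clock() is an assumption rather than the method used for the measurements above. An optimising compiler may elide some of this work, so treat any numbers it prints with caution.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical variant of a(): allocate and free the buffer on every pass.
   Returns the number of iterations completed (stops early if malloc fails). */
long a_with_alloc(const char *first, const char *second, long iterations)
{
    long done = 0;
    for (long i = 0; i < iterations; i++)
    {
        /* +2: one byte for the space, one for the terminating '\0' */
        char *both = malloc(strlen(first) + strlen(second) + 2);
        if (both == NULL)
            break;
        strcpy(both, first);
        strcat(both, " ");
        strcat(both, second);
        done++;
        free(both);
    }
    return done;
}

/* Time one run with clock() and return the elapsed CPU time in seconds. */
double time_a_with_alloc(const char *first, const char *second, long iterations)
{
    clock_t start = clock();
    long done = a_with_alloc(first, second, iterations);
    double seconds = (double)(clock() - start) / CLOCKS_PER_SEC;
    printf("malloc + strcpy/strcat + free: %ld iterations in %f seconds\n",
           done, seconds);
    return seconds;
}
```

Calling time_a_with_alloc("First", "Second", 48000000L) would match the iteration count used for a() above.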
    
  • 2020-11-28 20:46
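    size_t lf = strlen(first);
    size_t ls = strlen(second);

    /* +2: one byte for the space, one for the terminating '\0' */
    char *both = (char*) malloc((lf + ls + 2) * sizeof(char));

    strcpy(both, first);

    both[lf] = ' ';                /* overwrite the '\0' written by strcpy */
    strcpy(&both[lf+1], second);   /* copies second and its '\0' */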
    
  • 2020-11-28 20:46

    sprintf() is designed to handle far more than just strings; strcat() is a specialist. But I suspect that you are sweating the small stuff. C strings are fundamentally inefficient in ways that make the differences between these two proposed methods insignificant. Read "Back to Basics" by Joel Spolsky for the gory details.

    This is an instance where C++ generally performs better than C. For heavyweight string handling, using std::string is likely to be more efficient and certainly safer.

    [edit]

    [2nd edit] Corrected the code (too many iterations in the C string implementation); the timings and conclusion have changed accordingly.

    I was surprised at Andrew Bainbridge's comment that std::string was slower, but he did not post complete code for this test case. I modified his (automating the timing) and added a std::string test. The test was on VC++ 2008 (native code) with default "Release" options (i.e. optimised), Athlon dual core, 2.6GHz. Results:

    C string handling = 0.023000 seconds
    sprintf           = 0.313000 seconds
    std::string       = 0.500000 seconds
    

    So here strcat() is faster by far (your mileage may vary depending on compiler and options), despite the inherent inefficiency of the C string convention, which supports my original suggestion that sprintf() carries a lot of baggage not required for this purpose. The strcat() approach remains by far the least readable and least safe, however, so when performance is not critical it has little merit IMO.

    I also tested a std::stringstream implementation, which was far slower again, but for complex string formatting still has merit.

    Corrected code follows:

    #include <cstdio>
    #include <cstdlib>   // malloc, free
    #include <cstring>
    #include <ctime>
    #include <string>

    // C string handling: strcpy + strcat
    void a(const char *first, const char *second, char *both)
    {
        for (int i = 0; i != 1000000; i++)
        {
            strcpy(both, first);
            strcat(both, " ");
            strcat(both, second);
        }
    }

    // sprintf
    void b(const char *first, const char *second, char *both)
    {
        for (int i = 0; i != 1000000; i++)
            sprintf(both, "%s %s", first, second);
    }

    // std::string (the C buffer is not used here)
    void c(const char *first, const char *second, char * /* both */)
    {
        std::string first_s(first);
        std::string second_s(second);
        std::string both_s(second);

        for (int i = 0; i != 1000000; i++)
            both_s = first_s + " " + second_s;
    }

    int main(void)
    {
        const char *first = "First";
        const char *second = "Second";
        // +2: one byte for the space, one for the terminating '\0'
        char *both = (char*) malloc((strlen(first) + strlen(second) + 2) * sizeof(char));
        if (both == NULL)
            return 1;
        clock_t start;

        start = clock();
        a(first, second, both);
        printf("C string handling = %f seconds\n", (double)(clock() - start) / CLOCKS_PER_SEC);

        start = clock();
        b(first, second, both);
        printf("sprintf           = %f seconds\n", (double)(clock() - start) / CLOCKS_PER_SEC);

        start = clock();
        c(first, second, both);
        printf("std::string       = %f seconds\n", (double)(clock() - start) / CLOCKS_PER_SEC);

        free(both);
        return 0;
    }
    
  • 2020-11-28 20:50

    For readability, I'd go with

    char * s = malloc(snprintf(NULL, 0, "%s %s", first, second) + 1);
    sprintf(s, "%s %s", first, second);
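
The two-call snprintf pattern above skips error handling for brevity. A minimal sketch with the checks added might look like the following; join_snprintf is a hypothetical name, and returning NULL on any failure is an assumption about what the caller wants.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a newly allocated "first second",
   or NULL on failure. The caller must free() the result. */
char *join_snprintf(const char *first, const char *second)
{
    /* With a NULL buffer and size 0, snprintf only reports the length
       the output would have, not counting the terminating '\0'. */
    int len = snprintf(NULL, 0, "%s %s", first, second);
    if (len < 0)
        return NULL;

    char *s = malloc((size_t)len + 1);
    if (s == NULL)
        return NULL;

    /* Passing the real buffer size keeps the second call bounded. */
    snprintf(s, (size_t)len + 1, "%s %s", first, second);
    return s;
}
```

Using snprintf for the second call as well costs nothing here and guards against the inputs changing between the two calls.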
    

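    If your platform supports GNU extensions, you could also use asprintf(), which allocates the buffer for you (free() it when you are done):

    char * s = NULL;
    if (asprintf(&s, "%s %s", first, second) < 0)
        s = NULL; /* on failure the contents of s are undefined */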
    

    If you're stuck with the MS C Runtime, you have to use _scprintf() to determine the length of the resulting string:

    char * s = malloc(_scprintf("%s %s", first, second) + 1);
    sprintf(s, "%s %s", first, second);
    

    The following will most likely be the fastest solution:

    size_t len1 = strlen(first);
    size_t len2 = strlen(second);
    
    char * s = malloc(len1 + len2 + 2);
    memcpy(s, first, len1);
    s[len1] = ' ';
    memcpy(s + len1 + 1, second, len2 + 1); // includes terminating null
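
The memcpy version is likely fast because both lengths are already known, so nothing has to be rescanned for a terminator the way strcat() rescans its destination. Wrapped in a function with an allocation check, it might look like this sketch; join_memcpy is a hypothetical name, not something from the answer.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated "first second",
   or NULL if the allocation fails. The caller must free() the result. */
char *join_memcpy(const char *first, const char *second)
{
    size_t len1 = strlen(first);
    size_t len2 = strlen(second);

    /* +2: one byte for the space, one for the terminating '\0' */
    char *s = malloc(len1 + len2 + 2);
    if (s == NULL)
        return NULL;

    memcpy(s, first, len1);                  /* copy first without its '\0' */
    s[len1] = ' ';
    memcpy(s + len1 + 1, second, len2 + 1);  /* copy second with its '\0' */
    return s;
}
```

A call such as join_memcpy(first, second) returns a buffer that the caller owns and should release with free().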
    