How do I trim leading/trailing whitespace in a standard way?

一个人的身影 2020-11-22 02:06

Is there a clean, preferably standard method of trimming leading and trailing whitespace from a string in C? I'd roll my own, but I would think this is a common problem wit

30 Answers
  • 2020-11-22 02:36

    I'm only including code because the code posted so far seems suboptimal (and I don't have the rep to comment yet.)

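    #include <ctype.h>
    #include <stdlib.h>
    #include <string.h>
    
    /* Trims s in place; the result still starts at s. */
    void inplace_trim(char* s)
    {
        size_t start, end = strlen(s);
        /* Cast to unsigned char: passing a negative char to isspace() is undefined. */
        for (start = 0; isspace((unsigned char)s[start]); ++start) {}
        if (s[start]) {
            while (end > start && isspace((unsigned char)s[end-1]))
                --end;
            memmove(s, &s[start], end - start);
        }
        s[end - start] = '\0';
    }
    
    /* Returns a newly allocated trimmed copy (NULL if allocation fails); the caller frees it. */
    char* copy_trim(const char* s)
    {
        size_t start, end;
        for (start = 0; isspace((unsigned char)s[start]); ++start) {}
        /* Stop at start so an all-whitespace string yields length 0, not a negative length. */
        for (end = strlen(s); end > start && isspace((unsigned char)s[end-1]); --end) {}
        return strndup(s + start, end - start);
    }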
    

    strndup() began as a GNU extension and has been part of POSIX since POSIX.1-2008. If you don't have it or something equivalent, roll your own. For example:

    r = strdup(s + start);
    if (r != NULL)
        r[end-start] = '\0';
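
    The strdup() fallback above copies the whole tail of the string before cutting it short. A minimal sketch of a bounded alternative follows, assuming only malloc() and memcpy() are available; the name dup_n is hypothetical and not part of any library.

    ```c
    #include <assert.h>
    #include <stdlib.h>
    #include <string.h>

    /* Hypothetical stand-in for strndup(): copies at most n bytes of s into
     * freshly allocated memory and always NUL-terminates the copy.
     * Returns NULL if malloc() fails; the caller must free() the result. */
    char *dup_n(const char *s, size_t n)
    {
        assert(s != NULL);

        /* Bounded length: stop at n bytes or at the terminator, whichever comes first. */
        size_t len = 0;
        while (len < n && s[len] != '\0')
            ++len;

        char *r = malloc(len + 1);
        if (r != NULL) {
            memcpy(r, s, len);
            r[len] = '\0';
        }
        return r;
    }
    ```

    In copy_trim() it would be called as dup_n(s + start, end - start) in place of strndup().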
    
  • 2020-11-22 02:37

    Here's a solution similar to @adam-rosenfield's in-place modification routine, but without needlessly resorting to strlen(). As in @jkramer's answer, the string is left-adjusted within the buffer, so you can free the same pointer. It is not optimal for large strings, since it copies character by character instead of using memmove. It includes the ++/-- operators that @jfm3 mentions. FCTX-based unit tests are included.

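    #include <ctype.h>
    
    void trim(char * const a)
    {
        char *p = a, *q = a;
        /* Cast to unsigned char: passing a negative char to isspace() is undefined. */
        while (isspace((unsigned char)*q))            ++q;   /* skip leading whitespace  */
        while (*q)                                    *p++ = *q++;  /* shift the rest left */
        *p = '\0';
        while (p > a && isspace((unsigned char)*--p)) *p = '\0';    /* cut trailing whitespace */
    }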
    
    /* See http://fctx.wildbearsoftware.com/ */
    #include "fct.h"
    
    FCT_BGN()
    {
        FCT_QTEST_BGN(trim)
        {
            { char s[] = "";      trim(s); fct_chk_eq_str("",    s); } // Trivial
            { char s[] = "   ";   trim(s); fct_chk_eq_str("",    s); } // Trivial
            { char s[] = "\t";    trim(s); fct_chk_eq_str("",    s); } // Trivial
            { char s[] = "a";     trim(s); fct_chk_eq_str("a",   s); } // NOP
            { char s[] = "abc";   trim(s); fct_chk_eq_str("abc", s); } // NOP
            { char s[] = "  a";   trim(s); fct_chk_eq_str("a",   s); } // Leading
            { char s[] = "  a c"; trim(s); fct_chk_eq_str("a c", s); } // Leading
            { char s[] = "a  ";   trim(s); fct_chk_eq_str("a",   s); } // Trailing
            { char s[] = "a c  "; trim(s); fct_chk_eq_str("a c", s); } // Trailing
            { char s[] = " a ";   trim(s); fct_chk_eq_str("a",   s); } // Both
            { char s[] = " a c "; trim(s); fct_chk_eq_str("a c", s); } // Both
    
            // Villemoes pointed out an edge case that corrupted memory.  Thank you.
            // http://stackoverflow.com/questions/122616/#comment23332594_4505533
            {
              char s[] = "a     ";       // Buffer with whitespace before s + 2
              trim(s + 2);               // Trim "    " containing only whitespace
              fct_chk_eq_str("", s + 2); // Ensure correct result from the trim
              fct_chk_eq_str("a ", s);   // Ensure preceding buffer not mutated
            }
    
            // doukremt suggested I investigate this test case but
            // did not indicate the specific behavior that was objectionable.
            // http://stackoverflow.com/posts/comments/33571430
            {
              char s[] = "         foobar";  // Shifted across whitespace
              trim(s);                       // Trim
              fct_chk_eq_str("foobar", s);   // Leading string is correct
    
              // Here is what the algorithm produces:
              char r[16] = { 'f', 'o', 'o', 'b', 'a', 'r', '\0', ' ',                     
                             ' ', 'f', 'o', 'o', 'b', 'a', 'r', '\0'};
              fct_chk_eq_int(0, memcmp(s, r, sizeof(s)));
            }
        }
        FCT_QTEST_END();
    }
    FCT_END();
    
  • 2020-11-22 02:37

    I know there are already many answers, but I'm posting mine here to see whether my solution is good enough.

    #include <ctype.h>
    #include <stddef.h>
    
    // Skips leading whitespace in `str`, then copies at most `n - 1` chars
    // into the `out` buffer, stopping early at the first '\0', and finally
    // writes '\0' just after the last non-whitespace char that was copied.
    // Returns the length of the trimmed string, not counting '\0', like strlen().
    size_t trim(char *out, size_t n, const char *str)
    {
        // no room even for the terminator: do nothing
        if(n == 0) return 0;    
    
        // advance str to the first non-whitespace char
        while(isspace((unsigned char)*str)) str++;    
    
        if(*str == '\0') {
            out[0] = '\0';
            return 0;
        }    
    
        size_t i = 0;    
    
        // copy chars to out until '\0' or i == n - 1
        for(i = 0; i < n - 1 && *str != '\0'; i++){
            out[i] = *str++;
        }    
    
        // drop trailing whitespace; the i > 0 check prevents underflow when n == 1
        while(i > 0 && isspace((unsigned char)out[i - 1])) i--;    
    
        out[i] = '\0';
        return i;
    }
    
  • 2020-11-22 02:38

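    Another one, with one line doing the real job. Note that %s stops at the first whitespace character after the token, so this only works when the text to keep contains no internal whitespace:

    #include <stdio.h>
    
    int main(void)
    {
       const char *target = "   haha   ";
       char buf[256] = "";
       // Trimming on both sides occurs here; the width 255 keeps sscanf from overflowing buf
       if (sscanf(target, "%255s", buf) != 1)
           buf[0] = '\0';                  // no non-whitespace token was found
       printf("<%s>\n", buf);
    }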
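
    To make the limits of the sscanf() approach concrete, here is a small sketch that wraps the call in a helper. The name first_token and the fixed capacity of 256 are assumptions made for illustration. Because %s reads a single whitespace-delimited token, an input such as "  hello world  " yields only "hello".

    ```c
    #include <assert.h>
    #include <stdio.h>

    /* Hypothetical helper: stores the first whitespace-delimited token of src
     * in buf (capacity 256). Returns 1 on success, or 0 if src holds no token,
     * in which case buf is left as the empty string. */
    int first_token(const char *src, char buf[256])
    {
        assert(src != NULL && buf != NULL);
        buf[0] = '\0';
        /* %255s skips leading whitespace, then reads up to 255 non-whitespace chars. */
        return sscanf(src, "%255s", buf) == 1;
    }
    ```

    This is fine for single-word fields, but a string with internal spaces needs one of the pointer-based or memmove-based routines instead.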
    
  • 2020-11-22 02:38

    Personally, I'd roll my own. You can use strtok, but you need to take care with doing so (particularly if you're removing leading characters) that you know what memory is what.

    Getting rid of trailing spaces is easy and pretty safe: counting back from the end, you overwrite the first character of the trailing run of spaces with a '\0'. Getting rid of leading spaces means moving things around. If you want to do it in place (probably sensible), you can keep shifting everything back one character until there's no leading space. Or, to be more efficient, you could find the index of the first non-space character and shift everything back by that number in one step. Or you could just use a pointer to the first non-space character (but then you need to be careful in the same way as you do with strtok, because that pointer is no longer the start of the allocation).
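
    The options described above can be sketched as follows. This is one possible reading of them rather than a definitive implementation, and the names trim_in_place and trim_view are hypothetical. The first function shifts the text once with memmove() so the result still starts at the original pointer; the second returns a pointer into the buffer and moves nothing.

    ```c
    #include <assert.h>
    #include <ctype.h>
    #include <string.h>

    /* In-place variant: after the call, the trimmed text starts at s. */
    void trim_in_place(char *s)
    {
        assert(s != NULL);

        /* Trailing: count back from the end and terminate early. */
        size_t len = strlen(s);
        while (len > 0 && isspace((unsigned char)s[len - 1]))
            --len;
        s[len] = '\0';

        /* Leading: find the first non-space index, then shift once. */
        size_t start = 0;
        while (isspace((unsigned char)s[start]))
            ++start;
        if (start > 0)
            memmove(s, s + start, len - start + 1);  /* +1 copies the '\0' */
    }

    /* Pointer variant: returns a pointer to the first non-space character.
     * The result may differ from s, so never pass it to free(). */
    char *trim_view(char *s)
    {
        assert(s != NULL);
        while (isspace((unsigned char)*s))
            ++s;
        size_t len = strlen(s);
        while (len > 0 && isspace((unsigned char)s[len - 1]))
            --len;
        s[len] = '\0';
        return s;
    }
    ```

    The in-place variant costs one extra pass over the data but keeps ownership simple. The pointer variant is cheaper, yet the caller has to hold on to the original pointer for as long as the buffer needs to be released.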

  • 2020-11-22 02:38
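
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    int main(void)
    {
      const char *text = "            Hel  lo    wo           rl   d G    eo rocks!!!    by shahil    sucks b i          g       tim           e";
    
      /* Allocate room for the whole string plus its terminator. */
      char *ptr = malloc(strlen(text) + 1);
      if (ptr == NULL)
          return 1;
      strcpy(ptr, text);
    
      /* Note: this removes every space, not only leading and trailing ones. */
      size_t i = 0, j = 0;   /* i: write index, j: read index */
    
      while (ptr[j] != '\0')
      {
          if (ptr[j] != ' ')
              ptr[i++] = ptr[j];   /* keep non-space characters, compacting left */
          j++;
      }
      ptr[i] = '\0';
    
      printf("\noutput-%s\n", ptr);
      free(ptr);
      return 0;
    }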