Remove extra white spaces in C++

日久生厌 2021-02-05 12:01

I tried to write a script that removes extra white spaces, but I didn't manage to finish it.

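Basically I want to transform a string such as abc sssd g g sdg gg gf, where the words are separated by runs of several spaces, into the same text with a single space between words.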

12 answers
  • 2021-02-05 12:05

    There are already plenty of nice solutions. I propose an alternative based on a dedicated algorithm from <algorithm> that is meant to eliminate consecutive duplicates: unique_copy():

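    #include <algorithm>
    #include <cctype>
    #include <iostream>
    #include <iterator>
    #include <string>
    
    void remove_extra_whitespaces(const std::string &input, std::string &output)
    {
        output.clear();  // unless you want to append to the end of the existing string...
        // two adjacent characters count as "duplicates" when both are whitespace
        std::unique_copy(input.begin(), input.end(), std::back_inserter(output),
                         [](unsigned char a, unsigned char b) { return std::isspace(a) && std::isspace(b); });
        std::cout << output << std::endl;
    }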
    

    Note that I changed from C-style strings to the safer and more powerful C++ strings.

    Edit: if keeping C-style strings is required in your code, you could use almost the same code, but with pointers instead of iterators. That's the magic of C++.
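
    As a rough sketch of the pointer-based variant mentioned in the edit: raw pointers are valid iterators, so unique_copy() can read from one char array and write to another. The function name remove_extra_whitespaces_c is hypothetical, and the sketch assumes that the two buffers do not overlap and that output has room for at least strlen(input) + 1 characters.

```cpp
#include <algorithm>
#include <cctype>
#include <cstring>

// Hypothetical helper (not from the original answer): collapses each run of
// whitespace in `input` to its first character and writes the result to
// `output`. Assumes separate buffers and enough room in `output`.
void remove_extra_whitespaces_c(const char *input, char *output)
{
    const char *last = input + std::strlen(input);
    // Pointers act as iterators; the predicate marks two adjacent
    // whitespace characters as "equal", so only the first one is copied.
    char *out_end = std::unique_copy(input, last, output,
        [](unsigned char a, unsigned char b) {
            return std::isspace(a) && std::isspace(b);
        });
    *out_end = '\0';  // unique_copy does not write a terminator
}
```

    Note that unique_copy() keeps the first character of each run, so a run that starts with a tab stays a tab rather than becoming a space.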

  • 2021-02-05 12:13

    Since you are writing C-style code, here's a way to do what you want. Note that this version also removes '\r' and '\n', which are line breaks (of course, it's up to you whether you consider those whitespace or not).

    This function makes a single pass and works in place, so it should be as fast as or faster than the alternatives, and no memory allocation takes place even when it's called with a std::string (I've overloaded it).

    #include <cstdio>
    #include <cstring>
    
    int remove_whitesaces(char *p)
    {
        int len = static_cast<int>(std::strlen(p));
        int new_len = 0;
        bool space = false;  // true while a run of blanks is pending
    
        for (int i = 0; i < len; i++)
        {
            switch (p[i])
            {
            case ' ': space = true;  break;
            case '\t': space = true;  break;
            case '\n': break; // you could set space true for \r and \n
            case '\r': break; // if you consider them spaces, I just ignore them.
            default:
                // emit one space per run, but never at the start of the output
                if (space && new_len > 0)
                    p[new_len++] = ' ';
                p[new_len++] = p[i];
                space = false;
            }
        }
    
        p[new_len] = '\0';
    
        return new_len;
    }
    
    int main()
    {
        char temp[] = " alsdasdl   gasdasd  ee";
        remove_whitesaces(temp);
        std::printf("%s\n", temp);
    }
    
    // and you can use it with strings too,
    
    inline int remove_whitesaces(std::string &str)
    {
        int len = remove_whitesaces(&str[0]);
        str.resize(len);
        return len; // returning len for consistency with the primary function,
                    // but you can return std::string instead.
    }
    
    // Again, no memory allocation takes place: resize() does not free
    // memory here, because the new length is never greater than the old one.
    

    If you take a brief look at a C++ Standard Library implementation, you will notice that many functions that return std::string or other std:: objects are basically wrappers around a well-written C function. So don't be afraid to use C functions in C++ applications if they are well written; you can overload them to support std::string and such.

    For example, in Visual Studio 2015, std::to_string is written exactly like this:

    inline string to_string(int _Val)
        {   // convert int to string
        return (_Integral_to_string("%d", _Val));
        }
    
    inline string to_string(unsigned int _Val)
        {   // convert unsigned int to string
        return (_Integral_to_string("%u", _Val));
        }
    

    and _Integral_to_string is a wrapper around the C function sprintf_s:

    template<class _Ty> inline
        string _Integral_to_string(const char *_Fmt, _Ty _Val)
        {   // convert _Ty to string
        static_assert(is_integral<_Ty>::value,
            "_Ty must be integral");
        char _Buf[_TO_STRING_BUF_SIZE];
        int _Len = _CSTD sprintf_s(_Buf, _TO_STRING_BUF_SIZE, _Fmt, _Val);
        return (string(_Buf, _Len));
        }
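
    Following the same wrapper idea, here is a minimal sketch of a value-returning overload, as hinted at in the comment on the std::string overload earlier in this answer. The names collapse_in_place and collapsed are hypothetical, and collapse_in_place is only a compact stand-in for the in-place routine, not the author's exact code. Unlike the in-place overload, this one does allocate once, because the argument is copied.

```cpp
#include <cstddef>
#include <string>

// Compact stand-in for an in-place C-style routine (hypothetical name).
// Collapses runs of ' ' and '\t' to one space, drops '\n' and '\r', and
// returns the new length.
std::size_t collapse_in_place(char *p)
{
    std::size_t new_len = 0;
    bool space = false;
    for (std::size_t i = 0; p[i] != '\0'; ++i) {
        if (p[i] == ' ' || p[i] == '\t') {
            space = true;                 // remember a pending separator
        } else if (p[i] != '\n' && p[i] != '\r') {
            if (space && new_len > 0)
                p[new_len++] = ' ';       // one space between words
            p[new_len++] = p[i];
            space = false;
        }
    }
    p[new_len] = '\0';
    return new_len;
}

// Value-returning wrapper: takes a copy, cleans it in place, returns it.
std::string collapsed(std::string str)
{
    str.resize(collapse_in_place(&str[0]));
    return str;
}
```

    Taking the parameter by value keeps the caller's string untouched, and the result is moved out rather than copied a second time.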
    
  • 2021-02-05 12:14

    Here's a simple, non-C++11 solution, using the same remove_extra_whitespaces() signature as in the question:

    #include <cstdio>
    
    void remove_extra_whitespaces(char* input, char* output)
    {
        int inputIndex = 0;
        int outputIndex = 0;
        while(input[inputIndex] != '\0')
        {
            output[outputIndex] = input[inputIndex];
    
            if(input[inputIndex] == ' ')
            {
                while(input[inputIndex + 1] == ' ')
                {
                    // skip over any extra spaces
                    inputIndex++;
                }
            }
    
            outputIndex++;
            inputIndex++;
        }
    
        // null-terminate output
        output[outputIndex] = '\0';
    }
    
    int main(int argc, char **argv)
    {
        char input[0x255] = "asfa sas    f f dgdgd  dg   ggg";
        char output[0x255] = "NO_OUTPUT_YET";
        remove_extra_whitespaces(input,output);
    
        printf("input: %s\noutput: %s\n", input, output);
    
        return 0; // 0 signals success to the caller
    }
    

    Output:

    input: asfa sas    f f dgdgd  dg   ggg
    output: asfa sas f f dgdgd dg ggg
    
  • 2021-02-05 12:15

    A simple program that removes extra white spaces with plain loops over a char array, without using any standard algorithms.

    #include <iostream>
    #include <cstring>
    using namespace std;
    
    int main()
    {
      char str[1200];
      cout << "Enter string:\n";
      cin.getline(str, sizeof str);   // gets() is unsafe and was removed in C++14
      int n = strlen(str);
      for (int i = 0; i < n; i++)
      {
          if (str[i] == ' ')
          {
              // find the first non-space character after this run of spaces
              int pos = i + 1;
              while (pos < n && str[pos] == ' ')
                  pos++;
              // if the run is longer than one space, shift the rest left over it
              if (pos > i + 1)
              {
                  int k = i + 1;
                  while (pos <= n)    // <= n so the terminating '\0' is copied too
                      str[k++] = str[pos++];
                  n = k - 1;          // new length after the shift
              }
          }
      }
      cout << str << '\n';
    }
    
  • 2021-02-05 12:18

    For in-place modification you can apply the erase-remove technique:

    #include <string>
    #include <iostream>
    #include <algorithm>
    #include <cctype>
    
    int main()
    {
        std::string input {"asfa sas    f f dgdgd  dg   ggg"};
        bool prev_is_space = true;
        input.erase(std::remove_if(input.begin(), input.end(), [&prev_is_space](unsigned char curr) {
            bool r = std::isspace(curr) && prev_is_space;
            prev_is_space = std::isspace(curr);
            return r;
    
        }), input.end());
    
        std::cout << input << "\n";
    }
    

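    So std::remove_if first shifts the characters you want to keep toward the front of the string and returns an iterator to the new logical end; erase then truncates the leftover tail.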
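
    To make the two steps visible, the following sketch runs only the remove step and reports where the new logical end lies; the size of the string does not change until erase is called. The function name kept_after_remove is hypothetical. It also assumes that the predicate is applied to the characters in order, which is how common implementations behave; the state is captured by reference so that any copies of the lambda share it.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: performs only the "remove" half of erase-remove and
// returns how many characters were kept. s.size() is left unchanged.
std::size_t kept_after_remove(std::string &s)
{
    bool prev_is_space = true;  // true at the start, so leading spaces are dropped
    auto new_end = std::remove_if(s.begin(), s.end(),
        [&prev_is_space](unsigned char curr) {
            bool is_sp = std::isspace(curr) != 0;
            bool drop = is_sp && prev_is_space;  // drop every space after the first
            prev_is_space = is_sp;
            return drop;
        });
    return static_cast<std::size_t>(new_end - s.begin());
}
```

    After the call, the first kept_after_remove(s) characters hold the cleaned text and the remaining characters have unspecified values, which is why the erase step is still needed.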


    The great advantage of C++ is that it is universal enough to let you port your code to plain C static strings with only a few modifications:

    #include <algorithm>
    #include <cctype>
    #include <iostream>
    #include <iterator>
    
    void erase(char * p) {
        // note that this only works because the text lives in a fixed array,
        // so we do not need to rearrange memory, just move the terminator
        *p = 0; 
    }
    
    int main()
    {
        char input [] {"asfa sas    f f dgdgd  dg   ggg"};
        bool prev_is_space = true;
        // std::end(input) - 1 stops before the terminating '\0', so erase() never writes past the array
        erase(std::remove_if(std::begin(input), std::end(input) - 1, [&prev_is_space](unsigned char curr) {
            bool r = std::isspace(curr) && prev_is_space;
            prev_is_space = std::isspace(curr);
            return r;
    
        }));
    
        std::cout << input << "\n";
    }
    

    Interestingly, the remove step here is independent of the string representation: the same std::remove_if call works with std::string without any modification.
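
    One way to express that independence is to put the remove step into a single function template and let thin overloads handle the erase step for each representation. This is only a sketch; the names squeeze_spaces and squeeze are hypothetical and not part of the original answer.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <cstring>
#include <string>

// Shared remove step: compacts [first, last) and returns the new logical end.
template <class FwdIt>
FwdIt squeeze_spaces(FwdIt first, FwdIt last)
{
    bool prev_is_space = true;
    return std::remove_if(first, last, [&prev_is_space](unsigned char curr) {
        bool is_sp = std::isspace(curr) != 0;
        bool drop = is_sp && prev_is_space;
        prev_is_space = is_sp;
        return drop;
    });
}

// std::string: erase the leftover tail.
void squeeze(std::string &s)
{
    s.erase(squeeze_spaces(s.begin(), s.end()), s.end());
}

// char array: process only the text before the terminator, then write a
// new terminator at the returned position (never past the old one).
template <std::size_t N>
void squeeze(char (&buf)[N])
{
    char *text_end = buf + std::strlen(buf);
    *squeeze_spaces(buf, text_end) = '\0';
}
```

    Both overloads share the same predicate and differ only in how they shorten the container, which mirrors the split between remove and erase.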

  • 2021-02-05 12:19

    I don't know if this helps, but this is how I did it in my homework. The only case where it might break a bit is when there are several spaces at the beginning of the string: they are collapsed to one space rather than removed, so the result still starts with a space, as in " wor ds".

    void ShortenSpace(string &usrStr){
       // i + 1 < size() avoids the unsigned underflow of size() - 1 on an empty string
       for (size_t i = 0; i + 1 < usrStr.size(); ) {
          if ((usrStr[i] == ' ') && (usrStr[i + 1] == ' ')) {
             usrStr.erase(i + 1, 1);  // drop the second space, then re-check position i
          } else {
             ++i;
          }
       }
    }
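
    For the leading-space case described above, one possible follow-up step is to trim the ends after collapsing. The sketch below is an assumption about how that could be done, not part of the original homework solution; the name TrimSpaces is hypothetical and it only handles the space character, like ShortenSpace.

```cpp
#include <cstddef>
#include <string>

// Hypothetical companion to ShortenSpace: removes all spaces at the
// beginning and end of the string, leaving inner spaces untouched.
void TrimSpaces(std::string &s)
{
    std::size_t first = s.find_first_not_of(' ');
    if (first == std::string::npos) {   // empty or all spaces
        s.clear();
        return;
    }
    std::size_t last = s.find_last_not_of(' ');
    s.erase(last + 1);   // cut the trailing spaces first so `first` stays valid
    s.erase(0, first);   // then cut the leading spaces
}
```

    Calling it after the collapse step turns " wor ds" into "wor ds".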
    