Remove extra white spaces in C++

日久生厌 2021-02-05 12:01

I tried to write a script that removes extra white spaces, but I didn't manage to finish it.

Basically I want to transform "abc  sssd g g sdg    gg  gf" into "abc sssd g g sdg gg gf".

12 answers
  • 2021-02-05 12:20

    Well, here is a longish (but easy) solution that does not use pointers. It can be optimized further, but hey, it works.

    #include <iostream>
    #include <string>
    using namespace std;
    void removeExtraSpace(string str);
    int main(){
        string s;
        cout << "Enter a string with extra spaces: ";
        getline(cin, s);
        removeExtraSpace(s);
        return 0;
    }
    void removeExtraSpace(string str){
        int len = str.size();
        if(len==0){
            cout << "Simplified String: " << endl;
            cout << "I would appreciate it if you could enter more than 0 characters. " << endl;
            return;
        }
        // Variable-length arrays are not standard C++, so use std::string buffers
        string ch1 = str;      // copy of the input characters
        string ch2(len, ' ');  // output buffer, never longer than the input
        //Computing index of 1st non-space character
        int pos=0;
        for(int i=0; i<len; i++){
            if(ch1[i] != ' '){
                pos = i;
                break;
            }
        }
        int cons_arr = 1;
        ch2[0] = ch1[pos];
        for(int i=(pos+1); i<len; i++){
            char x = ch1[i];
            if(x==char(32)){
                //Checking whether character at ch2[i]==' '
                if(ch2[cons_arr-1] == ' '){
                    continue;
                }
                else{
                    ch2[cons_arr] = ' ';
                    cons_arr++;
                    continue;
                }
            }
            ch2[cons_arr] = x;
            cons_arr++;
        }
        //Printing the char array
        cout << "Simplified string: " << endl;
        for(int i=0; i<cons_arr; i++){
            cout << ch2[i];
        }
        cout << endl;
    }
    
  • 2021-02-05 12:22

    Since you use C++, you can take advantage of standard-library features designed for that sort of work. You could use std::string (instead of char[0x255]) and std::istringstream, which will replace most of the pointer arithmetic.

    First, make a string stream:

    std::istringstream stream(input);
    

    Then, read strings from it. It will remove the whitespace delimiters automatically:

    std::string word;
    while (stream >> word)
    {
        ...
    }
    

    Inside the loop, build your output string:

        if (!output.empty()) // special case: no space before first word
            output += ' ';
        output += word;
    

    A disadvantage of this method is that it allocates memory dynamically (including several reallocations, performed when the output string grows).
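    Put together, the fragments above could look like the following minimal sketch. The function name remove_extra_whitespace is an assumption made for illustration; the answer itself only shows the pieces.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper name; it assembles the fragments shown above.
std::string remove_extra_whitespace(const std::string& input)
{
    std::istringstream stream(input);
    std::string output;
    std::string word;
    // operator>> skips any run of whitespace before each word
    while (stream >> word)
    {
        if (!output.empty()) // special case: no space before first word
            output += ' ';
        output += word;
    }
    return output;
}
```

    Note that this approach also drops leading and trailing whitespace and treats tabs and newlines as separators, which may or may not be what you want.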

  • 2021-02-05 12:24

    There are plenty of ways of doing this (e.g., using regular expressions), but one way you could do this is using std::copy_if with a stateful functor remembering whether the last character was a space:

    #include <algorithm>
    #include <iterator> // std::back_inserter
    #include <string>
    #include <iostream>
    
    struct if_not_prev_space
    {
        // Is last encountered character space.
        bool m_is = false;
    
        bool operator()(const char c)
        {
            // Copy if last was not space, or current is not space.
            const bool ret = !m_is || c != ' ';
            m_is = c == ' ';
            return ret;
        }
    };
    
    
    int main()
    {
        const std::string s("abc  sssd g g sdg    gg  gf into abc sssd g g sdg gg gf");
        std::string o;
        std::copy_if(std::begin(s), std::end(s), std::back_inserter(o), if_not_prev_space());
        std::cout << o << std::endl;
    }
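    The regular-expression route mentioned at the top of this answer could be sketched roughly as follows. The helper name collapse_spaces_regex is hypothetical, and the pattern only targets the plain space character.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: collapse every run of two or more spaces into one.
std::string collapse_spaces_regex(const std::string& s)
{
    // Built once; constructing a std::regex is comparatively expensive
    static const std::regex multiple_spaces(" {2,}");
    return std::regex_replace(s, multiple_spaces, " ");
}
```

    Unlike the stream-based approach, this keeps a single leading or trailing space if the input has one.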
    
  • 2021-02-05 12:25

    You can use std::unique, which reduces adjacent duplicates to a single instance according to how you define what makes two elements equal.

    Here I have defined elements as equal if they are both whitespace characters:

    #include <algorithm> // std::unique
    #include <cctype>    // std::isspace
    #include <string>

    inline std::string& remove_extra_ws_mute(std::string& s)
    {
        s.erase(std::unique(std::begin(s), std::end(s), [](unsigned char a, unsigned char b){
            return std::isspace(a) && std::isspace(b);
        }), std::end(s));
    
        return s;
    }
    
    inline std::string remove_extra_ws_copy(std::string s)
    {
        return remove_extra_ws_mute(s);
    }
    

    std::unique shifts the elements that are kept to the front of the string and returns an iterator to the new logical end. The characters from that iterator onward are left in an unspecified state, so they are erased.
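    To make the role of the returned iterator more visible, here is a small sketch that separates the two steps. The helper name collapse_whitespace_runs and its return value are assumptions added for illustration only.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper for illustration: collapses whitespace runs in place
// and reports how many characters were dropped.
std::size_t collapse_whitespace_runs(std::string& s)
{
    // std::unique keeps the first element of each run matched by the predicate
    auto new_end = std::unique(s.begin(), s.end(),
        [](unsigned char a, unsigned char b) {
            return std::isspace(a) && std::isspace(b);
        });
    // [new_end, s.end()) now holds leftover characters with unspecified values
    const std::size_t removed = static_cast<std::size_t>(s.end() - new_end);
    s.erase(new_end, s.end());
    return removed;
}
```

    Because the first character of each run is the one that survives, a run that starts with a tab collapses to a tab rather than to a space.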

    Additionally, if you must work with low level strings then you can still use std::unique on the pointers:

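    // Needs <algorithm>, <cctype> and <cstring>.
    // Returns a buffer allocated with new[]; the caller must delete[] it.
    char* remove_extra_ws(char const* s)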
    {
        std::size_t len = std::strlen(s);
    
        char* buf = new char[len + 1];
        std::strcpy(buf, s);
    
        // Note that std::unique will also retain the null terminator
        // in its correct position at the end of the valid portion
        // of the string    
        std::unique(buf, buf + len + 1, [](unsigned char a, unsigned char b){
            return (a && std::isspace(a)) && (b && std::isspace(b));
        });
    
        return buf;
    }
    
  • 2021-02-05 12:28

    I ended up here because of a slightly different problem. Since I don't know where else to put it, and I found out what was wrong, I'll share it here. Don't be cross with me, please.

    I had some strings that would print additional spaces at their ends, while showing up without spaces in the debugger. The strings were formed in Windows calls like VerQueryValue(), which, among other things, outputs a string length, e.g. iProductNameLen. The following line, which converts the result to a string named strProductName,

        strProductName = string((LPCSTR)pvProductName, iProductNameLen);
    

    then produced a string with a \0 byte at the end, which did not show up easily in the debugger but printed on screen as a space. I'll leave the solution as an exercise, since it is not hard at all once you are aware of this.
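    For readers who want a starting point, one possible approach is to drop trailing \0 bytes after constructing the string. This is only a sketch under the assumption that the reported length includes the terminator; the helper name make_string_without_trailing_nul is hypothetical, and the sketch deliberately avoids any Windows API so that it compiles anywhere.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: build a std::string from a buffer whose reported
// length may include terminating '\0' bytes, then drop those bytes.
std::string make_string_without_trailing_nul(const char* data, std::size_t len)
{
    std::string result(data, len); // copies exactly len bytes, '\0' included
    while (!result.empty() && result.back() == '\0')
        result.pop_back();
    return result;
}
```

    Whether the length should instead be corrected at the call site depends on what the API in question actually reports.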

  • 2021-02-05 12:30

    I have the sinking feeling that good ol' scanf will do (in fact, this is the C school equivalent to Anatoly's C++ solution):

    #include <cstdio>  // sscanf
    #include <cstring> // strlen

    void remove_extra_whitespaces(const char* input, char* output)
    {
        int srcOffs = 0, destOffs = 0, numRead = 0;
    
        while(sscanf(input + srcOffs, "%s%n", output + destOffs, &numRead) > 0)
        {
            srcOffs += numRead;
            destOffs += strlen(output + destOffs);
            output[destOffs++] = ' '; // overwrite 0, advance past that
        }
        output[destOffs > 0 ? destOffs-1 : 0] = '\0';
    }
    

    We exploit the fact that scanf has built-in whitespace-skipping capabilities. We then use the perhaps lesser-known %n "conversion" specification, which gives us the number of characters consumed by scanf so far. This feature frequently comes in handy when reading from strings, like here. The drawback that makes this solution less than perfect is the strlen call on the output (there is no "how many bytes have I actually just written" conversion specifier, unfortunately).

    Last but not least, using scanf is easy here because sufficient memory is guaranteed to exist at output; if that were not the case, the code would become more complex due to buffering and overflow handling.
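    As a hedged illustration of the buffer-size point, the function can be wrapped so that the output buffer is sized from the input. The wrapper name collapse_with_sscanf is an assumption; the buffer size relies on the observation that the words plus one separator each never need more than the input length plus one byte.

```cpp
#include <cstdio>
#include <cstring>
#include <string>
#include <vector>

// Function from the answer above, repeated so the sketch compiles on its own.
void remove_extra_whitespaces(const char* input, char* output)
{
    int srcOffs = 0, destOffs = 0, numRead = 0;

    while (std::sscanf(input + srcOffs, "%s%n", output + destOffs, &numRead) > 0)
    {
        srcOffs += numRead;
        destOffs += static_cast<int>(std::strlen(output + destOffs));
        output[destOffs++] = ' '; // overwrite 0, advance past that
    }
    output[destOffs > 0 ? destOffs - 1 : 0] = '\0';
}

// Hypothetical wrapper: allocates an output buffer that is large enough.
std::string collapse_with_sscanf(const std::string& input)
{
    // Words plus one separator per word fit into input.size() + 1 bytes
    std::vector<char> buffer(input.size() + 1);
    remove_extra_whitespaces(input.c_str(), buffer.data());
    return std::string(buffer.data());
}
```

    With a fixed-size destination instead, a field width in the format string (for example %254s) would be needed to avoid overflow.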
