Fastest way to remove chars from string

生来不讨喜 2020-12-05 13:52

I have a string from which I have to remove the following characters: '\r', '\n', and '\t'. I have tried three different ways of removing these characters and benchmarked them so I ...

7 Answers
  • 2020-12-05 14:29

    Here's the uber-fast unsafe version, version 2.

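        // Requires compiling with unsafe code enabled (/unsafe or AllowUnsafeBlocks).
        public static unsafe string StripTabsAndNewlines(string s)
        {
            int len = s.Length;
            // Scratch buffer on the stack; a very long input can overflow the stack.
            char* newChars = stackalloc char[len];
            char* currentChar = newChars;   // next write position

            for (int i = 0; i < len; ++i)
            {
                char c = s[i];
                switch (c)
                {
                    case '\r':
                    case '\n':
                    case '\t':
                        continue;           // skip the characters being stripped
                    default:
                        *currentChar++ = c; // keep everything else
                        break;
                }
            }
            // Build the result from the characters actually written.
            return new string(newChars, 0, (int)(currentChar - newChars));
        }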
    

    And here are the benchmarks (time in ms to strip 1,000,000 strings):

        cornerback84's String.Replace:         9433
        Andy West's String.Concat:             4756
        AviJ's char array:                     1374
        Matt Howells' char pointers:           1163
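
    Timings like the ones above can be reproduced with a small Stopwatch harness. The following is only a sketch: the class name StripBenchmark, the helper TimeMs, and the chained-Replace candidate are hypothetical and are not the code that produced the numbers in this answer.

    ```csharp
    using System;
    using System.Diagnostics;

    public static class StripBenchmark
    {
        // Hypothetical helper: times `iterations` calls of `strip` on `input`.
        public static long TimeMs(Func<string, string> strip, string input, int iterations)
        {
            strip(input); // warm-up call so JIT compilation is not measured

            Stopwatch sw = Stopwatch.StartNew();
            for (int i = 0; i < iterations; i++)
            {
                strip(input);
            }
            sw.Stop();
            return sw.ElapsedMilliseconds;
        }

        // Assumed candidate: one chained String.Replace call per character.
        public static string StripWithReplace(string s)
        {
            return s.Replace("\r", "").Replace("\n", "").Replace("\t", "");
        }
    }
    ```

    A call such as StripBenchmark.TimeMs(StripBenchmark.StripWithReplace, "a\tb\r\nc", 1000000) returns the elapsed milliseconds. Absolute numbers depend on the machine, the runtime, and the input strings, so only the relative ordering is meaningful.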
  • 2020-12-05 14:31

    Even faster:

        public static string RemoveMultipleWhiteSpaces(string s)
        {
            char[] sResultChars = new char[s.Length];
    
            bool isWhiteSpace = false;
            int sResultCharsIndex = 0;
    
            for (int i = 0; i < s.Length; i++)
            {
                if (s[i] == ' ')
                {
                    if (!isWhiteSpace)
                    {
                        sResultChars[sResultCharsIndex] = s[i];
                        sResultCharsIndex++;
                        isWhiteSpace = true;
                    }
                }
                else
                {
                    sResultChars[sResultCharsIndex] = s[i];
                    sResultCharsIndex++;
                    isWhiteSpace = false;
                }
            }
    
            return new string(sResultChars, 0, sResultCharsIndex);
        }
    
  • 2020-12-05 14:44

    Looping through the string and using (just one) StringBuilder (with the proper constructor argument, to avoid unnecessary memory allocations) to create a new string could be faster.
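
    A minimal sketch of that idea, assuming the "proper constructor argument" means the initial capacity; the class name StringBuilderStrip is made up for illustration:

    ```csharp
    using System.Text;

    public static class StringBuilderStrip
    {
        public static string StripTabsAndNewlines(string s)
        {
            // Capacity = s.Length: the result can never be longer than the input,
            // so the builder should not need to grow its internal buffer.
            StringBuilder sb = new StringBuilder(s.Length);
            foreach (char c in s)
            {
                if (c != '\r' && c != '\n' && c != '\t')
                {
                    sb.Append(c);
                }
            }
            return sb.ToString();
        }
    }
    ```

    Whether this actually beats the other approaches would have to be measured; it still appends one character at a time and copies the buffer in ToString().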

  • 2020-12-05 14:46

    I believe you'll get the best possible performance by composing the new string as a char array and converting it to a string only when you're done, like so:

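    // Copies every character except '\r', '\n' and '\t' into a buffer,
    // then builds the result string from the filled part of that buffer.
    static string StripTabsAndNewlines(string s)
    {
        int len = s.Length;
        char[] s2 = new char[len];   // worst case: nothing is removed
        int i2 = 0;                  // number of characters kept so far
        for (int i = 0; i < len; i++)
        {
            char c = s[i];
            if (c != '\r' && c != '\n' && c != '\t')
                s2[i2++] = c;
        }
        return new String(s2, 0, i2);
    }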
    

    EDIT: using String(s2, 0, i2) instead of Trim(), per suggestion

  • 2020-12-05 14:46
    String.Join(null, str.Split(new char[] { '\t', '\r', '\n' },
        StringSplitOptions.None));
    

    might give you a performance increase over using Aggregate() since Join() is designed for strings.

    EDIT:

    Actually, this might be even better:

    String.Concat(str.Split(new char[] { '\t', '\r', '\n' },
        StringSplitOptions.None));
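
    To make the mechanics explicit: Split cuts the string at every separator and drops the separators themselves, and Concat glues the remaining pieces back together. A sketch wrapped in a method (the names SplitConcatStrip and Separators are only illustrative):

    ```csharp
    using System;

    public static class SplitConcatStrip
    {
        // Allocated once instead of on every call.
        private static readonly char[] Separators = { '\t', '\r', '\n' };

        public static string StripTabsAndNewlines(string str)
        {
            // "a\tb\r\nc" splits into { "a", "b", "", "c" }; the empty entry
            // comes from the adjacent "\r\n" and contributes nothing to the result.
            string[] pieces = str.Split(Separators, StringSplitOptions.None);
            return string.Concat(pieces);
        }
    }
    ```

    This allocates an intermediate string for every piece, which is likely why the char-array versions come out ahead in the benchmarks posted in this thread.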
    
  • 2020-12-05 14:46

    Try this:

    // Requires: using System.Linq;
    string str = "something \tis \nbetter than nothing";
    string removeChars = new String(new Char[]{'\r', '\n', '\t'});   // characters to strip
    string newStr = new string(str.ToCharArray().Where(c => !removeChars.Contains(c)).ToArray());
    