How to remove redundant spaces/whitespace from a string in Golang?

借酒劲吻你 2021-02-02 12:44

I was wondering how to remove:

  • All leading/trailing whitespace or new-line characters, null characters, etc.
  • Any redundant spaces within the string
5 Answers
  • 2021-02-02 13:23

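    This avoids the overhead of regexp or an external library.
    I chose plain Go instead of regexp. The loop ranges over the string rune by rune, so non-ASCII characters from any language are preserved. Note that it only collapses runs of ASCII spaces into one; a single leading or trailing space is left in place.

    Go Golang!

    func RemoveDoubleWhiteSpace(str string) string {
        var b strings.Builder
        b.Grow(len(str))
        // Ranging over a string yields whole runes, so multi-byte
        // UTF-8 characters are written back intact.
        for i, r := range str {
            // Skip a space that is immediately followed by another space.
            if !(r == ' ' && i+1 < len(str) && str[i+1] == ' ') {
                b.WriteRune(r)
            }
        }
        return b.String()
    }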
    

    And the related test

    func TestRemoveDoubleWhiteSpace(t *testing.T) {
        data := []string{`  test`, `test  `, `te  st`}
        for _, item := range data {
            str := RemoveDoubleWhiteSpace(item)
            t.Log("Data ->|"+item+"|Found: |"+str+"| Len: ", len(str))
            if len(str) != 5 {
                t.Fail()
            }
        }
    }
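
    The question also asks about trimming and about characters other than the plain space. As a rough sketch of how the same hand-written approach could be extended, the hypothetical helper collapseWhitespace below (the name and the decision to treat NUL as whitespace are assumptions, not part of this answer) uses unicode.IsSpace to drop leading and trailing separators and reduce each interior run to one space:

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// collapseWhitespace is a hypothetical helper, not part of the answer above.
// It drops leading/trailing whitespace and NUL characters and replaces each
// interior run of them with a single ASCII space.
func collapseWhitespace(s string) string {
	var b strings.Builder
	b.Grow(len(s))
	pendingSpace := false // true when a separator follows already-written text
	for _, r := range s {
		if unicode.IsSpace(r) || r == 0 {
			// Remember the gap only if some text was already written,
			// so leading separators are dropped.
			pendingSpace = b.Len() > 0
			continue
		}
		if pendingSpace {
			b.WriteByte(' ')
			pendingSpace = false
		}
		b.WriteRune(r)
	}
	// A pending separator at the end is never written, which trims the tail.
	return b.String()
}

func main() {
	fmt.Printf("%q\n", collapseWhitespace("  héllo \t\n wörld\x00 "))
	// prints "héllo wörld"
}
```

    The separator is only written once the next non-space rune arrives, which is what makes trailing whitespace disappear without a second pass.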
    
  • 2021-02-02 13:25

    To match Unicode spaces you will probably want both the \s shorthand character class and the \p{Zs} Unicode property. The two steps cannot be done with a single regex replacement, because they need different replacement strings, and ReplaceAllStringFunc passes only the whole matched string to its callback (I don't know of a way to check which group matched).

    Thus, I suggest using two regexps:

    • ^[\s\p{Zs}]+|[\s\p{Zs}]+$ - to match all leading/trailing whitespace
    • [\s\p{Zs}]{2,} - to match 2 or more whitespace symbols inside a string

    Sample code:

    package main
    
    import (
        "fmt"
        "regexp"
    )
    
    func main() {
        input := "   Text   More here     "
        reLeadTrail := regexp.MustCompile(`^[\s\p{Zs}]+|[\s\p{Zs}]+$`)
        reInside := regexp.MustCompile(`[\s\p{Zs}]{2,}`)
        final := reLeadTrail.ReplaceAllString(input, "")
        final = reInside.ReplaceAllString(final, " ")
        fmt.Println(final)
    }
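
    To see why both classes appear in the patterns above, here is a small sketch that checks a few characters against each one. In Go's regexp syntax \s covers only the ASCII whitespace characters, while \p{Zs} covers Unicode space separators such as the no-break space. The helper name classify and the sample characters are illustrative choices, not part of the answer:

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	asciiSpace = regexp.MustCompile(`\s`)     // [\t\n\f\r ] in Go's RE2 syntax
	unicodeSep = regexp.MustCompile(`\p{Zs}`) // Unicode "Separator, space" category
)

// classify (hypothetical helper) reports whether r is matched by \s
// and whether it is matched by \p{Zs}.
func classify(r rune) (bySpace, byZs bool) {
	s := string(r)
	return asciiSpace.MatchString(s), unicodeSep.MatchString(s)
}

func main() {
	// space, tab, newline, no-break space, ideographic space
	for _, r := range []rune{' ', '\t', '\n', '\u00a0', '\u3000'} {
		bySpace, byZs := classify(r)
		fmt.Printf("%U  \\s=%t  \\p{Zs}=%t\n", r, bySpace, byZs)
	}
}
```

    Tabs and newlines are matched only by \s, and the no-break and ideographic spaces only by \p{Zs}, so neither class alone would cover all of them.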
    
  • 2021-02-02 13:27

    You can get quite far just using the strings package as strings.Fields does most of the work for you:

    package main
    
    import (
        "fmt"
        "strings"
    )
    
    func standardizeSpaces(s string) string {
        return strings.Join(strings.Fields(s), " ")
    }
    
    func main() {
        tests := []string{" Hello,   World  ! ", "Hello,\tWorld ! ", " \t\n\t Hello,\tWorld\n!\n\t"}
        for _, test := range tests {
            fmt.Println(standardizeSpaces(test))
        }
    }
    // "Hello, World !"
    // "Hello, World !"
    // "Hello, World !"
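
    strings.Fields splits on the characters that unicode.IsSpace reports as whitespace, which does not include the null character mentioned in the question. If those need to go as well, one possible variant uses strings.FieldsFunc with a custom predicate. The names isSeparator and standardizeSpacesAndNulls below are made up for this sketch:

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// isSeparator treats NUL as a separator in addition to Unicode whitespace.
// This is an assumption about what "null characters" should mean here.
func isSeparator(r rune) bool {
	return r == 0 || unicode.IsSpace(r)
}

// standardizeSpacesAndNulls is a hypothetical variant of standardizeSpaces
// that also strips NUL characters.
func standardizeSpacesAndNulls(s string) string {
	return strings.Join(strings.FieldsFunc(s, isSeparator), " ")
}

func main() {
	fmt.Printf("%q\n", standardizeSpacesAndNulls("\x00 Hello,\x00\tWorld !\n\x00"))
	// prints "Hello, World !"
}
```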
    
  • 2021-02-02 13:27

    Use regexp for this.

    package main

    import (
        "bytes"
        "fmt"
        "regexp"
    )

    func main() {
        data := []byte("   Hello,   World !   ")
        // Match runs of two or more spaces.
        re := regexp.MustCompile("  +")
        replaced := re.ReplaceAll(bytes.TrimSpace(data), []byte(" "))
        fmt.Println(string(replaced))
        // Hello, World !
    }
    

    bytes.TrimSpace already removes leading and trailing newlines along with other whitespace. To also trim null characters, use bytes.Trim(s []byte, cutset string) instead, with a cutset that lists every character to strip.
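
    A minimal sketch of the bytes.Trim variant, assuming a cutset of space, tab, carriage return, newline and NUL. The constant trimCutset and the helper cleanBytes are names invented for this example; adjust the cutset to whatever should be stripped:

```go
package main

import (
	"bytes"
	"fmt"
	"regexp"
)

// trimCutset is an assumed set of characters to strip from both ends:
// space, tab, carriage return, newline and NUL.
const trimCutset = " \t\r\n\x00"

// multiSpace matches runs of two or more spaces, as in the answer above.
var multiSpace = regexp.MustCompile("  +")

// cleanBytes (hypothetical helper) trims the cutset from both ends and
// collapses interior runs of spaces into one.
func cleanBytes(data []byte) []byte {
	trimmed := bytes.Trim(data, trimCutset)
	return multiSpace.ReplaceAll(trimmed, []byte(" "))
}

func main() {
	fmt.Printf("%q\n", cleanBytes([]byte("\x00\n  Hello,   World !  \n\x00")))
	// prints "Hello, World !"
}
```

    bytes.Trim only touches the two ends of the slice, so a NUL in the middle of the data would survive this version.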

  • 2021-02-02 13:48

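    strings.Fields() splits on any amount of white space, thus:

    strings.Join(strings.Fields(strings.TrimSpace(s)), " ")

    The strings.TrimSpace call is harmless but not required, since strings.Fields already ignores leading and trailing white space.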
    