Splitting a string at Space, except inside quotation marks

Submitted by 瘦欲@ on 2021-02-16 04:34:38

Question


I was wondering if there is any way I could easily split a string at spaces, except when the space is inside quotation marks?

For example, changing

Foo bar random "letters lol" stuff

into

Foo, bar, random, "letters lol", stuff


Answer 1:


Treat the string as a single record in comma-separated values (CSV) format, RFC 4180, whose separator outside quote pairs is a space instead of a comma. The encoding/csv reader can then split it, and it strips the enclosing quotation marks from quoted fields. For example,

package main

import (
    "encoding/csv"
    "fmt"
    "strings"
)

func main() {
    s := `Foo bar random "letters lol" stuff`
    fmt.Printf("String:\n%q\n", s)

    // Split string
    r := csv.NewReader(strings.NewReader(s))
    r.Comma = ' ' // space
    fields, err := r.Read()
    if err != nil {
        fmt.Println(err)
        return
    }

    fmt.Printf("\nFields:\n")
    for _, field := range fields {
        fmt.Printf("%q\n", field)
    }
}

Playground: https://play.golang.org/p/Ed4IV97L7H

Output:

String:
"Foo bar random \"letters lol\" stuff"

Fields:
"Foo"
"bar"
"random"
"letters lol"
"stuff"



Answer 2:


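  1. Using strings.FieldsFunc with a closure that tracks whether the current rune is inside quotes (the FieldsFunc documentation makes no guarantee about the order in which it calls the function, so this relies on the current implementation):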
package main

import (
    "fmt"
    "strings"
)

func main() {
    s := `Foo bar random "letters lol" stuff`
    quoted := false
    a := strings.FieldsFunc(s, func(r rune) bool {
        if r == '"' {
            quoted = !quoted
        }
        return !quoted && r == ' '
    })

    out := strings.Join(a, ", ")
    fmt.Println(out) // Foo, bar, random, "letters lol", stuff
}

  2. Using a strings.Builder while ranging over the string, which lets you choose whether to keep the " characters:
package main

import (
    "fmt"
    "strings"
)

func main() {
    s := `Foo bar random "letters lol" stuff`
    a := []string{}
    sb := &strings.Builder{}
    quoted := false
    for _, r := range s {
        if r == '"' {
            quoted = !quoted
            sb.WriteRune(r) // keeps the '"'; comment this line out to drop it
        } else if !quoted && r == ' ' {
            a = append(a, sb.String())
            sb.Reset()
        } else {
            sb.WriteRune(r)
        }
    }
    if sb.Len() > 0 {
        a = append(a, sb.String())
    }

    out := strings.Join(a, ", ")
    fmt.Println(out) // Foo, bar, random, "letters lol", stuff
    // without keeping '"': Foo, bar, random, letters lol, stuff
}


  3. Using scanner.Scanner from text/scanner, which keeps the " characters in the token text:
package main

import (
    "fmt"
    "strings"
    "text/scanner"
)

func main() {
    var s scanner.Scanner
    s.Init(strings.NewReader(`Foo bar random "letters lol" stuff`))
    slice := make([]string, 0, 5)
    tok := s.Scan()
    for tok != scanner.EOF {
        slice = append(slice, s.TokenText())
        tok = s.Scan()
    }
    out := strings.Join(slice, ", ")
    fmt.Println(out) // Foo, bar, random, "letters lol", stuff
}

  4. Using csv.NewReader, which removes the " characters itself:
package main

import (
    "encoding/csv"
    "fmt"
    "log"
    "strings"
)

func main() {
    s := `Foo bar random "letters lol" stuff`
    r := csv.NewReader(strings.NewReader(s))
    r.Comma = ' '
    record, err := r.Read()
    if err != nil {
        log.Fatal(err)
    }

    out := strings.Join(record, ", ")
    fmt.Println(out) // Foo, bar, random, letters lol, stuff
}

  5. Using regexp:
package main

import (
    "fmt"
    "regexp"
    "strings"
)

func main() {
    s := `Foo bar random "letters lol" stuff`

    r := regexp.MustCompile(`[^\s"]+|"([^"]*)"`)
    a := r.FindAllString(s, -1)

    out := strings.Join(a, ", ")
    fmt.Println(out) // Foo, bar, random, "letters lol", stuff
}



Answer 3:


You could use a regular expression.

The following handles multiple words inside quotes and multiple quoted entries in the input, for both double and single quotes:

package main

import (
    "fmt"
    "regexp"
)

func main() {
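    s := `Foo bar random "letters lol" stuff "also will" work on "multiple quoted stuff"`
    // Unquoted run, double-quoted run, or single-quoted run
    // (the closing ' was missing from the last alternative).
    r := regexp.MustCompile(`[^\s"']+|"([^"]*)"|'([^']*)'`)
    arr := r.FindAllString(s, -1)
    fmt.Println("your array: ", arr)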
}

Output will be:

your array:  [Foo bar random "letters lol" stuff "also will" work on "multiple quoted stuff"]
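
Because fmt.Println separates slice elements with spaces, the printed form does not show where one element ends and the next begins. A small sketch, assuming you only want to inspect the result, is to format the slice with %q so each element is quoted individually. The helper name formatFields is hypothetical, and the pattern here closes the single-quote alternative, which is an assumption about the intended behavior.

```go
package main

import (
	"fmt"
	"regexp"
)

// Unquoted run, double-quoted run, or single-quoted run.
var fieldRe = regexp.MustCompile(`[^\s"']+|"([^"]*)"|'([^']*)'`)

// formatFields is a hypothetical helper that splits s and renders
// the result with %q so element boundaries are visible.
func formatFields(s string) string {
	return fmt.Sprintf("%q", fieldRe.FindAllString(s, -1))
}

func main() {
	fmt.Println(formatFields(`Foo "letters lol" stuff`))
	// ["Foo" "\"letters lol\"" "stuff"]
}
```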

If you want to learn more about regex, there is a great SO answer with handy resources at the end: Learning Regular Expressions

Hope this helps



Source: https://stackoverflow.com/questions/47489745/splitting-a-string-at-space-except-inside-quotation-marks
