Splitting a string at all whitespace

刺人心 2021-01-17 11:06

I need to split a string at all whitespace; the result should contain ONLY the words themselves.

How can I do this in VB.NET?

Tabs, newlines, etc. must all be treated as separators.

7 answers
  • 2021-01-17 11:08

    If you want to avoid regex, you can do it like this:

    "Lorem ipsum dolor sit amet, consectetur adipiscing elit"
        .Split()
        .Where(x => x != string.Empty)
    

    Visual Basic equivalent:

    "Lorem ipsum dolor sit amet, consectetur adipiscing elit" _
        .Split() _
        .Where(Function(X$) X <> String.Empty)
    

    The Where() call is important: if the string has several whitespace characters next to each other, Split() produces empty strings between them, and Where() filters those out.

    At the time of writing, the currently accepted answer (https://stackoverflow.com/a/1563000/49241) does not take this into account.
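
    As an alternative sketch of the same idea, String.Split also has an overload that takes a StringSplitOptions value, which drops the empty entries without a separate Where() call. Passing Nothing as the separator array makes Split fall back to whitespace. The module and function names here (SplitDemo, SplitWords) are made up for illustration and do not come from the answers.

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module SplitDemo
    ' Hypothetical helper: returns only the words of the input.
    Function SplitWords(text As String) As String()
        ' A Nothing separator array tells Split to use whitespace as the delimiter;
        ' the cast selects the Char() overload unambiguously.
        Return text.Split(DirectCast(Nothing, Char()), StringSplitOptions.RemoveEmptyEntries)
    End Function

    Sub Main()
        Dim words = SplitWords("Lorem   ipsum" & vbTab & "dolor" & vbCrLf & "sit amet")
        Console.WriteLine(String.Join("|", words))   ' prints Lorem|ipsum|dolor|sit|amet
    End Sub
End Module
```

    Unlike the Where() version, this returns a String() directly, so no further conversion is needed before assigning it to an array variable.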

  • 2021-01-17 11:11

    I found I used the solution as noted by Adam Ralph, plus the VB.NET comment below by P57, but with one odd exception. I found I had to add .ToList.ToArray on the end.

    Like so:

    .Split().Where(Function(x) x <> String.Empty).ToList.ToArray
    

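    Without that, I kept getting "Unable to cast object of type 'WhereArrayIterator`1[System.String]' to type 'System.String[]'." The reason is that Where() returns a lazily evaluated IEnumerable(Of String), not an array, so the result has to be materialized before it can be assigned to a String(). Calling .ToArray() on its own is also sufficient; the intermediate .ToList is not required.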

  • 2021-01-17 11:19
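    Dim words As String = "This is a list of words, with: a bit of punctuation" & _
                              vbTab & "and a tab character." & vbNewLine
    ' vbNewLine is a two-character string (CR + LF); CChar(vbNewLine) would keep only the CR,
    ' so both characters are listed explicitly here.
    Dim separators As Char() = {" "c, ControlChars.Tab, ControlChars.Cr, ControlChars.Lf}
    Dim split As String() = words.Split(separators, StringSplitOptions.RemoveEmptyEntries)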
    
  • 2021-01-17 11:24

    String.Split() splits on every individual whitespace character, so a run of consecutive whitespace usually leaves empty strings in the result. The Regex solution Ruben Farias has given is the correct way to do it. I have upvoted his answer, but I want to add a small note dissecting the regex:

    \s is a character class that matches all whitespace characters.

    In order to split the string correctly when it contains multiple whitespace characters between words, we need to add a quantifier (or repetition operator) to the specification to match all whitespace between words. The correct quantifier to use in this case is +, meaning "one or more" occurrences of a given specification. While the syntax "\s+" is sufficient here, I prefer the more explicit "[\s]+".
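
    The effect of the quantifier can be seen in a small sketch like the following. The input string and the module name (RegexQuantifierDemo) are assumptions chosen for illustration only.

```vbnet
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module RegexQuantifierDemo
    Sub Main()
        ' Two spaces between the first two words, then tab + CR + LF before the third.
        Dim input As String = "one  two" & vbTab & vbCrLf & "three"

        ' Without a quantifier every whitespace character is its own separator,
        ' so adjacent separators leave empty strings between them.
        Dim perCharacter As String() = Regex.Split(input, "\s")

        ' With + a whole run of whitespace counts as a single separator.
        Dim perRun As String() = Regex.Split(input, "\s+")

        Console.WriteLine(perCharacter.Length)   ' 6: three words plus three empty strings
        Console.WriteLine(perRun.Length)         ' 3: just the words
    End Sub
End Module
```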

  • 2021-01-17 11:28

    String.Split() (no parameters) does split on all whitespace (including LF/CR).

  • 2021-01-17 11:28

    Try this:

    Regex.Split("your string here", "\s+")
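
    One caveat worth sketching: when the input starts or ends with whitespace, Regex.Split returns an empty string at that end of the array. A minimal way around this, assuming trimming the input is acceptable, is to call Trim() first. The helper name SplitOnWhitespace is hypothetical.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module RegexSplitDemo
    ' Hypothetical helper name, not taken from the answers.
    Function SplitOnWhitespace(text As String) As String()
        ' Trim first: Regex.Split yields an empty string at either end
        ' when the input begins or ends with a match.
        Dim trimmed As String = text.Trim()
        ' Splitting an empty string would return one empty entry, so handle it separately.
        If trimmed.Length = 0 Then Return New String() {}
        Return Regex.Split(trimmed, "\s+")
    End Function

    Sub Main()
        ' prints your|string|here
        Console.WriteLine(String.Join("|", SplitOnWhitespace("  your   string here  ")))
    End Sub
End Module
```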
    