Why string.IsNullOrWhiteSpace("\0") is false

不思量自难忘° 2021-01-18 13:17

I ran into a problem where the invisible character \0, which looks a lot like white space, is not considered white space by the string.IsNullOrWhiteSpace method.

6 Answers
  • 2021-01-18 13:38

    The '\0' character is not considered white space. See Char.IsWhiteSpace() for the list of characters that are considered white space.

    Use Enumerable.All() if you have your own requirements, or even just to add a few chars of your own. Something like this:

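    using System.Linq;

    // Returns true when every character is white space or one of the extras.
    // Note: throws on a null input, and returns true for an empty string.
    static bool IsMyKindOfWhiteSpace(string input)
    {
        // '\0' is used here because it is the character in question; add your own.
        char[] more = { '\0' };

        return input.All(x => char.IsWhiteSpace(x) || more.Contains(x));
    }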
    
  • 2021-01-18 13:38

    A null string is NOT the same as an empty string or white space.

  • 2021-01-18 13:40

    U+0000 isn't whitespace, basically. char.IsWhiteSpace('\0') returns false, it's not listed as whitespace...

    The null part of IsNullOrWhiteSpace refers to the string reference itself - not the contents, if that's what you were thinking of.

    Note that strings in .NET aren't logically "null-terminated" within managed code, although in practice at the CLR level they are, for interop purposes. (The string knows its own length, but in order to make it easier to work with native code which does expect a null terminator, the CLR ensures that there's always a U+0000 after the content of the string.) If you end up with a string containing \0 you should probably fix whatever produced it to start with.
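
    As a quick check of the points above, here is a minimal sketch. The class and method names are made up for illustration; only the char and string calls are real .NET APIs.

```csharp
using System;

// Hypothetical demo class; the names are illustrative only.
public static class NullCharDemo
{
    // U+0000 is not classified as white space.
    public static bool NulIsWhiteSpace() => char.IsWhiteSpace('\0');

    // So a string holding only U+0000 is neither null, empty, nor white space.
    public static bool NulStringIsNullOrWhiteSpace() => string.IsNullOrWhiteSpace("\0");

    // The "null" in the method name is about the reference itself.
    public static bool NullReferenceIsNullOrWhiteSpace() => string.IsNullOrWhiteSpace(null);

    // A managed string tracks its own length, so an embedded U+0000
    // is just another character and does not end the string.
    public static int LengthWithEmbeddedNul() => "a\0b".Length;
}
```

    The first two methods return false, the third returns true, and the last returns 3.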

  • 2021-01-18 13:42

    For fun historical reasons (they are surely fun, but I wasn't able to find them), null has two meanings... The null pointer/reference (called NULL in C), and the NUL (or NULL) \0 character.

    String.IsNullOrWhiteSpace does:

    Indicates whether a specified string is null, empty, or consists only of white-space characters.

    with null meaning the "null reference", empty meaning empty and white-space meaning

    White-space characters are defined by the Unicode standard. The IsNullOrWhiteSpace method interprets any character that returns a value of true when it is passed to the Char.IsWhiteSpace method as a white-space character.

    The full list of characters that Char.IsWhiteSpace treats as white space is given on the documentation page for Char.IsWhiteSpace.
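
    If you would rather probe individual characters than read the list, a small helper along these lines can do it. This is only a sketch: WhiteSpaceProbe and Describe are hypothetical names, and the Char.IsControl check is an addition of mine to show which category U+0000 does fall into.

```csharp
using System;

// Hypothetical helper; the names are illustrative only.
public static class WhiteSpaceProbe
{
    // Reports the code point and how .NET classifies the character.
    public static string Describe(char c)
    {
        return $"U+{(int)c:X4} whitespace={char.IsWhiteSpace(c)} control={char.IsControl(c)}";
    }
}
```

    Describe('\0') gives "U+0000 whitespace=False control=True", while Describe(' ') gives "U+0020 whitespace=True control=False". In other words, \0 is a control character rather than white space.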

  • 2021-01-18 13:45

    Create an extension method which adds the null char as a check.

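    public static class StringExtensions
    {
        // Extension methods must be static and declared in a static class.
        public static bool IsNullOrWhitespaceOrHasNullChar(this string text)
        {
            return string.IsNullOrWhiteSpace(text) || text.IndexOf('\0') >= 0;
        }
    }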
    

    Note that if the null char exists anywhere in the string, it will be found and reported as such, so the string "a\0" would return true. If that is a concern, create a test which checks for a full string of \0.
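
    One way to write that stricter test is sketched below. The class and method names are hypothetical, and it assumes you want a mix of white space and \0 to count as blank as well.

```csharp
using System.Linq;

// Hypothetical extension class; the names are illustrative only.
public static class NullCharStringExtensions
{
    // True only when the string is null, empty, or made up entirely of
    // white-space and/or U+0000 characters.
    public static bool IsNullOrWhiteSpaceOrNullChars(this string text)
    {
        return text == null || text.All(c => c == '\0' || char.IsWhiteSpace(c));
    }
}
```

    With this version "\0 \0" returns true, but "a\0" returns false because 'a' is neither white space nor \0.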

  • 2021-01-18 13:47

    You could replace all \0 characters with the space character, and then check for whitespace.

    string.IsNullOrWhiteSpace("\0".Replace('\0', ' '));
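
    With a variable instead of a literal, calling Replace on a null reference throws a NullReferenceException. A null-safe form of the same idea might look like this (ReplaceBeforeCheck and IsBlankAfterReplacingNul are hypothetical names):

```csharp
// Hypothetical helper; the names are illustrative only.
public static class ReplaceBeforeCheck
{
    // The ?. operator skips Replace when text is null, and
    // IsNullOrWhiteSpace(null) then returns true.
    public static bool IsBlankAfterReplacingNul(string text)
    {
        return string.IsNullOrWhiteSpace(text?.Replace('\0', ' '));
    }
}
```

    This allocates a new string whenever a \0 is replaced, so for long inputs a per-character check may be preferable.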
    