Regex to remove single characters from string

既然无缘 2020-12-21 19:05

Consider the following strings

breaking out a of a simple prison
this is b moving up
following me is x times better

All strings are lowercase.

5 Answers
  • 2020-12-21 19:08

    You could try something like this:

    preg_replace('/\b\S\s\b/', "", $subject);
    

    This is what it means:

    \b    # Assert position at a word boundary
    \S    # Match a single character that is a “non-whitespace character”
    \s    # Match a single character that is a “whitespace character” (spaces, tabs, and line breaks)
    \b    # Assert position at a word boundary
    

    Update

    As raised by Radu, because I've used the \S this will match more than just a-zA-Z. It will also match 0-9_. Normally, it would match a lot more than that, but because it's preceded by \b, it can only match word characters.

    As Tim Pietzcker mentioned in the comments, this won't remove a single character that is followed by a non-word character, as in test a (hello). It also fails if there is more than one whitespace character after the single character, like this:

    test a  hello 
    

    but you could fix that by changing the expression to \b\S\s*\b
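
The difference between the two expressions can be checked with a small sketch. The helper names stripStrict and stripLoose are hypothetical and only wrap the two patterns quoted above; the inputs come from the question and from this answer.

```php
<?php
// Hypothetical helpers: each wraps one of the two patterns from the answer.
function stripStrict(string $s): string {
    // Exactly one whitespace character must follow the single character.
    return preg_replace('/\b\S\s\b/', '', $s);
}

function stripLoose(string $s): string {
    // Zero or more whitespace characters may follow the single character.
    return preg_replace('/\b\S\s*\b/', '', $s);
}

echo stripStrict('breaking out a of a simple prison'), PHP_EOL; // breaking out of simple prison
echo stripStrict('test a  hello'), PHP_EOL;                     // unchanged: two spaces follow "a"
echo stripLoose('test a  hello'), PHP_EOL;                      // test hello
```

With the strict pattern, the second \b cannot match between two spaces, which is why the double-space input is left alone.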

  • 2020-12-21 19:11

    Try this one:

    $sString = preg_replace("@\b[a-z]{1}\b@m", ' ', $sString);
    
  • 2020-12-21 19:20

    As a one-liner:

    $result = preg_replace('/\s\p{Ll}\b|\b\p{Ll}\s/u', '', $subject);
    

    This matches a single lowercase letter (\p{Ll}) which is preceded or followed by whitespace (\s), removing both. The word boundaries (\b) ensure that only single letters are indeed matched. The /u modifier makes the regex Unicode-aware.

    The result: A single letter surrounded by spaces on both sides is reduced to a single space. A single letter preceded by whitespace but not followed by whitespace is removed completely, as is a single letter only followed but not preceded by whitespace.

    So

    This a is my test sentence a. o How funny (what a coincidence a) this is!
    

    is changed to

    This is my test sentence. How funny (what coincidence) this is!
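
As a rough check of the behaviour described above, the one-liner can be wrapped in a function and run against the test sentence and one of the strings from the question. The function name removeSingleLetters is an assumption made for illustration; the pattern is the one from this answer.

```php
<?php
// Hypothetical wrapper around the one-liner from the answer.
function removeSingleLetters(string $subject): string {
    // Remove a lone lowercase letter together with one adjacent whitespace character.
    return preg_replace('/\s\p{Ll}\b|\b\p{Ll}\s/u', '', $subject);
}

$sentence = 'This a is my test sentence a. o How funny (what a coincidence a) this is!';

echo removeSingleLetters($sentence), PHP_EOL;
// This is my test sentence. How funny (what coincidence) this is!

echo removeSingleLetters('this is b moving up'), PHP_EOL;
// this is moving up
```

Only one of the two neighbouring whitespace characters is consumed per match, so a letter that sits between two spaces leaves exactly one space behind.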
    
  • 2020-12-21 19:24
    $str = "breaking out a of a simple prison
    this is b moving up
    following me is x times better";
    $res = preg_replace("@\\b[a-z]\\b ?@i", "", $str);
    echo $res;
    
  • 2020-12-21 19:33

    How about:

    preg_replace('/(^|\s)[a-z](\s|$)/', '$1', $string);
    

    Note this also catches single characters that are at the beginning or end of the string, but not single characters that are adjacent to punctuation (they must be surrounded by whitespace).

    If you also want to remove characters immediately before punctuation (e.g. 'the x.'), then this should work properly in most (English) cases:

    preg_replace('/(^|\s)[a-z]\b/', '$1', $string);
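
The two variants behave differently around punctuation and whitespace, which a short sketch makes visible. The function names below are hypothetical; the patterns and the '$1' replacement are taken from this answer.

```php
<?php
// Hypothetical helpers wrapping the two patterns from the answer.
function stripBetweenWhitespace(string $s): string {
    // The letter must be delimited by whitespace or the string boundaries.
    return preg_replace('/(^|\s)[a-z](\s|$)/', '$1', $s);
}

function stripBeforeBoundary(string $s): string {
    // The letter only needs a word boundary after it, so punctuation also qualifies.
    return preg_replace('/(^|\s)[a-z]\b/', '$1', $s);
}

echo stripBetweenWhitespace('following me is x times better'), PHP_EOL; // following me is times better
echo stripBetweenWhitespace('x marks the spot'), PHP_EOL;               // marks the spot
echo stripBetweenWhitespace('the x.'), PHP_EOL;                         // the x.  (unchanged)
echo stripBeforeBoundary('the x.'), PHP_EOL;                            // the .
```

Note that the second pattern keeps the whitespace on both sides of a removed letter, so 'following me is x times better' ends up with two spaces between 'is' and 'times'; collapsing repeated spaces afterwards would be a separate step.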
    