Matching end of line position using m flag with different line ending styles

旧时难觅i 2021-01-22 10:58

I'm trying to wrap each line that starts with "## " with tags. I'm trying to achieve a GitHub/Stack Overflow-like syntax for text formatting.

This is what I got:



        
1 answer
  •  不思量自难忘°
    2021-01-22 11:31

    It seems that with your current PCRE settings, a dot matches every character other than LF (\n, line feed). It therefore also matches CR (\r, carriage return), which is a line break character too, so on text with \r\n line endings the captured group ends with a stray \r.

    PCRE supports overriding the default newline convention (and therefore the behavior of the $ anchor). To make the . match all characters except CR and LF, add the corresponding verb at the very start of the pattern:

    '/(*ANYCRLF)^## (.*)$/m'
      ^^^^^^^^^^
    

    With this verb, $ asserts the end of the line before \r\n.
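As a minimal sketch of the difference, the PHP snippet below runs the pattern from this answer against text with \r\n line endings. The sample text and the `<h2>` wrapper are assumptions for illustration (the tags in the question were not preserved), and `(*LF)` is written out explicitly to stand in for the LF-only convention that the question's behavior suggests.

```php
<?php
// Hypothetical sample input with Windows-style (CRLF) line endings.
$text = "## First heading\r\nplain line\r\n## Second heading\r\n";

// LF-only convention: the dot is allowed to match "\r",
// so each captured heading carries a trailing carriage return.
preg_match_all('/(*LF)^## (.*)$/m', $text, $lfOnly);

// (*ANYCRLF): CR, LF and CRLF all count as line breaks,
// so the dot stops before "\r" and $ asserts before "\r\n".
preg_match_all('/(*ANYCRLF)^## (.*)$/m', $text, $anyCrlf);

// Wrapping each heading line; <h2> is an assumed tag, not taken from the question.
$wrapped = preg_replace('/(*ANYCRLF)^## (.*)$/m', '<h2>$1</h2>', $text);

echo json_encode($lfOnly[1]), "\n";   // ["First heading\r","Second heading\r"]
echo json_encode($anyCrlf[1]), "\n";  // ["First heading","Second heading"]
echo json_encode($wrapped), "\n";
```

Because the line endings themselves are never part of the match, the replacement leaves every \r\n in place and only the heading text is wrapped.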

    See more about this and other verbs at rexegg.com:

    By default, when PCRE is compiled, you tell it what to consider to be a line break when encountering a . (as the dot doesn't match line breaks unless in dotall mode), as well as the ^ and $ anchors' behavior in multiline mode. You can override this default with the following modifiers:

    (*CR) Only a carriage return is considered to be a line break
    (*LF) Only a line feed is considered to be a line break (as on Unix)
    (*CRLF) Only a carriage return followed by a line feed is considered to be a line break (as on Windows)
    (*ANYCRLF) Any of the above three is considered to be a line break
    (*ANY) Any Unicode newline sequence is considered to be a line break

    For instance, (*CR)\w+.\w+ matches Line1\nLine2 because the dot is able to match the \n, which is not considered to be a line break under (*CR).
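The (*CR) example can be checked with a short PHP sketch. The variable names are illustrative, and the second call uses `(*LF)` only to provide a contrasting result on the same subject string.

```php
<?php
$subject = "Line1\nLine2";

// Under (*CR), only "\r" is a line break, so the dot may match "\n"
// and the whole string is matched.
preg_match('/(*CR)\w+.\w+/', $subject, $cr);

// Under (*LF), the dot cannot match "\n", so the engine backtracks
// and the match stays inside the first line.
preg_match('/(*LF)\w+.\w+/', $subject, $lf);

echo json_encode($cr[0]), "\n"; // "Line1\nLine2"
echo json_encode($lf[0]), "\n"; // "Line1"
```

The same verb placement rule applies here: the verb has to be the first thing in the pattern, before any other token.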
