In Bash regular expressions, do `^` and `$` refer to lines or to the entire string?

Submitted by 夙愿已清 on 2021-01-27 14:11:38

Question


In The Linux Documentation Project (I didn't find details about the regex metacharacters in the Bash manual), the metachars ^ and $ are defined as matching lines:

^: Matches the empty string at the beginning of a line [...]
$: Matches the empty string at the end of a line

However, when I test this, Bash does not behave that way:

$ string="a
> b
> c"

$ [[ $string =~ ^a ]] && echo BOS match
BOS match

$ [[ $string =~ ^b ]] && echo BOL match
# nothing

Are the manuals really wrong, or am I missing something?


Answer 1:


In a POSIX regex, ^ matches the start of the whole input string and $ matches the end of the whole input string (Bash's =~ operator uses POSIX ERE). The document you link to mentions lines because most text-processing tools, such as sed, grep, or awk, read their input line by line by default, so in the majority of cases the string being matched is exactly one line.
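If you do need "start of a line" semantics inside a multi-line Bash string, one possible workaround is to spell the line boundary out yourself as "start of string, or right after a newline". The sketch below assumes Bash (for `[[ =~ ]]` and `$'...'` quoting); `line_starts_with` is a hypothetical helper name, and it assumes the prefix contains no regex metacharacters.

```shell
#!/usr/bin/env bash
string=$'a\nb\nc'

# line_starts_with is a hypothetical helper, not a Bash builtin.
# Succeeds if any line of $1 begins with the literal text $2
# (assumes $2 contains no regex metacharacters).
line_starts_with() {
  local text=$1 prefix=$2
  # "(^|<newline>)" means: start of the string, or right after a newline.
  local re=$'(^|\n)'"$prefix"
  [[ $text =~ $re ]]
}

[[ $string =~ ^b ]] || echo "^b alone: no match, b is not at the start of the string"
line_starts_with "$string" b && echo "(^|newline)b: match at the start of a line"

# Line-oriented tools apply ^ to each line they read.
printf '%s\n' "$string" | grep -q '^b' && echo "grep '^b': match"
```

Keeping the pattern in a variable (`re`) avoids quoting surprises, since quoted parts of the right-hand side of `=~` are matched literally.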

See the POSIX regex documentation:

9.3.8 BRE Expression Anchoring

A BRE can be limited to matching strings that begin or end a line; this is called "anchoring". The circumflex and dollar sign special characters shall be considered BRE anchors in the following contexts:

  1. A circumflex ( '^' ) shall be an anchor when used as the first character of an entire BRE. The implementation may treat the circumflex as an anchor when used as the first character of a subexpression. The circumflex shall anchor the expression (or optionally subexpression) to the beginning of a string; only sequences starting at the first character of a string shall be matched by the BRE. For example, the BRE "^ab" matches "ab" in the string "abcdef", but fails to match in the string "cdefab". The BRE "\(^ab\)" may match the former string. A portable BRE shall escape a leading circumflex in a subexpression to match a literal circumflex.

  2. A dollar sign ( '$' ) shall be an anchor when used as the last character of an entire BRE. The implementation may treat a dollar sign as an anchor when used as the last character of a subexpression. The dollar sign shall anchor the expression (or optionally subexpression) to the end of the string being matched; the dollar sign can be said to match the end-of-string following the last character.

  3. A BRE anchored by both '^' and '$' shall match only an entire string. For example, the BRE "^abcdef$" matches strings consisting only of "abcdef".
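The third rule can be checked from Bash as well. The following is a minimal sketch, assuming Bash's `=~` operator; `is_exactly` is a hypothetical helper name introduced only for this illustration.

```shell
#!/usr/bin/env bash
# is_exactly is a hypothetical helper, not a Bash builtin.
# Succeeds only if the whole of $1 matches the ERE given in $2.
is_exactly() {
  local text=$1 re="^($2)\$"
  [[ $text =~ $re ]]
}

is_exactly "abcdef" "abcdef" && echo "whole-string match"
# $ anchors at the end of the string, not at the embedded newline.
is_exactly $'abcdef\nxyz' "abcdef" || echo "no match: the string continues after the newline"
```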



Source: https://stackoverflow.com/questions/59653774/in-bash-regular-expressions-do-and-refer-to-lines-or-to-the-entire-stri
