Operator precedence in regular expressions

Submitted by 生来就可爱ヽ(ⅴ<●) on 2020-06-24 08:34:27

Question


What is the default operator precedence in Oracle's regular expressions when they don't contain parentheses?

For example, given

 H|ha+

would it be evaluated as H|h and then concatenated to a as in ((H|h)a), or would the H be alternated with ha as in (H|(ha))?

Also, when does the + kick in, etc.?


Answer 1:


Given the Oracle doc:

Table 4-2 lists the list of metacharacters supported for use in regular expressions passed to SQL regular expression functions and conditions. These metacharacters conform to the POSIX standard; any differences in behavior from the standard are noted in the "Description" column.

And taking a look at the | value in that table:

The expression a|b matches character a or character b.

Plus taking a look at the POSIX doc:

Operator precedence: the order of precedence of operators is as follows:

  1. Collation-related bracket symbols [==] [::] [..]

  2. Escaped characters \

  3. Character set (bracket expression) []

  4. Grouping ()

  5. Single-character-ERE duplication * + ? {m,n}

  6. Concatenation

  7. Anchoring ^$

  8. Alternation |

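I would say that H|ha+ is grouped as H|(ha+): alternation has the lowest precedence, so H is alternated with the whole of ha+. In engines that support non-capturing groups this could be written (?:H|ha+), but that syntax is not part of POSIX ERE.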
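One way to check this reading is to compare the unparenthesized pattern with both candidate groupings from the question. The sketch below uses Python's re module, which is an assumption on my part: it is not Oracle's engine and is not strictly POSIX ERE, but it applies the same relative precedence to +, concatenation and |. The sample strings and the helper name accepted are made up for the illustration.

```python
import re

# Assumption: Python's re stands in for Oracle's engine here; the two share
# the relative precedence of +, concatenation and | for this pattern.
unparenthesized = re.compile(r"H|ha+")
alternation_last = re.compile(r"H|(ha+)")    # H alternated with ha+
alternation_first = re.compile(r"(H|h)a+")   # (H or h) followed by a+

# Hypothetical sample inputs chosen to tell the two groupings apart.
samples = ["H", "ha", "haaa", "Ha", "h"]


def accepted(pattern):
    """Return the samples that the pattern matches in full."""
    return [s for s in samples if pattern.fullmatch(s)]


print(accepted(unparenthesized))
print(accepted(alternation_last))
print(accepted(alternation_first))
```

The first two patterns accept the same strings, while (H|h)a+ rejects a bare H and accepts Ha, so the unparenthesized form behaves like H|(ha+).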




Answer 2:


Using capturing groups to demonstrate the order of evaluation, the regex H|ha+ is equivalent to the following:

(H|(h(a+)))

This is because the precedence rules (listed in the table below) are applied in order, from the highest precedence (lowest number) to the lowest precedence (highest number):

  • Rule 5 → (a+) The + is grouped with the a because this operator works on the preceding single character, back-reference, group (a "marked sub-expression" in Oracle parlance), or bracket expression (character class).

  • Rule 6 → (h(a+)) The h is then concatenated with the group in the preceding step.

  • Rule 8 → (H|(h(a+))) The H is then alternated with the group in the preceding step.
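The three steps above can be observed by matching with the fully parenthesized form and inspecting what each group captures. This is a minimal sketch in Python's re module, used here as a stand-in under the assumption that it groups these operators the same way as Oracle's POSIX-style engine; the helper name groups_for and the test strings are illustrative only.

```python
import re

# Fully parenthesized equivalent of H|ha+ from the answer.
# Group 1 is the whole alternation, group 2 is h(a+), group 3 is a+.
grouped = re.compile(r"(H|(h(a+)))")


def groups_for(text):
    """Return the captured groups if the whole text matches, else None."""
    m = grouped.fullmatch(text)
    return m.groups() if m else None


print(groups_for("haaa"))  # ('haaa', 'haaa', 'aaa')
print(groups_for("H"))     # ('H', None, None)

# The + repeats only the a, not the sequence ha, so the match on
# "hahaha" stops after the first "ha".
first_match = re.match(r"H|ha+", "hahaha").group(0)
print(first_match)         # ha
```

To repeat the whole sequence ha, explicit grouping is needed, as in (ha)+.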



Precedence table from section 9.4.8 of the POSIX docs for regular expressions (there doesn't seem to be an official Oracle table):

+---+----------------------------------------------------------+
|   |             ERE Precedence (from high to low)            |
+---+----------------------------------------------------------+
| 1 | Collation-related bracket symbols | [==] [::] [..]       |
| 2 | Escaped characters                | \<special character> |
| 3 | Bracket expression                | []                   |
| 4 | Grouping                          | ()                   |
| 5 | Single-character-ERE duplication  | * + ? {m,n}          |
| 6 | Concatenation                     |                      |
| 7 | Anchoring                         | ^ $                  |
| 8 | Alternation                       | |                    |
+---+-----------------------------------+----------------------+

The table above is for Extended Regular Expressions. For Basic Regular Expressions, see section 9.3.7 of the same POSIX docs.



Source: https://stackoverflow.com/questions/36870168/operator-precedence-in-regular-expressions
