Question
My regex:
^\+?(-?)0*([[:digit:]]+,[[:digit:]]+?)0*$
It removes a leading + and the leading and trailing 0s from a decimal number.
I have tested it in regex101.
For input: +000099,8420000
and substitution \1\2
it returns 99,842
I want the same result in Oracle database 11g:
select REGEXP_REPLACE('+000099,8420000','^\+?(-?)0*([[:digit:]]+,[[:digit:]]+?)0*$','\1\2') from dual;
But it returns 99,8420000
(the trailing 0s are still present...)
What am I missing?
EDIT
It behaves like the greedy quantifier * at the end of the regex, not the lazy *?, but I definitely used a lazy one.
Answer 1:
The problem is well known to anyone who has worked with implementations of Henry Spencer's regex library: lazy and greedy quantifiers should not be mixed in one and the same branch, since that leads to undefined behavior. The TRE regex engine used in R shows the same behavior. While you may mix lazy and greedy quantifiers to some extent, you must always make sure you get a consistent result.
The solution is to only use lazy quantifiers inside the capturing group:
select REGEXP_REPLACE('+000099,8420000', '^\+?(-?)0*([0-9]+?,[0-9]+?)0*$','\1\2') as Result from dual
See the online demo
The [0-9]+?,[0-9]+? part matches 1 or more digits, as few as possible, followed by a comma and then 1 or more digits, again as few as possible.
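For comparison, here is a minimal sketch of how a Perl-style backtracking engine handles both patterns. It assumes Python's re module as a stand-in for the kind of engine regex101 uses; it is not Oracle, and [[:digit:]] is written as [0-9] because re has no POSIX bracket classes. The names MIXED, ALL_LAZY and strip_zeros are made up for the illustration.

```python
import re

# Pattern from the question: greedy + before the comma, lazy +? after it.
MIXED = r'^\+?(-?)0*([0-9]+,[0-9]+?)0*$'
# Pattern from the answer: only lazy quantifiers inside the capturing group.
ALL_LAZY = r'^\+?(-?)0*([0-9]+?,[0-9]+?)0*$'


def strip_zeros(pattern: str, value: str) -> str:
    # Same substitution as in the question: sign group followed by the number group.
    return re.sub(pattern, r'\1\2', value)


for pattern in (MIXED, ALL_LAZY):
    print(strip_zeros(pattern, '+000099,8420000'))  # 99,842
    print(strip_zeros(pattern, '-000099,8420000'))  # -99,842
```

In such an engine the two patterns give the same result, so switching to the all-lazy form should not change what regex101 reports.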
Some more tests prove that the first quantifier in a group sets the greediness type for that group. For example,
select REGEXP_REPLACE('+00009,010020','[0-9]+,[0-9]+?([1-9])','\1') from dual
yields +20: the +? behaves greedily because the first quantifier in the pattern, +, is greedy. In the solution above, the greediness of Group 0 (the whole pattern) is set to greedy by its first quantifier, ?, and the greediness of the capturing group ([0-9]+?,[0-9]+?) (Group 2) is set by its first quantifier, +?, which is lazy.
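To see why +20 points to greedy matching, the probe can be replayed in a backtracking engine. This is a sketch that again assumes Python's re module as a stand-in for Perl-style semantics, not Oracle; the variable names are illustrative only.

```python
import re

PROBE_INPUT = '+00009,010020'

# Lazy +? honoured: the match stops at the first non-zero digit after the comma.
lazy_result = re.sub(r'[0-9]+,[0-9]+?([1-9])', r'\1', PROBE_INPUT)

# Lazy marker dropped: the match runs on to the last non-zero digit.
greedy_result = re.sub(r'[0-9]+,[0-9]+([1-9])', r'\1', PROBE_INPUT)

print(lazy_result)    # +10020
print(greedy_result)  # +20
```

The +20 that the Oracle query returns matches the greedy variant, which is consistent with the +? being treated as greedy there.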
Source: https://stackoverflow.com/questions/45302698/regex101-vs-oracle-regex