Question
I got an unexpected result when using regexp_replace to append one string to the end of another, as an exercise in doing the concatenation with regexp_replace. I'm raising it not only to find out why it happens, but also to let folks know about this possibly unexpected result.
Consider this statement, where the intent is to tack "note 2" onto the end of the string "note 1". My intention was to capture the entire line as a group, then concatenate the new string to the end:
select regexp_replace('note 1', '(.*)', '\1' || ' note 2') try_1 from dual;
But take a look at the result:
TRY_1
--------------------
note 1 note 2 note 2
The note gets appended twice! Why?
If I change the pattern to include the start of line and end of line anchors, it works as expected:
select regexp_replace('note 1', '^(.*)$', '\1' || ' note 2') try_2 from dual;
TRY_2
-------------
note 1 note 2
Why should that make a difference?
EDIT: Please see Politank-Z's explanation below. I wanted to add that if I change the first example to use a plus (match one or more occurrences of the preceding element) instead of the asterisk (zero or more occurrences of the preceding element), it works as expected:
select regexp_replace('note 1', '(.+)', '\1' || ' note 2') try_3 from dual;
TRY_3
-------------
note 1 note 2
Answer 1:
As per the Oracle Documentation:
By default, the function returns source_char with every occurrence of the regular expression pattern replaced with replace_string.
The key there is "every occurrence". The pattern .* also matches the empty string, so the Oracle regexp engine first matches the entire string and then matches the empty string that follows it. Both matches are replaced, and for the second one \1 is empty, so only " note 2" is added again. By adding the anchors, you eliminate this. Alternatively, you could specify the occurrence parameter per the linked documentation.
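As a minimal sketch of the occurrence approach mentioned above: REGEXP_REPLACE accepts optional position and occurrence arguments after the replacement string. Passing 1 for both asks the function to start at the first character and replace only the first match, so the trailing empty match should be left alone. The column alias try_4 is just a hypothetical name that continues the question's numbering.

```sql
-- Arguments: source, pattern, replacement, position, occurrence.
-- position   = 1 : start searching at the first character
-- occurrence = 1 : replace only the first match of the pattern
--                  (the default, 0, replaces every match)
select regexp_replace('note 1', '(.*)', '\1' || ' note 2', 1, 1) try_4 from dual;
```

This keeps the original unanchored pattern, which may be handy when the pattern cannot easily be anchored; otherwise the anchored or (.+) variants from the question are probably simpler to read.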
Source: https://stackoverflow.com/questions/29473269/explain-unexpected-regexp-replace-result