Question
The character ̈ (unicode 0x308) cannot be represented in the “Western (ISO Latin 9)” encoding.
I need to replace three of these special characters in many .txt files. Ideally there would be one single regex that I can use in the find & replace function of TextWrangler, the editor I run on my Mac (similar to BBEdit).
Here are the 3 special chars:
- ä into ä
- ö into ö
- ü into ü
(Please note that the first letter consists of two characters, e.g. the a and the ̈ (Unicode 0x308), and is therefore not compatible with Western ISO Latin.)
I tried regex (groups) but was not successful. In TextWrangler I use the find & replace function (with the grep = regex option):
FIND: (ä|ö|ü)+
REPLACE: \1ä , \2ö , \3ü
Any ideas?
Answer 1:
Brief
I've just tested this with Notepad++, but I'm not sure whether it will work in Mac text editors.
This method is a conditional replacement that uses a dictionary appended to the text itself. It's more of a hack, but it works as long as the text editor supports it. Once you're done, remove the dictionary from the bottom of the file.
Code
(ä|ö|ü)(?=[\s\S]*Dictionary:[\s\S]*\1=([^\s=:]+))
Replacement
\2
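For anyone who wants to try the pattern outside an editor, here is a minimal Python sketch of the same find/replace pair. The function name apply_dictionary and the sample text are only illustrative, and it assumes Python's re module handles the lookahead and backreference the same way the editor's regex engine does.

```python
import re

# Pattern and replacement copied from the answer. Group 2 is captured inside
# the lookahead, from the dictionary entry whose key equals group 1.
PATTERN = re.compile(r"(ä|ö|ü)(?=[\s\S]*Dictionary:[\s\S]*\1=([^\s=:]+))")


def apply_dictionary(text: str) -> str:
    """Replace each listed character that occurs before the dictionary block."""
    return PATTERN.sub(r"\2", text)


# Illustrative input: three lines of text followed by the dictionary.
sample = (
    "ä into a\n"
    "ö into o\n"
    "ü into u\n"
    "Dictionary:\n"
    "ä=a\n"
    "ö=o\n"
    "ü=u\n"
)

print(apply_dictionary(sample))
```

The dictionary entries themselves are left untouched because no further "Dictionary:" follows them, so the lookahead fails there.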
Results
Input
ä into a
ö into o
ü into u
Input - Modified
This input includes the dictionary at the end
ä into a
ö into o
ü into u
Dictionary:
ä=a
ö=o
ü=u
Output
a into a
o into o
u into u
Dictionary:
ä=a
ö=o
ü=u
Explanation
- (ä|ö|ü) Capture either character in the group into capture group 1
- (?=[\s\S]*Dictionary:[\s\S]*\1=([^\s=:]+)) Positive lookahead ensuring what follows matches
  - [\s\S]* Match any character any number of times
  - Dictionary: Match Dictionary: literally (this can be changed to anything, but you should make sure this is a unique string that won't be present anywhere else in your input)
  - [\s\S]* Match any character any number of times
  - \1 Match the same text as most recently matched by the first capture group
  - = Match the equal sign character = literally
  - ([^\s=:]+) Capture one or more of any character not present in the set (not whitespace, = or :) into capture group 2
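As a side note on the question's original goal (a base letter followed by U+0308 turned into a single precomposed letter), this can also be done outside the editor with Unicode normalization rather than the dictionary regex above. The following is only a sketch of that alternative; the helper name compose is hypothetical, and it assumes the files can be run through a Python script.

```python
import unicodedata


def compose(text: str) -> str:
    """Return text in NFC form, merging base letters with combining marks."""
    return unicodedata.normalize("NFC", text)


# a, o and u, each followed by the combining diaeresis U+0308.
decomposed = "a\u0308 o\u0308 u\u0308"
composed = compose(decomposed)

# The composed letters are single code points (U+00E4, U+00F6, U+00FC),
# so encoding to ISO 8859-15 (Latin 9) no longer raises an error.
latin9_bytes = composed.encode("iso-8859-15")
print(len(decomposed), len(composed), latin9_bytes)
```

Unlike the regex, NFC composes every combining sequence that has a precomposed form, not just these three letters, so it is worth checking that this is acceptable for the files in question.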
Source: https://stackoverflow.com/questions/47313212/regex-substitute-several-special-characters-with-other-special-characters-in-tex