regex substitute several special characters with other special characters in Textwrangler

Posted by 泪湿孤枕 on 2019-12-11 15:39:59

Question


The character ̈ (unicode 0x308) cannot be represented in the “Western (ISO Latin 9)” encoding.

I need to replace three of these special characters in many txt files. Ideally I would like a single regex that I can use in the find & replace function of the TextWrangler editor on my Mac (similar to BBEdit).

Here are the 3 special chars:

  1. ä into ä
  2. ö into ö
  3. ü into ü

(Please note that the first letter in each pair consists of two characters, e.g. the a and the ̈ (unicode 0x308), and is therefore not compatible with Western ISO Latin.)
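To make the two-character issue concrete, here is a minimal Python sketch. Python is not part of the original question, the helper name can_encode is hypothetical, and the sketch assumes that "Western (ISO Latin 9)" corresponds to the ISO-8859-15 encoding. It shows that the decomposed form (letter plus U+0308) cannot be encoded, while the precomposed form can, and that Unicode NFC normalization turns one into the other.

```python
import unicodedata

# Decomposed form: base letter "a" followed by U+0308 COMBINING DIAERESIS.
decomposed = "a\u0308"
# Precomposed form: the single code point U+00E4.
precomposed = "\u00e4"

def can_encode(text, encoding="iso-8859-15"):
    """Return True if text can be represented in the given encoding.

    Hypothetical helper; ISO-8859-15 is assumed to be "ISO Latin 9".
    """
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

# NFC composes a base letter and combining mark where a precomposed form exists.
normalized = unicodedata.normalize("NFC", decomposed)

print(len(decomposed), len(precomposed))  # 2 1
print(can_encode(decomposed))             # False
print(can_encode(normalized))             # True
```

If normalization is acceptable, applying NFC to each file's text may be a simpler route than a regex, though that goes beyond what the question asks for.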

I tried a regex with groups, but without success. In TextWrangler I use the find & replace function with the grep (regex) option enabled:

FIND: (ä|ö|ü)+

REPLACE: \1ä , \2ö , \3ü

Any ideas?


Answer 1:


Brief

I've just tested this in Notepad++; I'm not sure whether it will work in a Mac text editor such as TextWrangler.

This method performs a conditional replacement using a dictionary that the regex looks up inside the file itself. It's more of a hack, but it does work as long as the text editor's regex engine supports it. Once you're done, remove the dictionary from the bottom of the file.


Code

(ä|ö|ü)(?=[\s\S]*Dictionary:[\s\S]*\1=([^\s=:]+))

Replacement

\2

Results

Input

ä into a
ö into o
ü into u

Input - Modified

This input includes the dictionary at the end

ä into a
ö into o
ü into u

Dictionary:
ä=a
ö=o
ü=u

Output

a into a
o into o
u into u

Dictionary:
ä=a
ö=o
ü=u

Explanation

  • (ä|ö|ü) Capture any one of the three characters into capture group 1
  • (?=[\s\S]*Dictionary:[\s\S]*\1=([^\s=:]+)) Positive lookahead ensuring what follows matches
    • [\s\S]* Match any character any number of times
    • Dictionary: Match Dictionary: literally (this can be changed to anything, but you should make sure this is a unique string that won't be present anywhere else in your input)
    • [\s\S]* Match any character any number of times
    • \1 Match the same text as most recently matched by the first capture group
    • = Match the equal sign character = literally
    • ([^\s=:]+) Capture one or more of any character not present in the set (not whitespace, = or :) into capture group 2
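The same dictionary lookup can be sketched outside an editor. The following Python example is an illustration only, not something the answer tested: it assumes Python's re module, uses decomposed keys (letter plus U+0308) mapped to precomposed values as the question asks, and the names DICTIONARY, PATTERN and replace_with_dictionary are hypothetical.

```python
import re

# Temporary dictionary appended to the text, as in the answer above.
# Keys are decomposed (letter + U+0308); values are precomposed letters.
DICTIONARY = "\n\nDictionary:\na\u0308=\u00e4\no\u0308=\u00f6\nu\u0308=\u00fc\n"

# Group 1 captures the decomposed letter; the lookahead finds "Dictionary:",
# then the entry "<group 1>=" and captures its value into group 2.
PATTERN = re.compile(
    r"(a\u0308|o\u0308|u\u0308)(?=[\s\S]*Dictionary:[\s\S]*\1=([^\s=:]+))"
)

def replace_with_dictionary(text):
    """Replace decomposed umlauts using the appended dictionary."""
    combined = text + DICTIONARY
    replaced = PATTERN.sub(r"\2", combined)
    # The dictionary entries are not followed by another "Dictionary:",
    # so they stay unchanged and can be cut off by length.
    return replaced[: -len(DICTIONARY)]

sample = "Ma\u0308rz, scho\u0308n, u\u0308ber"
result = replace_with_dictionary(sample)
print(result == "M\u00e4rz, sch\u00f6n, \u00fcber")  # True
```

Note that the sketch assumes the text itself does not already contain the marker string "Dictionary:", which matches the uniqueness caveat in the explanation above.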


Source: https://stackoverflow.com/questions/47313212/regex-substitute-several-special-characters-with-other-special-characters-in-tex
