Regex replace character within match of nested parenthesis, or replace only within text outside of match

徘徊边缘 submitted on 2020-07-16 09:40:50

Question


I'm writing an AutoHotkey script that will format SQL statements from text selected on the screen. I want to turn a statement like this:

SELECT Name AS [Object Name], Switch([Type]=5,'Query',[Type]=-32768,'Form',[Type]=6,'Table') AS [Object Type], Switch([Type]=5,1,[Type]=-32768,2,[Type] In (1,4,6),6) AS [Object Type ID], Left(Name,4) as Prefix, LTrim(RTrim(Mid([Name],5,30))) as Suffix

into this:

SELECT Name AS [Object Name], 
    Switch([Type]=5,'Query',[Type]=-32768,'Form',[Type]=6,'Table') AS [Object Type], 
    Switch([Type]=5,1,[Type]=-32768,2,[Type] In (1,4,6),6) AS [Object Type ID], 
    Left(Name,4) as Prefix,
    LTrim(RTrim(Mid([Name],5,30))) as Suffix

I started by replacing each comma with comma + line break + tab, but SQL statements containing function calls with commas inside parentheses produced undesirable results. My first solution was to exclude commas inside parentheses, using this AutoHotkey RegEx command:

; Find commas not in parentheses and suffix with <CR><LF><Tab>
s := RegExReplace( s, ",(?![^()]*\))", ",`r`n" . A_Tab )

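The problem is that parentheses are sometimes nested, and then that simple RegEx fails: a comma inside a group is still matched whenever a nested opening parenthesis appears between it and the next closing parenthesis, as happens before In (1,4,6) in the second Switch call above.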
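
The failure is easy to reproduce. The sketch below uses Python only because it is convenient to run; the pattern is the one from the RegExReplace call above, while the sample string is a shortened, made-up stand-in for the Switch expression.

```python
import re

# Same pattern as in the RegExReplace call: a comma that is NOT followed by
# "a run of non-parenthesis characters and then a closing parenthesis".
NOT_BEFORE_CLOSE = re.compile(r",(?![^()]*\))")

# Hypothetical, shortened sample with one level of nesting.
sample = "Switch(a,b In (1,2),3), x"

result = NOT_BEFORE_CLOSE.sub(",\r\n\t", sample)
print(repr(result))
```

The comma after a sits inside Switch(...), but the first parenthesis that follows it is an opening one, so the lookahead never reaches a ) and the comma is wrongly treated as top-level.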

After some digging I found a recursive RegEx that selects the outermost parentheses of each group.

\((?:[^()]++|(?R))*\)
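
For readers new to this pattern: \( and \) match the literal outer parentheses, [^()]++ possessively consumes a run of non-parenthesis characters, and (?R) re-enters the whole pattern to match a nested balanced group. Python's built-in re module has no (?R), so the sketch below is not the regex itself but a depth counter that, for balanced input, should return the same outermost groups. The function name outermost_groups is hypothetical.

```python
def outermost_groups(text):
    """Return each outermost balanced (...) group in text, left to right."""
    groups = []
    depth = 0      # current nesting level
    start = None   # index where the current outermost group opened
    for i, ch in enumerate(text):
        if ch == "(":
            if depth == 0:
                start = i                      # an outermost group opens here
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
            if depth == 0:
                groups.append(text[start:i + 1])  # outermost group closed
    return groups


# Sample taken from the end of the SQL statement in the question.
print(outermost_groups(
    "Left(Name,4) as Prefix, LTrim(RTrim(Mid([Name],5,30))) as Suffix"
))
```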

Now the challenge is,

  1. how do I select everything outside of that group and find/replace within it, or
  2. how do I apply a search/replace to only text within that group?


Stack Overflow encourages us to answer our own questions. While writing this up I found a solution, which I will post below. Feel free to share solutions of your own; I would like to further my understanding of regular expressions.


Answer 1:


I discovered that I can use alternation (|, a regex OR) in my expression to match either a complete parenthesized group or a single comma. Because each outermost group is consumed as one match, the commas inside it are never matched individually. (Thanks to zx81 in this post.)

 ,|\((?:[^()]++|(?R))*\)

With that expression I can use the substitution |$0| to surround each match with | characters. The stand-alone commas then appear as |,| and are easy to replace with my line-break pattern; after that, all remaining | characters are replaced with an empty string. This assumes the original text contains no | characters of its own, since they would be stripped in the last step.

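; AutoHotkey snippet
; 1. Wrap every top-level comma and every outermost (...) group in | markers
s := RegExReplace( s, ",|\((?:[^()]++|(?R))*\)", "|$0|" )
; 2. A comma wrapped on its own is top-level: suffix it with <CR><LF><Tab>
s := StrReplace( s, "|,|" , ",`r`n" . A_Tab )
; 3. Remove the remaining markers
s := StrReplace( s, "|" , "")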
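
If the text might contain | characters, one alternative that avoids marker characters altogether is to track nesting depth while scanning. The following is a minimal sketch of that idea in Python, not the AutoHotkey method above; break_top_level_commas and its separator parameter are names made up for illustration. Like the regex approach, it does not treat quoted strings or [bracketed names] specially.

```python
def break_top_level_commas(sql, separator=",\r\n\t"):
    """Replace each comma outside all parentheses with separator."""
    out = []
    depth = 0  # how many unclosed "(" precede the current character
    for ch in sql:
        if ch == "(":
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
        if ch == "," and depth == 0:
            out.append(separator)   # top-level comma: add the line break
        else:
            out.append(ch)          # everything else is copied unchanged
    return "".join(out)


# Sample taken from the end of the SQL statement in the question.
example = "Left(Name,4) as Prefix, LTrim(RTrim(Mid([Name],5,30))) as Suffix"
print(break_top_level_commas(example))
```

As in the AutoHotkey version, the space that followed each top-level comma is kept, so it ends up after the tab on the next line.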




Source: https://stackoverflow.com/questions/62806431/regex-replace-character-within-match-of-nested-parenthesis-or-replace-only-with
