Using regular expressions to do mass replace in Notepad++ and Vim

迷失自我 2020-12-08 04:25

So I've got a big text file which looks like the following:

16 answers
  • 2020-12-08 04:56

    A little after the fact, but in case it's useful to anyone, I was able to follow one of the examples on here (by sdgfsdg) and quickly pick up regular expressions for Notepad++.

    I had to similarly pull out some redundant data from a list of HTML select dropdown options, of the form:

    <select>
      <option value="AC">saint_helena">Ascension Island</option>
      <option value="AD">andorra">Andorra</option>
      <option value="AE">united_arab_emirates">United Arab Emirates</option>
      <option value="AF">afghanistan">Afghanistan</option>
      ...
    </select>
    

    And what I really wanted was:

    <select>
      <option value="AC">Ascension Island</option>
      <option value="AD">Andorra</option>
      <option value="AE">United Arab Emirates</option>
      <option value="AF">Afghanistan</option>
      ...
    </select>
    

    After some hair-pulling I realized that, as of version 5.8.5 (Sep. 2010), Notepad++'s regular expressions still don't seem to allow certain loops in expressions (unless there is another syntax). For example, the following is meant to find even ">united_arab_emirates"> despite its additional separating underscores:

    (">)([a-z]+([_]*[a-z]*)*)(">)
    

    This query worked in most generic regex tools, but within Notepad++ I had to account for the maximum number of underscores (which unfortunately was 8) by hand, using the much uglier:

    (">)([a-z]+[_]*[a-z]*[_]*[a-z]*[_]*[a-z]*[_]*[a-z]*)[_]*[a-z]*[_]*[a-z]*[_]*[a-z]*[_]*[a-z]*(">)
    

    If someone knows a way to simulate a Regex loop in Notepad++'s replace feature, please let me know.


    Find what: (">)([a-z]+[_]*[a-z]*[_]*[a-z]*[_]*[a-z]*[_]*[a-z]*)[_]*[a-z]*[_]*[a-z]*[_]*[a-z]*[_]*[a-z]*(">)


    Replace with: ">


    Result: 255 occurrences were replaced.
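
For comparison, the "loop" described above can be sketched in an engine that supports repeated groups. This is a minimal Python illustration, not the Notepad++ syntax: it uses a slightly tightened variant of the looped pattern (`_[a-z]+` repeated zero or more times), and the names `REDUNDANT` and `strip_redundant` are made up for the example.

```python
import re

# Sample lines copied from the dropdown list shown earlier in this answer.
html = '''<select>
  <option value="AC">saint_helena">Ascension Island</option>
  <option value="AD">andorra">Andorra</option>
  <option value="AE">united_arab_emirates">United Arab Emirates</option>
  <option value="AF">afghanistan">Afghanistan</option>
</select>'''

# One lowercase word followed by any number of "_word" repetitions,
# wrapped in ">...">. The (?:...)* group is the loop that otherwise
# had to be unrolled by hand, once per possible underscore.
REDUNDANT = re.compile(r'">[a-z]+(?:_[a-z]+)*">')


def strip_redundant(text: str) -> str:
    """Replace each ">slug"> run with a plain ">, dropping the slug."""
    return REDUNDANT.sub('">', text)


cleaned = strip_redundant(html)
print(cleaned)
```

Because the group repeats as often as needed, the pattern does not depend on knowing the maximum number of underscores in advance.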

  • 2020-12-08 04:58

    In vim

    :%s/<option value='.\{1,}' >//
    

    or

    :%s/<option value='.\+' >//
    

    In Vim regular expressions (with the default 'magic' setting) you have to escape the one-or-more symbol (\+), capturing parentheses (\( and \)), the curly braces of a bounded repetition (\{n,m}) and some others.

    See :help /magic to see which special characters need to be escaped (and how to change that).

  • 2020-12-08 04:58

    Vim:

    :%s/.* >//

  • 2020-12-08 04:59

    In Notepad++ :

    <option value value='1' >A
    <option value value='2' >B
    <option value value='3' >C
    <option value value='4' >D
    
    
    Find what: (.*)(>)(.)
    Replace with: \3
    
    Replace All
    
    
    A
    B
    C
    D
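
The same find/replace can be checked outside Notepad++. This is a rough Python sketch using the identical three groups; the `lines` and `result` names are only for illustration, and Python's `re` engine is assumed to behave like Notepad++ for this simple pattern.

```python
import re

# The four sample lines from this answer.
lines = [
    "<option value value='1' >A",
    "<option value value='2' >B",
    "<option value value='3' >C",
    "<option value value='4' >D",
]

# Group 1 greedily takes everything up to the last ">" that is followed
# by a character, group 2 is that ">", and group 3 is the single
# character after it. Replacing the match with \3 keeps only group 3.
result = [re.sub(r"(.*)(>)(.)", r"\3", line) for line in lines]
print(result)
```

Only the first character after the `>` is captured, but any text following it lies outside the match and is left in place.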
    
  • 2020-12-08 05:00

    It may help if you're less specific. Your expression there is "greedy", which may be interpreted in different ways by different programs. Try this in Vim:

    :%s/^<[^>]\+>//
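
To see what "greedy" changes in practice, here is a small Python sketch comparing a greedy `.*` with a negated character class. The sample line is hypothetical: it adds an extra `>` after the tag to make the difference visible. Note that Python writes one-or-more as a plain `+`.

```python
import re

# Hypothetical line with a second ">" after the tag.
line = "<option value value='1' >A > B"

# Greedy: .* runs to the LAST ">" on the line, so "A >" is removed too.
greedy = re.sub(r"<.*>", "", line)

# Bounded: [^>]+ cannot cross a ">", so the match stops at the FIRST one.
bounded = re.sub(r"^<[^>]+>", "", line)

print(repr(greedy))
print(repr(bounded))
```

The negated class gives the same result as the greedy form on the simple one-tag lines in the question, while staying safe if a `>` ever appears in the remaining text.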
    
  • 2020-12-08 05:03

    Everything before the A, B, C, etc.

    That seems so simple I must be misinterpreting you. It's just

    :%s/<.*>//
    