Regex to match when a string is present twice

孤独总比滥情好 2021-01-04 00:53

I am horrible at regular expressions, and I don't use them often enough to remember the syntax between uses.

I am using grepWin to search my files. I need

3 Answers
  • Something like this (it depends on the language and your specific task):

    /(how.*){2}/
    

    Edit: according to @CodeJockey

    /^(([^h]|h[^o]|ho[^w])*how([^h]|h[^o]|ho[^w])*){2}$/
    

    (It becomes more complicated.) @CodeJockey: Thanks for the comments.
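
As a quick check of why the edit was needed, here is a minimal Python sketch, assuming Python's `re` module rather than grepWin's engine. It suggests that the first, shorter expression accepts two or more occurrences rather than exactly two. The helper name `has_at_least_two` is made up for illustration.

```python
import re

# The first expression from this answer, without delimiters.
AT_LEAST_TWO = re.compile(r"(how.*){2}")

def has_at_least_two(line: str) -> bool:
    """Return True if the pattern finds "how" at least twice in `line`."""
    return AT_LEAST_TWO.search(line) is not None

for sample in ["how", "how how", "how how how"]:
    # One occurrence is rejected; two and three are both accepted.
    print(sample, has_at_least_two(sample))
```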

  • 2021-01-04 01:49

    I don't know what grepWin supports, but here's what I came up with to make something match exactly twice.

    /^((?!how).)*how((?!how).)*how((?!how).)*$/
    

    Explanation:

    /^             # start of subject
      ((?!how).)*  # any text that does not contain "how"
      how          # the word "how"
      ((?!how).)*  # any text that does not contain "how"
      how          # the word "how"
      ((?!how).)*  # any text that does not contain "how"
    $/             # end of subject
    

    This ensures that you find two "how"s, but the texts between the "how"s and to either side of them do not contain "how".

    Of course, you can substitute any string for "how" in the expression.
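
The expression can be tried outside grepWin. The following is a small sketch using Python's `re` module, which supports negative lookahead; whether grepWin's engine behaves identically is an assumption. The function name `has_exactly_two` is hypothetical.

```python
import re

# Same expression as above, without the / delimiters.
EXACTLY_TWO = re.compile(r"^((?!how).)*how((?!how).)*how((?!how).)*$")

def has_exactly_two(line: str) -> bool:
    """Return True if `line` contains the substring "how" exactly twice."""
    return EXACTLY_TWO.match(line) is not None

for sample in ["how now", "how and how", "how, how and how"]:
    # Only the sample with exactly two occurrences should match.
    print(repr(sample), has_exactly_two(sample))
```

Note that `.` does not match a newline by default, so this sketch tests one line at a time.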


    If you want to "simplify" by only writing the search expression twice, you can use backreferences thus:

    /^(?:(?!how).)*(how)(?:(?!\1).)*\1(?:(?!\1).)*$/
    

    Refiddle with this expression

    Explanation:
    I added ?: to make the negative lookaheads' text non-capturing. Then I added parentheses around the regular how to make that a capturing subpattern (the first and only one).

    I had to write "how" literally in the first lookahead: it is a negative lookahead that keeps the leading text free of "how", and at that point in the expression group 1 has not captured anything yet, so \1 cannot be used there.
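
To substitute an arbitrary search string, the backreference form can be generated. This is a sketch only, assuming Python's `re` module; `build_exactly_twice` is a hypothetical helper, and `re.escape` is used so that characters such as `.` in the search string are treated literally.

```python
import re

def build_exactly_twice(needle: str) -> re.Pattern:
    """Compile a pattern matching lines that contain `needle` exactly twice."""
    n = re.escape(needle)  # treat the search string literally
    # The needle is written out in the first lookahead and in the capturing
    # group; every later use refers back to group 1 via \1.
    return re.compile(rf"^(?:(?!{n}).)*({n})(?:(?!\1).)*\1(?:(?!\1).)*$")

pattern = build_exactly_twice("blah")
for sample in ["blah", "blah blah", "blah blah blah"]:
    print(repr(sample), pattern.match(sample) is not None)
```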

  • 2021-01-04 01:53

    This is significantly harder than I originally thought it would be, and requires variable-length lookbehind, which grepWin does not support...

    this expression:

    (?<!blah.{0,99999})blah(?=.*?blah)(?!.*blah.*blah)
    

    was successfully used in Eclipse, using the "Search > File" dialog to exclude files with one and three instances of blah and to include files with exactly two instances of blah.

    Eclipse does not permit a .* in lookbehind, so I used .{0,99999} instead.

    It is possible with the right tool, but it isn't pretty to get it to work with grepWin (see the answer above). Can you use other tools (such as Eclipse), and what did you want to do with the files afterwards?
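
If another tool is an option, one alternative is to skip the regex and count occurrences directly. This is a different technique from the lookbehind expression above, shown only as a sketch in Python; the function name `contains_exactly_twice` is hypothetical, and reading the file contents into a string is left to the caller.

```python
def contains_exactly_twice(text: str, needle: str) -> bool:
    """Return True if `needle` occurs exactly twice in `text`.

    str.count counts non-overlapping occurrences, scanning left to right.
    """
    return text.count(needle) == 2

# Example with in-memory strings standing in for file contents.
samples = {
    "one.txt": "blah",
    "two.txt": "blah\nsomething\nblah",
    "three.txt": "blah blah blah",
}
for name, content in samples.items():
    print(name, contains_exactly_twice(content, "blah"))
```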
