Regular Expression to find a string included between two characters while EXCLUDING the delimiters

旧时难觅i 2020-11-21 23:49

I need to extract from a string a set of characters which are included between two delimiters, without returning the delimiters themselves.

A simple example should b

12 answers
  • 2020-11-22 00:25

    I had the same problem using regex in bash scripting. I used a two-step solution, piping grep -o through two patterns:

     '\[(.*?)\]'  
    

    first, then

    '\b.*\b'
    

    Obviously not as efficient as the other answers, but an alternative.
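
    The same two-step idea can be sketched in Python as a rough illustration. The sample string is borrowed from another answer in this thread, and the capture group is dropped from the first pattern so that `re.findall` returns whole matches, as `grep -o` would; both choices are assumptions made for the example.

```python
import re

text = "This is a test string [more or less]"

# Step 1: keep only the bracketed pieces, delimiters included
# (the counterpart of the first `grep -o` stage).
bracketed = re.findall(r"\[.*?\]", text)

# Step 2: trim each piece to the span between its first and last
# word boundary, which drops the surrounding brackets.
inner = [re.search(r"\b.*\b", piece).group(0) for piece in bracketed]

print(inner)  # ['more or less']
```

    Note that the second step relies on word boundaries, so a piece with no word characters inside the brackets (such as `[!]`) would not match at all.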

  • 2020-11-22 00:27

    [^\[] Match any character that is not [.

    + Match one or more of those characters (anything that is not [). Each such run is a candidate match.

    (?=\]) Positive lookahead for ]. The match must be immediately followed by ], which is not included in the result.

    Done.

    [^\[]+(?=\])
    

    Proof.

    http://regexr.com/3gobr

    Similar to the solution proposed by null, but the additional \] is not required. As an additional note, it appears that \ is not required to escape the [ after the ^. For readability, I would leave it in.

    This does not work when the two delimiters are identical, "more or less" for example.
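
    A quick way to check this pattern's behavior is a short Python sketch. The test strings other than the first are invented for illustration; the last one shows that the pattern never requires an opening [ at all.

```python
import re

pattern = re.compile(r"[^\[]+(?=\])")

single = pattern.findall("This is a test string [more or less]")
several = pattern.findall("[a] and [b]")
# No opening bracket is required, so stray text before a ']' also matches.
stray = pattern.findall("oops] [ok]")

print(single)   # ['more or less']
print(several)  # ['a', 'b']
print(stray)    # ['oops', 'ok']
```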

  • 2020-11-22 00:31

    Here is how I got the text without '[' and ']' in C#:

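            using System;
            using System.Text.RegularExpressions;

            var text = "This is a test string [more or less]";
            // Match each '[...]' pair; group 1 captures only the text between the brackets
            var regex = new Regex(@"\[(.+?)\]");
            var matches = regex.Matches(text);
            for (int i = 0; i < matches.Count; i++)
            {
                // Groups[0] is the whole match with brackets; Groups[1] is the inner text
                Console.WriteLine(matches[i].Groups[1].Value);
            }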
    

    The output is:

    more or less
    
  • 2020-11-22 00:32

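    To also remove the [] themselves (for example in a replace operation), match them as part of the pattern. Note that .+ is greedy, so with several bracketed groups on one line it matches from the first [ to the last ]: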

    \[.+\]
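
    If the goal is to delete the bracketed text together with its brackets, a minimal Python sketch might look like the following. The sample string and the lazy variant `.+?` are additions for comparison, not part of the answer above.

```python
import re

text = "keep [drop me] this [and me] too"

# Greedy `.+` runs from the first '[' to the last ']'.
greedy = re.sub(r"\[.+\]", "", text)
# Lazy `.+?` stops at the nearest ']' each time.
lazy = re.sub(r"\[.+?\]", "", text)

print(repr(greedy))  # 'keep  too'
print(repr(lazy))    # 'keep  this  too'
```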
    
  • 2020-11-22 00:36

    I wanted to find a string between / and #, but # is sometimes optional. Here is the regex I use:

      (?<=\/)([^#]+)(?=#*)
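
    As a rough check in Python (the sample strings are made up), the pattern returns the same text whether or not the # is present. The lookahead `(?=#*)` always succeeds because `#*` can match nothing, so it is `[^#]+` that actually stops the match at a #.

```python
import re

pattern = re.compile(r"(?<=/)([^#]+)(?=#*)")

samples = ["page/section#anchor", "page/section", "a/b/c#d"]
results = [pattern.search(s).group(1) for s in samples]

# The match starts after the FIRST '/', so later slashes are included.
print(results)  # ['section', 'section', 'b/c']
```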
    
  • 2020-11-22 00:37

    Easy done:

    (?<=\[)(.*?)(?=\])
    

    Technically that's using lookaheads and lookbehinds. See Lookahead and Lookbehind Zero-Width Assertions. The pattern consists of:

    • is preceded by a [ that is not captured (lookbehind);
    • a non-greedy captured group. It's non-greedy to stop at the first ]; and
    • is followed by a ] that is not captured (lookahead).

    Alternatively you can just capture what's between the square brackets:

    \[(.*?)\]
    

    and return the first captured group instead of the entire match.
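
    Both variants can be compared side by side in a short Python sketch. The sample string is invented, and the capture group is omitted from the lookaround pattern so that `re.findall` returns the whole match.

```python
import re

text = "first [alpha] then [beta]"

# Lookarounds: the whole match is already the inner text.
with_lookaround = re.findall(r"(?<=\[).*?(?=\])", text)

# Capture group: the brackets are part of the match, group 1 is the inner text.
with_group = [m.group(1) for m in re.finditer(r"\[(.*?)\]", text)]

print(with_lookaround)  # ['alpha', 'beta']
print(with_group)       # ['alpha', 'beta']
```

    The capture-group form also works in regex engines that lack lookbehind support, which is one reason to prefer it when portability matters.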
