Regular expression to get a string between two strings in Javascript

萌比男神i 2020-11-21 06:29

I have found very similar posts, but I can't quite get my regular expression right here.

I am trying to write a regular expression which returns a string which is b

11 answers
  • 2020-11-21 06:53

    I find regex tedious and time-consuming given the syntax. Since you are already using JavaScript, it is easier to do the following without regex:

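    const text = 'My cow always gives milk';
    const start = 'cow';
    const end = 'milk';
    // split(start)[1] is undefined when `start` is absent, so this assumes both delimiters occur
    const middleText = text.split(start)[1].split(end)[0].trim();
    console.log(middleText); // prints "always gives"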
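    The split approach throws when the start delimiter is missing. A minimal sketch of a more defensive variant, assuming a hypothetical helper named textBetween (not part of the original answer) that returns null instead of throwing:

```javascript
// Hypothetical helper: returns the text between the first `start` and the
// next `end` that follows it, or null when either delimiter is missing.
function textBetween(text, start, end) {
  const from = text.indexOf(start);
  if (from === -1) return null;
  const contentStart = from + start.length;
  const to = text.indexOf(end, contentStart);
  if (to === -1) return null;
  return text.slice(contentStart, to);
}

console.log(JSON.stringify(textBetween('My cow always gives milk', 'cow', 'milk'))); // " always gives "
console.log(textBetween('My cow always gives milk', 'goat', 'milk')); // null
```

    The surrounding spaces are kept on purpose; call .trim() on the result if they are not wanted.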
    
  • 2020-11-21 06:55

    If the data spans multiple lines, you may have to use the following:

    /My cow ([\s\S]*)milk/gm
    
    My cow always gives 
    milk
    

    Regex 101 example
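    As a rough illustration of why [\s\S] is needed here, the sketch below runs a dot-based pattern and the [\s\S] pattern on the same two-line sample. The variable names are made up for the example:

```javascript
// Same two-line sample as above: a line break sits between "gives " and "milk".
const multiLineText = 'My cow always gives \nmilk';

// The dot does not match line breaks, so this pattern finds nothing.
const dotMatch = multiLineText.match(/My cow (.*)milk/);

// [\s\S] matches any character, line breaks included.
const anyCharMatch = multiLineText.match(/My cow ([\s\S]*)milk/);

console.log(dotMatch); // null
console.log(JSON.stringify(anyCharMatch[1])); // "always gives \n"
```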

  • A lookahead (the (?= part) does not consume any input. It is a zero-width assertion (as are boundary checks and lookbehinds).

    You want a regular match here, to consume the cow portion. To capture the portion in between, you use a capturing group (just put the part of the pattern you want to capture inside parentheses):

    cow(.*)milk
    

    No lookaheads are needed at all.
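    A small sketch of the difference, using made-up variable names: the lookahead version only reports cow, while the capturing group exposes the text in between as element 1 of the match array.

```javascript
const sentence = 'My cow always gives milk';

// The lookahead only asserts that "milk" follows somewhere; it adds nothing to the match.
const lookaheadOnly = sentence.match(/cow(?=.*milk)/);
console.log(lookaheadOnly[0]); // "cow"

// A plain match consumes everything from "cow" to "milk"; group 1 holds the middle.
const captured = sentence.match(/cow(.*)milk/);
console.log(captured[0]); // "cow always gives milk"
console.log(JSON.stringify(captured[1])); // " always gives "
```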

  • 2020-11-21 07:05

    The most complete solution, which works in the vast majority of cases, is a capturing group with a lazy dot matching pattern. However, the dot . in a JavaScript regex does not match line break characters, so what works in all cases is a construct such as [^] or [\s\S] / [\d\D] / [\w\W].

    ECMAScript 2018 and newer compatible solution

    In JavaScript environments supporting ECMAScript 2018, the s modifier allows . to match any character, including line break characters, and the regex engine supports lookbehinds of variable length. So, you may use a regex like

    var result = s.match(/(?<=cow\s+).*?(?=\s+milk)/gs); // Returns multiple matches if any
    // Or
    var result = s.match(/(?<=cow\s*).*?(?=\s*milk)/gs); // Same but whitespaces are optional
    

    In both cases, the current position is checked for cow followed by whitespace (one or more characters in the first pattern, zero or more in the second). Then any zero or more characters, as few as possible, are matched and consumed (that is, added to the match value). Finally, milk is checked for, preceded by the same amount of whitespace (one or more, or zero or more, respectively).
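    A minimal sketch of the first pattern in action, assuming an engine with ES2018 lookbehind and s flag support. The sample string is invented for the illustration and contains line breaks on purpose:

```javascript
// Invented sample: the first "cow ... milk" pair is spread over three lines.
const input = 'My cow\nalways gives\nmilk, their cow also gives milk';

// Lookbehind and lookahead only check the delimiters, so the match value
// itself is the text in between; the s flag lets . cross line breaks.
const found = input.match(/(?<=cow\s+).*?(?=\s+milk)/gs);

console.log(found); // [ 'always gives', 'also gives' ]
```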

    Scenario 1: Single-line input

    This and all other scenarios below are supported by all JavaScript environments. See usage examples at the bottom of the answer.

    cow (.*?) milk
    

    cow is found first, then a space, then any 0+ chars other than line break chars, as few as possible as *? is a lazy quantifier, are captured into Group 1 and then a space with milk must follow (and those are matched and consumed, too).

    Scenario 2: Multiline input

    cow ([\s\S]*?) milk
    

    Here, cow and a space are matched first, then any 0+ chars as few as possible are matched and captured into Group 1, and then a space with milk are matched.
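    To make the difference between Scenario 1 and Scenario 2 concrete, here is a short sketch with two invented sample strings, one of which has a line break between the delimiters:

```javascript
const oneLine = 'My cow always gives milk';
const twoLines = 'My cow always\ngives milk';

const lazyDot = /cow (.*?) milk/;      // Scenario 1 pattern
const lazyAny = /cow ([\s\S]*?) milk/; // Scenario 2 pattern

console.log(lazyDot.exec(oneLine)[1]);                  // "always gives"
console.log(lazyDot.exec(twoLines));                    // null, the dot stops at the line break
console.log(JSON.stringify(lazyAny.exec(twoLines)[1])); // "always\ngives"
```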

    Scenario 3: Overlapping matches

    If you have a string like >>>15 text1>>>67 text2>>> and you need to get 2 matches between >>>+number+whitespace and >>>, you can't use />>>\d+\s(.*?)>>>/g, as this will only find 1 match: the >>> before 67 is already consumed when the first match is found. You may use a positive lookahead to check for the text presence without actually "gobbling" it (i.e. appending it to the match):

    />>>\d+\s(.*?)(?=>>>)/g
    

    See the online regex demo yielding text1 and text2 as Group 1 contents found.

    Also see How to get all possible overlapping matches for a string.
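    The effect can be checked with a quick sketch that runs both patterns on the same string and collects Group 1 (variable names are illustrative only):

```javascript
const data = '>>>15 text1>>>67 text2>>>';

// The trailing >>> is consumed, so the second item has no opening >>> left.
const consuming = Array.from(data.matchAll(/>>>\d+\s(.*?)>>>/g), m => m[1]);

// The lookahead leaves the trailing >>> in place for the next match to use.
const withLookahead = Array.from(data.matchAll(/>>>\d+\s(.*?)(?=>>>)/g), m => m[1]);

console.log(consuming);     // [ 'text1' ]
console.log(withLookahead); // [ 'text1', 'text2' ]
```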

    Performance considerations

    A lazy dot matching pattern (.*?) inside a regex may slow down script execution if very long input is given. In many cases, the unroll-the-loop technique performs better. Trying to grab everything between cow and milk from "Their\ncow\ngives\nmore\nmilk", we see that we just need to match all lines that are not equal to milk; thus, instead of cow\n([\s\S]*?)\nmilk we can use:

    /cow\n(.*(?:\n(?!milk$).*)*)\nmilk/gm
    

    See the regex demo (if there can be \r\n, use /cow\r?\n(.*(?:\r?\n(?!milk$).*)*)\r?\nmilk/gm). With this small test string, the performance gain is negligible, but with very large text, you will feel the difference (especially if the lines are long and line breaks are not very numerous).
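    As a sanity check (not a benchmark), this sketch confirms that the lazy pattern and the unrolled pattern capture the same text on the small test string. The g flag is left out so that exec always starts from the beginning:

```javascript
const lines = 'Their\ncow\ngives\nmore\nmilk';

const lazy = /cow\n([\s\S]*?)\nmilk/;
// Unrolled: take one line, then keep taking "\n + line" while the
// upcoming line is not exactly "milk" ($ matches at line ends with m).
const unrolled = /cow\n(.*(?:\n(?!milk$).*)*)\nmilk/m;

const lazyResult = lazy.exec(lines)[1];
const unrolledResult = unrolled.exec(lines)[1];

console.log(JSON.stringify(lazyResult));     // "gives\nmore"
console.log(JSON.stringify(unrolledResult)); // "gives\nmore"
```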

    Sample regex usage in JavaScript:

    // Single/first match expected: use no global modifier and access match[1]
    console.log("My cow always gives milk".match(/cow (.*?) milk/)[1]);
    // Multiple matches: use the global modifier and trim the results
    // if the length of the leading/trailing delimiters is known
    var s = "My cow always gives milk, their cow also gives milk";
    // "cow " is 4 characters long and " milk" is 5
    console.log(s.match(/cow (.*?) milk/g).map(function(x) {return x.slice(4, -5);}));
    // Or use RegExp#exec inside a loop to collect all the Group 1 contents
    var result = [], m, rx = /cow (.*?) milk/g;
    while ((m=rx.exec(s)) !== null) {
      result.push(m[1]);
    }
    console.log(result);

    Using the modern String#matchAll method

    const s = "My cow always gives milk, their cow also gives milk";
    const matches = s.matchAll(/cow (.*?) milk/g);
    console.log(Array.from(matches, x => x[1]));

  • 2020-11-21 07:09

    I was able to get what I needed using Martinho Fernandes' solution below. The code is:

    var test = "My cow always gives milk";
    
    var testRE = test.match("cow(.*)milk");
    alert(testRE[1]);
    

    You'll notice that I am alerting testRE[1] rather than testRE itself. This is because match returns an array: index 0 holds the whole match and index 1 holds the text captured by the first group. The output from:

    My cow always gives milk
    

    Changes into:

    always gives
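    For reference, a short sketch of what that array looks like, reusing the variable names from the code above. Note that the captured group still includes the spaces around the words, and that match returns null when nothing is found (the goat pattern is an invented non-matching example):

```javascript
var test = "My cow always gives milk";

// A string argument is converted to a RegExp by String#match.
var testRE = test.match("cow(.*)milk");

console.log(Array.isArray(testRE));     // true
console.log(testRE[0]);                 // "cow always gives milk" (whole match)
console.log(JSON.stringify(testRE[1])); // " always gives " (group 1, spaces included)
console.log(testRE.index);              // 3 (where the match starts)

// No match gives null, so check before indexing.
var noMatch = test.match("goat(.*)milk");
console.log(noMatch === null ? "no match" : noMatch[1].trim()); // "no match"
```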
    
  • 2020-11-21 07:14

    Just use the following regular expression:

    (?<=My cow\s).*?(?=\smilk)
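    Lookbehind is not available in every JavaScript engine, so the pattern above may fail to parse in older environments. A hedged sketch of one way to cope, using a hypothetical function betweenCowAndMilk that tries the lookbehind pattern and falls back to a capturing group:

```javascript
// Hypothetical helper, not part of the original answer.
function betweenCowAndMilk(text) {
  let lookbehindPattern = null;
  try {
    // Built from a string so that an engine without lookbehind support
    // throws here at run time instead of rejecting the whole script.
    lookbehindPattern = new RegExp('(?<=My cow\\s).*?(?=\\smilk)');
  } catch (e) {
    lookbehindPattern = null;
  }

  if (lookbehindPattern) {
    const direct = text.match(lookbehindPattern);
    return direct ? direct[0] : null;
  }

  // Fallback: consume the delimiters and read the middle from group 1.
  const grouped = text.match(/My cow\s(.*?)\smilk/);
  return grouped ? grouped[1] : null;
}

console.log(betweenCowAndMilk('My cow always gives milk')); // "always gives"
console.log(betweenCowAndMilk('My goat always gives milk')); // null
```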
    