RegEx: Grabbing values between quotation marks

暖寄归人 2020-11-22 02:13

I have a value like this:

"Foo Bar" "Another Value" something else

What regex will return the values enclosed in the quotation marks?

20 answers
  • 2020-11-22 02:31

    Unlike Adam's answer, I have a simple one that works:

    (["'])(?:\\\1|.)*?\1
    

    Just add parentheses if you want to capture the content inside the quotes, like this:

    (["'])((?:\\\1|.)*?)\1
    

    Then $1 holds the quote character and $2 holds the content string.
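
As a rough sketch of how the capturing form might be used, here it is in Python on the sample input from the question. The choice of Python and the helper name `quoted_values` are assumptions made for illustration; the pattern itself is unchanged.

```python
import re

# Capturing form from above: group 1 = quote character, group 2 = content.
QUOTED = re.compile(r"""(["'])((?:\\\1|.)*?)\1""")


def quoted_values(text):
    """Return the content of every quoted string in text (hypothetical helper)."""
    return [m.group(2) for m in QUOTED.finditer(text)]


sample = '"Foo Bar" "Another Value" something else'
print(quoted_values(sample))  # ['Foo Bar', 'Another Value']
```

Note that `.` does not match a newline by default in Python, so a quoted string spanning several lines would need `re.DOTALL`.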

  • 2020-11-22 02:32

    I liked Axeman's more expansive version, but had some trouble with it: for example, it didn't match

    foo "string \\ string" bar
    

    or

    foo "string1"   bar   "string2"
    

    correctly, so I tried to fix it:

    # opening quote
    (["'])
       (
         # repeat (non-greedy, so we don't span multiple strings)
         (?:
           # any character except the opening quote and a
           # backslash, both of which are handled separately.
           (?!\1)[^\\]
           |
           # consume any double backslash (unnecessary?)
           (?:\\\\)*       
           |
           # Allow backslash to escape characters
           \\.
         )*?
       )
    # same character as opening quote
    \1
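
A commented pattern like this needs the engine's free-spacing mode to run. The following is a minimal sketch in Python using `re.VERBOSE`, checked against the two inputs mentioned above. Python and the helper name `quoted_contents` are assumptions, and there is one deliberate deviation from the pattern as posted: the double-backslash branch is written without its trailing `*`, so that no branch inside the repeated group can match an empty string.

```python
import re

# Free-spacing version of the pattern above. Deviation (an assumption of this
# sketch, not the answer's text): the double-backslash branch has no trailing *.
QUOTED = re.compile(r"""
    (["'])              # opening quote
    (
      (?:
        (?!\1)[^\\]     # any character except the opening quote or a backslash
        |
        \\\\            # an escaped backslash
        |
        \\.             # a backslash escaping any other character
      )*?               # non-greedy, so one match cannot span two strings
    )
    \1                  # same character as the opening quote
""", re.VERBOSE)


def quoted_contents(text):
    """Return the content of each quoted string (hypothetical helper name)."""
    return [m.group(2) for m in QUOTED.finditer(text)]


print(quoted_contents(r'foo "string \\ string" bar'))
print(quoted_contents('foo "string1"   bar   "string2"'))
```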
    
  • 2020-11-22 02:32

    All of the answers above are good, except that they do NOT support all Unicode characters in ECMAScript (JavaScript)!

    If you are a Node user, you might want this modified version of the accepted answer, which supports all Unicode characters:

    /(?<=((?<=[\s,.:;"']|^)["']))(?:(?=(\\?))\2.)*?(?=\1)/gmu
    


  • 2020-11-22 02:34

    I liked Eugen Mihailescu's solution for matching the content between quotes while allowing escaped quotes. However, I discovered some problems with escaping and came up with the following regex to fix them:

    (['"])(?:(?!\1|\\).|\\.)*\1
    

    It does the trick and is still pretty simple and easy to maintain.

    Demo (with some more test-cases; feel free to use it and expand on it).


    PS: If you just want the content between quotes in the full match ($0), and are not afraid of the performance penalty, use:

    (?<=(['"])\b)(?:(?!\1|\\).|\\.)*(?=\1)
    

    Unfortunately, without the quotes as anchors, I had to add a boundary \b, which does not play well with spaces and other non-word characters after the starting quote.

    Alternatively, modify the initial version by simply adding a group and extract the string from $2:

    (['"])((?:(?!\1|\\).|\\.)*)\1
    

    PPS: If your focus is solely on efficiency, go with Casimir et Hippolyte's solution; it's a good one.
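
To illustrate the grouped variant, here is a small sketch in Python that pulls the content out of group 2. Python and the helper name `extract_quoted` are assumptions for the example; the pattern is the grouped one given above.

```python
import re

# Group 1 = quote character, group 2 = content.
# Inside the content, a backslash escapes whatever character follows it.
QUOTED = re.compile(r"""(['"])((?:(?!\1|\\).|\\.)*)\1""")


def extract_quoted(text):
    """Return the content of each quoted string (hypothetical helper name)."""
    return [m.group(2) for m in QUOTED.finditer(text)]


print(extract_quoted('"Foo Bar" "Another Value" something else'))
print(extract_quoted(r'say "he said \"hi\"" and more'))
print(extract_quoted("'it is \"fine\"'"))
```

Because the lookahead only excludes the quote character that opened the string, the other kind of quote can appear unescaped inside it, as in the last call.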

  • 2020-11-22 02:35

    The pattern (["'])(?:(?=(\\?))\2.)*?\1 above does the job, but I am concerned about its performance (it's not bad, but it could be better). Mine below is ~20% faster.

    The pattern "(.*?)" is just incomplete. My advice for everyone reading this is just DON'T USE IT!!!

    For instance, it cannot capture many strings, like the one below (if needed, I can provide an exhaustive test case):

    $string = 'How are you? I\'m fine, thank you';

    The rest of them are just as "good" as the one above.

    If you really care about both performance and precision, then start with the one below:

    /(['"])((\\\1|.)*?)\1/gm

    In my tests it covered every string I encountered, but if you find something that doesn't work, I will gladly update it for you.

    Check my pattern in an online regex tester.
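
The difference can be seen on the sample string from this answer. The sketch below is in Python (an assumption, as are all variable and function names). It also tries a version of the naive pattern generalized to either quote character, which is not in the answer and is added only so the naive approach gets a fair chance on a single-quoted string.

```python
import re

# The naive pattern as written: double quotes only.
NAIVE_DOUBLE = re.compile(r'"(.*?)"')
# Naive pattern generalized to either quote (an assumption of this sketch).
NAIVE_ANY = re.compile(r"""(["'])(.*?)\1""")
# Escape-aware pattern from this answer; the g/m flags are dropped because
# search() is enough for a single-line, single-string example.
ESCAPE_AWARE = re.compile(r"""(['"])((\\\1|.)*?)\1""")

source = r"'How are you? I\'m fine, thank you'"


def content(pattern, text, group):
    """Return the requested group of the first match, or None (hypothetical helper)."""
    m = pattern.search(text)
    return m.group(group) if m else None


print(content(NAIVE_DOUBLE, source, 1))   # None: no double quotes in the input
print(content(NAIVE_ANY, source, 2))      # cut off at the escaped quote
print(content(ESCAPE_AWARE, source, 2))   # the full content
```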

  • 2020-11-22 02:39

    This version

    • accounts for escaped quotes
    • controls backtracking

      /(["'])((?:(?!\1)[^\\]|(?:\\\\)*\\[^\\])*)\1/
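
A quick way to check both behaviours is the sketch below, written in Python (an assumption, like the helper name `first_quoted`). It shows the escaped-quote case working. By my reading of the pattern, it does not match when an escaped backslash is followed by an ordinary character, which is the `foo "string \\ string" bar` case raised in another answer in this thread.

```python
import re

# Same pattern as above, without the surrounding slashes.
QUOTED = re.compile(r"""(["'])((?:(?!\1)[^\\]|(?:\\\\)*\\[^\\])*)\1""")


def first_quoted(text):
    """Return the content of the first quoted string, or None (hypothetical helper)."""
    m = QUOTED.search(text)
    return m.group(2) if m else None


# Escaped quotes inside the string are kept as part of the content.
print(first_quoted(r'say "a \"quoted\" word" now'))

# An escaped backslash followed by a space: neither branch can consume it.
print(first_quoted(r'foo "string \\ string" bar'))  # None
```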
      