Python Regex to find a string in double quotes within a string

自闭症患者 2020-11-29 06:06

I need Python code that uses a regex to do something like this:

Input: Regex should return "String 1" or "String 2" or "String3"
Output: String 1,String 2,String3


        
4 Answers
  • 2020-11-29 06:23

    The highly up-voted answer doesn't account for the possibility that the double-quoted string might itself contain one or more double-quote characters (properly escaped with a backslash, of course). To handle this, the regex accumulates characters one by one, guarded by a negative lookahead assertion stating that the next character is not an unescaped double quote, i.e. a double quote that is not preceded by a backslash (which in turn requires a negative lookbehind assertion):

    "(?:(?:(?!(?<!\\)").)*)"
    


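    import re
    import ast
    
    
    def doit(text):
        # Each match includes the surrounding double quotes.
        matches = re.findall(r'"(?:(?:(?!(?<!\\)").)*)"', text)
        for match in matches:
            # literal_eval parses the match as a Python string literal, resolving the escapes.
            print(match, '=>', ast.literal_eval(match))
    
    
    doit('Regex should return "String 1" or "String 2" or "String3" and "\\"double quoted string\\"" ')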
    

    Prints:

    "String 1" => String 1
    "String 2" => String 2
    "String3" => String3
    "\"double quoted string\"" => "double quoted string"
    
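The lookahead/lookbehind pattern above can be hard to read on one line. As a minimal sketch, here is the same regex spelled out with `re.VERBOSE` so each piece carries a comment. The names `QUOTED`, `sample`, `matches`, and `values` are only illustrative and do not come from the answer.

```python
import ast
import re

# Same pattern as in the answer, split across lines with re.VERBOSE.
QUOTED = re.compile(r'''
    "                # opening double quote
    (?:
        (?!          # negative lookahead: stop before ...
            (?<!\\)  #   ... a position not preceded by a backslash ...
            "        #   ... that holds a double quote (an unescaped quote)
        )
        .            # otherwise consume one character
    )*
    "                # closing double quote
''', re.VERBOSE)

# Illustrative input: one plain quoted string and one with escaped quotes.
sample = r'say "String 1" and "\"quoted\"" now'

matches = QUOTED.findall(sample)                   # matches keep their outer quotes
values = [ast.literal_eval(m) for m in matches]    # resolve the backslash escapes

print(matches)
print(values)
```

Note that this approach only looks one character back, so a string that ends in an escaped backslash (a backslash pair right before the closing quote) is probably not handled as intended.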
  • 2020-11-29 06:25

    To fetch every double-quoted string from a multiline string:

    import re
    
    # Avoid naming the variable `str`, which would shadow the built-in type.
    text = """ 
    "my name is daniel"  "mobile 8531111453733"[[[[[[--"i like pandas"
    "location chennai"! -asfas"aadhaar du2mmy8969769##69869" 
    @4343453 "pincode 642002""@mango,@apple,@berry" 
    """
    # Lazily capture everything between each pair of double quotes.
    print(re.findall(r'"(.*?)"', text))
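One caveat with this pattern: by default `.` does not match a newline, so it only finds quoted strings that open and close on the same line. The sketch below, using a made-up input, shows the difference when `re.DOTALL` is passed. The variable names are illustrative only.

```python
import re

# Hypothetical input: the second quoted string spans two lines.
text = 'first "one line" then "two\nlines" end'

# Default behaviour: "." stops at the newline, so the second string is skipped.
single_line = re.findall(r'"(.*?)"', text)

# With re.DOTALL, "." also matches newlines, so both strings are found.
dotall = re.findall(r'"(.*?)"', text, flags=re.DOTALL)

print(single_line)
print(dotall)
```

Whether `re.DOTALL` is desirable depends on the data: with it, a stray unmatched quote can pair up with a quote many lines later.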
    
  • 2020-11-29 06:31
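    import re
    
    # Matches a whole single- or double-quoted string; quotes inside may be backslash-escaped.
    # (?<!\\) is a negative lookbehind: the closing quote must not directly follow a backslash.
    pattern = r"'(\\'|[^'])*(?<!\\)'|\"(\\\"|[^\"])*(?<!\\)\""
    
    texts = [r'"aerrrt"',
             r'"a\"e' + "'" + 'rrt"',
             r'"a""""arrtt"""""',
             r'"aerrrt',
             r'"a\"errt' + "'",
             r"'aerrrt'",
             r"'a\'e" + '"' + "rrt'",
             r"'a''''arrtt'''''",
             r"'aerrrt",
             r"'a\'errt" + '"',
             "''", '""', ""]
    
    # fullmatch succeeds only if the entire text is one quoted string.
    for text in texts:
        print(text, "-->", re.fullmatch(pattern, text))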
    

    Results:

    "aerrrt" --> <_sre.SRE_Match object; span=(0, 8), match='"aerrrt"'>
    "a\"e'rrt" --> <_sre.SRE_Match object; span=(0, 10), match='"a\\"e\'rrt"'>
    "a""""arrtt""""" --> None
    "aerrrt --> None
    "a\"errt' --> None
    'aerrrt' --> <_sre.SRE_Match object; span=(0, 8), match="'aerrrt'">
    'a\'e"rrt' --> <_sre.SRE_Match object; span=(0, 10), match='\'a\\\'e"rrt\''>
    'a''''arrtt''''' --> None
    'aerrrt --> None
    'a\'errt" --> None
    '' --> <_sre.SRE_Match object; span=(0, 2), match="''">
    "" --> <_sre.SRE_Match object; span=(0, 2), match='""'>
     --> None
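The checks above use `re.fullmatch`, which validates that a whole string is one quoted literal. As a sketch of a variant (not the answer's own pattern), the version below excludes the backslash from the character class and treats a backslash plus any character as one unit, so no lookbehind is needed. The names `SINGLE`, `DOUBLE`, `QUOTED`, `is_quoted`, and `find_quoted` are hypothetical helpers introduced for illustration.

```python
import re

# Variant pattern: either an escape sequence (backslash + any character)
# or any character that is neither the quote nor a backslash.
SINGLE = r"'(?:\\.|[^'\\])*'"
DOUBLE = r'"(?:\\.|[^"\\])*"'
QUOTED = re.compile(SINGLE + "|" + DOUBLE)


def is_quoted(text):
    """Return True if the whole text is one single- or double-quoted string."""
    return QUOTED.fullmatch(text) is not None


def find_quoted(text):
    """Return every quoted string found inside a longer text, quotes included."""
    return [m.group(0) for m in QUOTED.finditer(text)]


print(is_quoted('"aerrrt"'))                 # complete double-quoted string
print(is_quoted('"aerrrt'))                  # missing closing quote
print(find_quoted('x "a" y \'b\' z'))        # searching inside a longer text
```

Using non-capturing groups also keeps `finditer` and `findall` results simple, since capturing groups would change what `findall` returns.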
    
  • 2020-11-29 06:34

    Here's all you need to do:

    import re
    
    def doit(text):
        # Lazily capture the text between each pair of double quotes.
        matches = re.findall(r'"(.+?)"', text)
        # matches is now ['String 1', 'String 2', 'String3']
        return ",".join(matches)
    
    doit('Regex should return "String 1" or "String 2" or "String3" ')
    # result:
    'String 1,String 2,String3'
    

    As pointed out by Li-aung Yip (I nearly quote):

    .+? is the "non-greedy" version of .+. It makes the regular expression match the smallest number of characters it can instead of the most characters it can. The greedy version, .+, gives String 1" or "String 2" or "String3 as a single match; the non-greedy version, .+?, gives String 1, String 2 and String3 as three separate matches.

    In addition (Johan speaking again), if you want to accept empty strings, change .+? to .*?. The star means zero or more; the plus means at least one.
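To make the greedy versus non-greedy difference concrete, here is a small sketch that runs all three quantifiers on the question's input, with an empty quoted string appended at the end as an assumption for illustration. The variable names are not from the answer.

```python
import re

# The question's input, plus a trailing empty quoted string for illustration.
text = 'Regex should return "String 1" or "String 2" or "String3" and ""'

# Greedy: runs from the first quote to the last quote in the text.
greedy = re.findall(r'"(.+)"', text)

# Non-greedy, at least one character: the empty "" is not reported.
lazy_plus = re.findall(r'"(.+?)"', text)

# Non-greedy, zero or more characters: the empty "" is reported as ''.
lazy_star = re.findall(r'"(.*?)"', text)

print(greedy)
print(lazy_plus)
print(lazy_star)
```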
