Regex in Python: is it possible to get the match, replacement, and final string?

再見小時候 2020-12-29 03:39

When doing a regex substitution, you supply three things:

  • The match pattern
  • The replacement pattern
  • The original string
2 answers
  • 2020-12-29 04:03
    class Replacement(object):
        """Callable for re.sub that remembers the last match and its replacement."""
    
        def __init__(self, replacement):
            self.replacement = replacement
            self.matched = None
            self.replaced = None
    
        def __call__(self, match):
            # re.sub calls this once per match; expand() fills in \1 and friends.
            self.matched = match.group(0)
            self.replaced = match.expand(self.replacement)
            return self.replaced
    
    >>> import re
    >>> repl = Replacement(r'not the \1')
    >>> re.sub(r'(orig.*?l)', repl, 'This is the original string.')
        'This is the not the original string.'
    >>> repl.matched
        'original'
    >>> repl.replaced
        'not the original'
    

    Edit: as @F.J has pointed out, the above will remember only the last match/replacement. This version handles multiple occurrences:

    class Replacement(object):
        """Callable for re.sub that records every (matched, replaced) pair."""
    
        def __init__(self, replacement):
            self.replacement = replacement
            self.occurrences = []
    
        def __call__(self, match):
            matched = match.group(0)
            replaced = match.expand(self.replacement)
            self.occurrences.append((matched, replaced))
            return replaced
    
    >>> repl = Replacement(r'[\1]')
    >>> re.sub(r'\s(\d)', repl, '1 2 3')
        '1[2][3]'
    
    >>> # Each match includes the leading whitespace consumed by \s.
    >>> for matched, replaced in repl.occurrences:
    ...     print(matched, '=>', replaced)
    ...
     2 => [2]
     3 => [3]
    
  • 2020-12-29 04:03

    I looked at the documentation and it seems you can pass a function to re.sub as the replacement argument:

    import re
    
    def re_sub_verbose(pattern, replace, string):
        def substitute(match):
            # Called by re.sub once per match; expand() applies the template.
            replacement = match.expand(replace)
            print('Matched:', match.group(0))
            print('Replacing with:', replacement)
            return replacement
    
        result = re.sub(pattern, substitute, string)
        print('Final string:', result)
    
        return result
    

    And I get this output when running re_sub_verbose("(orig.*?l)", "not the \\1", "This is the original string."):

    Matched: original
    Replacing with: not the original
    Final string: This is the not the original string.
    
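For reference, the two answers can be combined into one helper that returns the final string together with every (matched, replaced) pair instead of printing them. This is only a sketch: the name `sub_with_details` and its return shape are assumptions made for illustration and are not part of the `re` module.

```python
import re

def sub_with_details(pattern, replacement, string):
    """Hypothetical helper: return (final_string, [(matched, replaced), ...])."""
    occurrences = []

    def record(match):
        # re.sub calls this once per match; expand() applies the
        # replacement template (e.g. \1) to this particular match.
        replaced = match.expand(replacement)
        occurrences.append((match.group(0), replaced))
        return replaced

    final = re.sub(pattern, record, string)
    return final, occurrences

final, pairs = sub_with_details(r'(orig.*?l)', r'not the \1',
                                'This is the original string.')
print(final)   # This is the not the original string.
print(pairs)   # [('original', 'not the original')]
```

A closure keeps the bookkeeping local to a single call, so nothing has to be reset between substitutions the way a reused Replacement instance would.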