How do I get the match data for all occurrences of a Ruby regular expression in a string?

难免孤独 2020-11-27 15:16

I need the MatchData for each occurrence of a regular expression in a string. This is different from the scan method suggested in Match All Occurrences of a Reg

5 answers
  • 2020-11-27 15:26

    I’ll put it here to make the code available via a search:

    input = "abc12def34ghijklmno567pqrs"
    numbers = /\d+/
    input.gsub(numbers) { |m| p $~ }
    

    The result is as requested:

    ⇒ #<MatchData "12">
    ⇒ #<MatchData "34">
    ⇒ #<MatchData "567">
    

    See "Matching data in Ruby for all occurrences in a string" for more information.

  • 2020-11-27 15:27

    You want

    "abc12def34ghijklmno567pqrs".to_enum(:scan, /\d+/).map { Regexp.last_match }
    

    which gives you

    [#<MatchData "12">, #<MatchData "34">, #<MatchData "567">] 
    

    The "trick" is, as you see, to build an enumerator in order to get each last_match.
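
    Because every element is a full MatchData, positions and named captures come along for free. A minimal sketch, assuming a named group `num` that I added purely for illustration (it is not part of the original pattern):

    ```ruby
    input = "abc12def34ghijklmno567pqrs"

    # The named group `num` is an illustrative assumption, not from the question.
    matches = input.to_enum(:scan, /(?<num>\d+)/).map { Regexp.last_match }

    # Each element is a MatchData, so captures and offsets can be read directly.
    summary = matches.map { |m| [m[:num], m.begin(0), m.end(0)] }
    p summary
    ```

    Here `begin(0)` and `end(0)` are the start and the exclusive end of the whole match.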

  • 2020-11-27 15:32

    I'm surprised nobody mentioned the amazing StringScanner class included in Ruby's standard library:

    require 'strscan'
    
    s = StringScanner.new('abc12def34ghijklmno567pqrs')
    
    while s.skip_until(/\d+/)
      num, offset = s.matched.to_i, [s.pos - s.matched_size, s.pos - 1]
    
      # ..
    end
    

    No, it doesn't give you the MatchData objects, but it does give you an index-based interface into the string.
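
    To make the index-based interface concrete, here is a rough sketch that collects each matched string together with its range. It assumes ASCII input, since `pos` and `matched_size` count bytes; the variable names are mine:

    ```ruby
    require 'strscan'

    scanner = StringScanner.new('abc12def34ghijklmno567pqrs')
    found = []

    # scan_until advances past the next match and returns nil once none is left.
    while scanner.scan_until(/\d+/)
      finish = scanner.pos                    # byte position just after the match
      start  = finish - scanner.matched_size  # byte position of its first character
      found << [scanner.matched, start...finish]
    end

    p found
    ```

    For non-ASCII strings the byte offsets would need converting before they are used as character indices.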

  • 2020-11-27 15:38
    input = "abc12def34ghijklmno567pqrs"
    n = Regexp.new("\\d+")
    # Resume each search exactly where the previous match ended.
    [n.match(input)].tap { |a| a << n.match(input, a.last.end(0)) until a.last.nil? }[0..-2]
    
    => [#<MatchData "12">, #<MatchData "34">, #<MatchData "567">]
    
  • 2020-11-27 15:43

    My current solution is to add an each_match method to Regexp:

    class Regexp
      def each_match(str)
        start = 0
        while matchdata = self.match(str, start)
          yield matchdata
          start = matchdata.end(0)
        end
      end
    end
    

    Now I can do:

    numbers.each_match input do |match|
      puts "Found #{match[0]} at #{match.begin(0)} until #{match.end(0)}"
    end
    

    Tell me there is a better way.
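
    One possible refinement, offered only as a sketch: keep the same loop but put it in a standalone helper instead of reopening Regexp, return an Enumerator when no block is given, and step past empty matches so a pattern like /x*/ cannot loop forever. The module name `MatchIter` is hypothetical:

    ```ruby
    # Hypothetical helper module; avoids monkey-patching Regexp.
    module MatchIter
      def self.each_match(regexp, str)
        return to_enum(:each_match, regexp, str) unless block_given?

        start = 0
        while start <= str.length && (m = regexp.match(str, start))
          yield m
          # Step past an empty match so the search position always advances.
          start = m.end(0) == m.begin(0) ? m.end(0) + 1 : m.end(0)
        end
      end
    end

    numbers = /\d+/
    input = "abc12def34ghijklmno567pqrs"

    MatchIter.each_match(numbers, input) do |match|
      puts "Found #{match[0]} at #{match.begin(0)} until #{match.end(0)}"
    end
    ```

    Without a block, `MatchIter.each_match(numbers, input).to_a` gives the MatchData objects as an array.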
