Find most common string in an array

说谎 2020-12-24 07:42

I have this array, for example (the size is variable):

   x = ["1.111", "1.122", "1.250", "1.111"]

and I need to find the most comm

6 Answers
  • 2020-12-24 08:12

    Make one pass through the array to accumulate the counts in a hash, then use .max() to find the hash entry with the largest value.

    #!/usr/bin/ruby
    
    a = Hash.new(0)
    ["1.111", "1.122", "1.250", "1.111"].each { |num|
      a[num] += 1
    }
    
    a.max{ |a,b| a[1] <=> b[1] } # => ["1.111", 2]
    

    or, roll it all into one line:

    ary.inject(Hash.new(0)){ |h,i| h[i] += 1; h }.max{ |a,b| a[1] <=> b[1] } # => ["1.111", 2]
    

    If you only want the item back, add .first:

    ary.inject(Hash.new(0)){ |h,i| h[i] += 1; h }.max{ |a,b| a[1] <=> b[1] }.first # => "1.111"
    

    The first sample is how it would usually be done in Perl; the second is more Ruby-ish. Both work with older versions of Ruby. I wanted to compare them, and to see how much Wayne's solution would speed things up, so I tested all three with Benchmark:

    #!/usr/bin/env ruby
    
    require 'benchmark'
    
    ary = ["1.111", "1.122", "1.250", "1.111"] * 1000 
    
    def most_common_value(a)
      a.group_by { |e| e }.values.max_by { |values| values.size }.first
    end
    
    n = 1000
    Benchmark.bm(20) do |x|
      x.report("Hash.new(0)") do
        n.times do 
          a = Hash.new(0)
          ary.each { |num| a[num] += 1 }
          a.max{ |a,b| a[1] <=> b[1] }.first
        end 
      end
    
      x.report("inject:") do
        n.times do
          ary.inject(Hash.new(0)){ |h,i| h[i] += 1; h }.max{ |a,b| a[1] <=> b[1] }.first
        end
      end
    
      x.report("most_common_value():") do
        n.times do
          most_common_value(ary)
        end
      end
    end
    

    Here are the results:

                              user     system      total        real
    Hash.new(0)           2.150000   0.000000   2.150000 (  2.164180)
    inject:               2.440000   0.010000   2.450000 (  2.451466)
    most_common_value():  1.080000   0.000000   1.080000 (  1.089784)
    
  • 2020-12-24 08:20

    Using the default value feature of hashes:

    >> x = ["1.111", "1.122", "1.250", "1.111"]
    >> h = Hash.new(0)
    >> x.each{|i| h[i] += 1 }
    >> h.max{|a,b| a[1] <=> b[1] }
    ["1.111", 2]
    
  • 2020-12-24 08:24

    Ruby < 2.2

    #!/usr/bin/ruby1.8
    
    def most_common_value(a)
      a.group_by do |e|
        e
      end.values.max_by(&:size).first
    end
    
    x = ["1.111", "1.122", "1.250", "1.111"]
    p most_common_value(x)    # => "1.111"
    

    Note: Enumerable#max_by is new in Ruby 1.9, but it has been backported to 1.8.7.

    Ruby >= 2.2

    Ruby 2.2 introduces the Object#itself method, with which we can make the code more concise:

    def most_common_value(a)
      a.group_by(&:itself).values.max_by(&:size).first
    end
    
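    On Ruby 2.7 and later, Enumerable#tally does the counting directly, so the grouping step can be skipped. The following is a minimal sketch rather than part of the original answer; the method name most_common_value_tally is made up for illustration. Unlike the versions above, it returns nil for an empty array instead of raising.

```ruby
# Sketch: count occurrences with Enumerable#tally (Ruby 2.7+),
# then pick the entry with the highest count.
# The method name is hypothetical, chosen for this example.
def most_common_value_tally(a)
  counts = a.tally   # maps each item to its count, e.g. "1.111" => 2
  value, _count = counts.max_by { |_value, count| count }
  value              # nil when the array is empty
end

x = ["1.111", "1.122", "1.250", "1.111"]
p most_common_value_tally(x)    # => "1.111"
```

    If several values share the highest count, which one is returned depends on max_by, so treat ties as unspecified.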

    As a monkey patch

    Or as Enumerable#mode:

    Enumerable.class_eval do
      def mode
        group_by do |e|
          e
        end.values.max_by(&:size).first
      end
    end
    
    ["1.111", "1.122", "1.250", "1.111"].mode
    # => "1.111"
    
  • 2020-12-24 08:24

    You could sort the array and then loop over it once. In the loop, keep track of the current item and the number of times it has been seen. Whenever the item changes or the list ends, set max_count = count if count > max_count, and remember which item produced max_count.
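
    The sort-then-scan idea could be sketched as follows. This is one possible reading of the description, not a definitive implementation; the method name most_common_by_sorting and the [item, count] return shape are assumptions made for the example. For simplicity it compares against max_count after every item rather than only when a run ends, which gives the same result.

```ruby
# Sketch of the sort-then-scan approach.
# The method name and return shape are hypothetical.
def most_common_by_sorting(a)
  best_item = nil   # item with the longest run seen so far
  max_count = 0     # length of that run
  current = nil     # item of the run being scanned
  count = 0         # length of the current run

  a.sort.each do |item|
    if item == current
      count += 1          # same item: extend the current run
    else
      current = item      # item changed: start a new run
      count = 1
    end
    if count > max_count  # current run is now the longest
      max_count = count
      best_item = current
    end
  end

  [best_item, max_count]  # [nil, 0] for an empty array
end

x = ["1.111", "1.122", "1.250", "1.111"]
p most_common_by_sorting(x)    # => ["1.111", 2]
```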

  • 2020-12-24 08:35

    This returns the most popular value in the array:

    x.group_by { |a| a }.sort_by { |_, group| -group.size }.first[0]
    

    For example:

    x = ["1.111", "1.122", "1.250", "1.111"]
    # Most popular; sort_by receives [value, group] pairs, so sort on the negated group size
    x.group_by { |a| a }.sort_by { |_, group| -group.size }.first[0]
    #=> "1.111"
    # How many times
    x.group_by { |a| a }.sort_by { |_, group| -group.size }.first[1].size
    #=> 2
    
  • 2020-12-24 08:37

    You could create a hashmap that stores the array items as keys with their values being the number of times that element appears in the array.

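    In Ruby:

    counts = {}
    ["1.111", "1.122", "1.250", "1.111"].each { |num|
      count = counts[num]
      if count.nil?
        counts[num] = 1          # first time this item is seen
      else
        counts[num] = count + 1  # seen before, so bump the count
      end
    }
    # counts now maps each item to its count: "1.111" => 2, "1.122" => 1, "1.250" => 1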
    

    As already mentioned, sorting might be faster.
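
    Whether sorting actually beats hash counting depends on the data and the Ruby version, so it is worth measuring. Below is a small sketch using the standard Benchmark module. The method names mode_by_hash and mode_by_sort are hypothetical, both assume a non-empty array, and the sort variant relies on Enumerable#chunk_while (Ruby 2.3+). No timings are quoted here because they vary by machine.

```ruby
require 'benchmark'

# Hash-based counting: one pass over the array, then pick the largest count.
def mode_by_hash(a)
  counts = Hash.new(0)
  a.each { |item| counts[item] += 1 }
  counts.max_by { |_item, count| count }.first
end

# Sort-based counting: sort, split into runs of equal items,
# and take the first element of the longest run.
def mode_by_sort(a)
  a.sort.chunk_while { |left, right| left == right }.max_by(&:size).first
end

ary = ["1.111", "1.122", "1.250", "1.111"] * 1000
n = 100

Benchmark.bm(10) do |bm|
  bm.report("hash:") { n.times { mode_by_hash(ary) } }
  bm.report("sort:") { n.times { mode_by_sort(ary) } }
end
```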
