What is the fastest way to sort a Hash?

慢半拍i 2021-01-21 07:07

People often ask what is the best way to sort a hash, but then they don't ask the needed follow-up question about what is the fastest way, which really determ

3 Answers
  • 2021-01-21 07:49

    This compares sort and sort_by when the hash keys are more complex objects than strings or integers:

    require 'fruity'
    
    RUBY_VERSION # => "2.2.2"
    
    class Foo
      attr_reader :key
      def initialize(k)
        @key = k
      end
    
      def <=>(b)
        self.key <=> b.key
      end
    end
    
    HASH = Hash[[*(1..100)].shuffle.map{ |k| [Foo.new(k), 1] }]
    compare do
      _sort1 { HASH.sort.to_h }
      _sort_by { HASH.sort_by{ |k,v| k.key }.to_h }
    end
    # >> Running each test 32 times. Test will take about 1 second.
    # >> _sort_by is faster than _sort1 by 2.7x ± 0.1
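
    One likely reason for the gap above is that sort calls the user-defined <=> for every comparison, while sort_by extracts each key once and then compares the extracted integers. The following is a minimal sketch of that idea, not part of the original benchmark; CountingFoo, its counters, and raw_key are hypothetical names introduced only to count calls.

    ```ruby
    # Hypothetical variant of Foo that counts how often its methods are called.
    class CountingFoo
      @spaceship_calls = 0
      @key_calls = 0

      class << self
        attr_accessor :spaceship_calls, :key_calls
      end

      def initialize(k)
        @key = k
      end

      # Public reader, counted so we can see how often sort_by asks for the key.
      def key
        CountingFoo.key_calls += 1
        @key
      end

      # Counted comparison; uses raw_key so it does not inflate key_calls.
      def <=>(other)
        CountingFoo.spaceship_calls += 1
        @key <=> other.raw_key
      end

      protected

      def raw_key
        @key
      end
    end

    hash = (1..100).to_a.shuffle.map { |k| [CountingFoo.new(k), 1] }.to_h

    # Hash#sort compares [key, value] pairs, which ends up in CountingFoo#<=>.
    hash.sort.to_h
    puts "sort:    #{CountingFoo.spaceship_calls} calls to CountingFoo#<=>"

    # sort_by reads each key once, then compares plain integers.
    CountingFoo.spaceship_calls = 0
    hash.sort_by { |k, _v| k.key }.to_h
    puts "sort_by: #{CountingFoo.key_calls} calls to #key, " \
         "#{CountingFoo.spaceship_calls} calls to CountingFoo#<=>"
    ```

    The exact number of comparisons made by sort depends on the shuffle, so only the sort_by side is predictable: one key read per element and no calls to the custom <=>.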
    
  • 2021-01-21 08:01

    What is the fastest way to sort a Hash?

    require 'fruity'
    
    HASH = Hash[('a'..'z').to_a.shuffle.map{ |k| [k, 1] }]
    
    def sort_hash1(h)
      h.sort.to_h
    end
    
    def sort_hash2(h)
      Hash[h.sort]
    end
    
    def sort_hash3(h)
      Hash[h.sort_by{ |k, v| k }]
    end
    
    def sort_keys(h)
      keys = h.keys.sort
      Hash[keys.zip(h.values_at(*keys))]
    end
    
    puts "Running on Ruby v#{ RUBY_VERSION }"
    puts
    
    compare do
      do_sort_hash1 { sort_hash1(HASH) } if [].respond_to?(:to_h)
      do_sort_hash2 { sort_hash2(HASH) }
      do_sort_hash3 { sort_hash3(HASH) }
      do_sort_keys { sort_keys(HASH) }
    end
    

    Running the above code on a Mac OS laptop results in the following output:

    # >> Running on Ruby v2.2.2
    # >> 
    # >> Running each test 256 times. Test will take about 1 second.
    # >> do_sort_keys is faster than do_sort_hash3 by 39.99999999999999% ± 10.0%
    # >> do_sort_hash3 is faster than do_sort_hash1 by 1.9x ± 0.1
    # >> do_sort_hash1 is similar to do_sort_hash2
    

    And:

    # >> Running on Ruby v1.9.3
    # >> 
    # >> Running each test 256 times. Test will take about 1 second.
    # >> do_sort_keys is faster than do_sort_hash3 by 19.999999999999996% ± 10.0%
    # >> do_sort_hash3 is faster than do_sort_hash2 by 4x ± 0.1
    

    Doubling the hash size:

    HASH = Hash[[*('a'..'z'), *('A'..'Z')].shuffle.map{ |k| [k, 1] }]
    

    Results in:

    # >> Running on Ruby v2.2.2
    # >> 
    # >> Running each test 128 times. Test will take about 1 second.
    # >> do_sort_keys is faster than do_sort_hash3 by 50.0% ± 10.0%
    # >> do_sort_hash3 is faster than do_sort_hash1 by 2.2x ± 0.1
    # >> do_sort_hash1 is similar to do_sort_hash2
    

    And:

    # >> Running on Ruby v1.9.3
    # >> 
    # >> Running each test 128 times. Test will take about 1 second.
    # >> do_sort_keys is faster than do_sort_hash3 by 30.000000000000004% ± 10.0%
    # >> do_sort_hash3 is faster than do_sort_hash2 by 4x ± 0.1
    

    The absolute timings will vary with the hardware, but the relative ordering of the results shouldn't change.

    Fruity was chosen over using the built-in Benchmark class for simplicity.
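
    For readers who would rather avoid the extra gem, a roughly equivalent comparison can be sketched with the built-in Benchmark module. This is only an illustration under assumptions: SAMPLE, ITERATIONS, and the by_* method names are hypothetical, the iteration count is arbitrary, and Array#to_h requires Ruby 2.1 or later.

    ```ruby
    require 'benchmark'

    # Hypothetical names and values, chosen for this sketch only.
    SAMPLE = ('a'..'z').to_a.shuffle.map { |k| [k, 1] }.to_h
    ITERATIONS = 10_000

    def by_sort(h)
      h.sort.to_h
    end

    def by_sort_by(h)
      h.sort_by { |k, _v| k }.to_h
    end

    def by_sorted_keys(h)
      keys = h.keys.sort
      keys.zip(h.values_at(*keys)).to_h
    end

    # Benchmark.bm prints user, system, and real time for each labelled block.
    Benchmark.bm(12) do |x|
      x.report('sort')        { ITERATIONS.times { by_sort(SAMPLE) } }
      x.report('sort_by')     { ITERATIONS.times { by_sort_by(SAMPLE) } }
      x.report('sorted keys') { ITERATIONS.times { by_sorted_keys(SAMPLE) } }
    end
    ```

    Unlike Fruity, Benchmark does not pick the number of iterations or report a relative speed-up, so the loop count has to be chosen by hand and the timings compared manually.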

    This was prompted by "Sort hash by key, return hash in Ruby".

  • 2021-01-21 08:08

    Here are some more interesting things to consider:

    require 'fruity'
    
    puts "Running Ruby v#{ RUBY_VERSION }"
    # >> Running Ruby v2.2.2
    

    This looks at the differences using an integer as a key:

    HASH = Hash[[*(1..100)].shuffle.map{ |k| [k, 1] }]
    compare do
      _sort1 { HASH.sort.to_h }
      _sort2 { HASH.sort{ |a, b| a[0] <=> b[0] }.to_h }
      _sort3 { HASH.sort{ |a, b| a.first <=> b.first }.to_h }
      _sort_by { HASH.sort_by{ |k,v| k }.to_h }
    end
    # >> Running each test 64 times. Test will take about 1 second.
    # >> _sort_by is faster than _sort2 by 70.0% ± 1.0%
    # >> _sort2 is faster than _sort3 by 19.999999999999996% ± 1.0%
    # >> _sort3 is faster than _sort1 by 19.999999999999996% ± 1.0%
    

    This looks at the differences using a single-character string as the key:

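    # Caution: ('a'..'Z') is an empty range in Ruby ('a' sorts after 'Z'), so this HASH has no entries.
    HASH = Hash[[*('a'..'Z')].shuffle.map{ |k| [k, 1] }]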
    compare do
      _sort1 { HASH.sort.to_h }
      _sort2 { HASH.sort{ |a, b| a[0] <=> b[0] }.to_h }
      _sort3 { HASH.sort{ |a, b| a.first <=> b.first }.to_h }
      _sort_by { HASH.sort_by{ |k,v| k }.to_h }
    end
    # >> Running each test 16384 times. Test will take about 1 second.
    # >> _sort1 is similar to _sort3
    # >> _sort3 is similar to _sort2
    # >> _sort2 is faster than _sort_by by 1.9x ± 0.1
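
    Before reading much into the single-character result, it may be worth checking what the range ('a'..'Z') actually expands to. A minimal check in plain Ruby follows; the constant names are hypothetical, and the second range is simply the one used when the hash size was doubled in the earlier answer.

    ```ruby
    # 'a' has a higher character code than 'Z', so the range has nothing to iterate.
    LOWER_TO_UPPER = ('a'..'Z').to_a
    puts LOWER_TO_UPPER.size # => 0

    # Splatting the two ranges separately gives all 52 letters.
    BOTH_CASES = [*('a'..'z'), *('A'..'Z')]
    puts BOTH_CASES.size # => 52
    ```

    If the hash is empty, the benchmark above mostly measures method-call overhead, which would also fit the unusually high count of 16384 runs per test.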
    