Most efficient way to calculate hamming distance in ruby?

前端 未结 4 1617
予麋鹿
予麋鹿 2020-12-30 06:06

In ruby, what is the most efficient way to calculate the bit difference between two unsigned integers (e.g. the hamming distance)?

Eg, I have integer a = 2323409845

相关标签:
4条回答
  • 2020-12-30 06:21

    If one intends to follow c-based path, it is a good idea to add the compiler flag -msse4.2 to your makefile. This allows the compiler to generate hardware based popcnt instructions instead of using a table to generate the popcount. On my system this was approximately 2.5x faster.

    0 讨论(0)
  • 2020-12-30 06:31

    You can make use of the optimized String functions in Ruby to do the bit counting, instead of pure arithmetic. It turns out to be about 6 times faster with some quick benchmarking.

    def h2(a, b)
      (a^b).to_s(2).count("1")
    end
    

    h1 is the normal way to calculate, while h2 converts the xor into a string, and counts the number of "1"s

    Benchmark:

    ruby-1.9.2-p180:001:0>> def h1(a, b)
    ruby-1.9.2-p180:002:1*> ret = 0
    ruby-1.9.2-p180:003:1*> xor = a ^ b
    ruby-1.9.2-p180:004:1*> until xor == 0
    ruby-1.9.2-p180:005:2*> ret += 1
    ruby-1.9.2-p180:006:2*> xor &= xor - 1
    ruby-1.9.2-p180:007:2*> end
    ruby-1.9.2-p180:008:1*> ret
    ruby-1.9.2-p180:009:1*> end
    # => nil
    ruby-1.9.2-p180:010:0>> def h2(a, b)
    ruby-1.9.2-p180:011:1*> (a^b).to_s(2).count("1")
    ruby-1.9.2-p180:012:1*> end
    # => nil
    ruby-1.9.2-p180:013:0>> h1(2323409845, 1782647144)
    # => 17
    ruby-1.9.2-p180:014:0>> h2(2323409845, 1782647144)
    # => 17
    ruby-1.9.2-p180:015:0>> quickbench(10**5) { h1(2323409845, 1782647144) }
    Rehearsal ------------------------------------
       2.060000   0.000000   2.060000 (  1.944690)
    --------------------------- total: 2.060000sec
    
           user     system      total        real
       1.990000   0.000000   1.990000 (  1.958056)
    # => nil
    ruby-1.9.2-p180:016:0>> quickbench(10**5) { h2(2323409845, 1782647144) }
    Rehearsal ------------------------------------
       0.340000   0.000000   0.340000 (  0.333673)
    --------------------------- total: 0.340000sec
    
           user     system      total        real
       0.320000   0.000000   0.320000 (  0.326854)
    # => nil
    ruby-1.9.2-p180:017:0>> 
    
    0 讨论(0)
  • 2020-12-30 06:31

    Per the suggestion of mu is too short, I wrote a simple C extension to use __builtin_popcount , and using benchmark verified that it is at least 3X faster than ruby's optimized string functions..

    I looked at the following two tutorials:

    • Extending Ruby With C
    • Ruby Extension in C in 5 min

    In my program:

    require './FastPopcount/fastpopcount.so'
    include FastPopcount
    
    def hamming(a,b)
      popcount(a^b)
    end
    

    Then in the dir containing my program, I create a folder "PopCount" with the following files.

    extconf.rb:

    # Loads mkmf which is used to make makefiles for Ruby extensions
    require 'mkmf'
    
    # Give it a name
    extension_name = 'fastpopcount'
    
    # The destination
    dir_config(extension_name)
    
    # Do the work
    create_makefile(extension_name)
    

    popcount.c:

    // Include the Ruby headers and goodies
    #include "ruby.h"
    
    // Defining a space for information and references about the module to be stored internally
    VALUE FastPopcount = Qnil;
    
    // Prototype for the initialization method - Ruby calls this, not you
    void Init_fastpopcount();
    
    // Prototype for our method 'popcount' - methods are prefixed by 'method_' here
    VALUE method_popcount(int argc, VALUE *argv, VALUE self);
    
    // The initialization method for this module
    void Init_fastpopcount() {
        FastPopcount = rb_define_module("FastPopcount");
        rb_define_method(FastPopcount, "popcount", method_popcount, 1); 
    }
    
    // Our 'popcount' method.. it uses the builtin popcount
    VALUE method_popcount(int argc, VALUE *argv, VALUE self) {
        return INT2NUM(__builtin_popcount(NUM2UINT(argv)));
    }
    

    Then in the popcount directory run:

    ruby extconf.rb make

    Then run the program, and there you have it....fastest way to do hamming distance in ruby.

    0 讨论(0)
  • 2020-12-30 06:31

    An algorithm of Wegner:

    def hamm_dist(a, b)
      dist = 0
      val = a ^ b
    
      while not val.zero?
        dist += 1
        val &= val - 1
      end
      dist
    end
    
    p hamm_dist(2323409845, 1782647144) # => 17 
    
    0 讨论(0)
提交回复
热议问题