How to find the closest pairs (Hamming distance) of a string of binary bins in Ruby without O(n^2) issues?

迷失自我 2021-02-06 01:13

I've got a MongoDB collection with about 1 million documents in it. Each document has a string representing a 256-bit sequence of 1s and 0s, like:

01101010101010101101010101

4 answers

南笙 (original poster)

2021-02-06 02:11

This sounds like an algorithmic problem, so I don't think having tons of RAM will help here. You could try comparing those with a similar number of 1 bits first, then work outward from there: two strings whose counts of 1 bits differ by k must differ in at least k positions, so any close match must have a nearby count. Those that are identical will, of course, come out on top.
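A minimal sketch of the "similar number of 1 bits first" idea, assuming the bit strings have already been parsed into Ruby Integers with `to_i(2)`. The helper names (`hamming`, `bucket_by_popcount`, `neighbours_within`) are made up for illustration, and the 4-bit values are toy data, not the real 256-bit documents.

```ruby
# Hamming distance between two non-negative Integers: XOR, then count the 1 bits.
def hamming(a, b)
  (a ^ b).to_s(2).count("1")
end

# Group values by their number of 1 bits (popcount).
def bucket_by_popcount(values)
  values.group_by { |v| v.to_s(2).count("1") }
end

# Return every stored value within max_dist of the query. Only buckets whose
# popcount is within max_dist of the query's popcount can contain a match,
# so all other buckets are skipped without comparing anything.
def neighbours_within(buckets, query, max_dist)
  pc = query.to_s(2).count("1")
  ((pc - max_dist)..(pc + max_dist)).flat_map do |k|
    (buckets[k] || []).select { |v| hamming(v, query) <= max_dist }
  end
end

# Toy data (assumption): five 4-bit values.
values  = %w[0000 0001 0011 0111 1111].map { |s| s.to_i(2) }
buckets = bucket_by_popcount(values)
near    = neighbours_within(buckets, "0001".to_i(2), 1)
```

Here `near` holds the values for `0000`, `0001` and `0011`; the buckets for three and four 1 bits are never scanned. On real data this filter may prune only modestly if the bit counts cluster around the same value, so it is probably best treated as a cheap first pass.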

You could also try working with smaller chunks. Instead of dealing with 256-bit sequences, could you treat each one as 32 8-bit sequences? 16 16-bit sequences? At that point you can compute differences in a lookup table and use that as a sort of index.
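One way to read the chunking suggestion is as a pigeonhole index: if two values differ in fewer bits than there are chunks, at least one chunk must be identical, so an exact lookup per chunk finds every candidate. The sketch below takes that interpretation, which is an assumption rather than something the answer spells out. The function names are hypothetical and the demo uses 16-bit values with 4-bit chunks to stay readable.

```ruby
# Hamming distance between two non-negative Integers.
def hamming(a, b)
  (a ^ b).to_s(2).count("1")
end

# Split a value of total_bits bits into chunk_bits-wide pieces, lowest bits first.
def chunks(value, total_bits, chunk_bits)
  mask = (1 << chunk_bits) - 1
  (0...(total_bits / chunk_bits)).map { |i| (value >> (i * chunk_bits)) & mask }
end

# Index: [chunk position, chunk value] => values that contain that chunk there.
def build_chunk_index(values, total_bits, chunk_bits)
  index = Hash.new { |h, k| h[k] = [] }
  values.each do |v|
    chunks(v, total_bits, chunk_bits).each_with_index { |c, i| index[[i, c]] << v }
  end
  index
end

# Candidates share at least one identical chunk with the query; each candidate
# is then verified with the full Hamming distance.
def search(index, query, total_bits, chunk_bits, max_dist)
  candidates = chunks(query, total_bits, chunk_bits)
               .each_with_index
               .flat_map { |c, i| index.fetch([i, c], []) }
               .uniq
  candidates.select { |v| hamming(v, query) <= max_dist }
end

# Toy data (assumption): four 16-bit values, split into four 4-bit chunks.
values = [
  0b1010_1010_1010_1010,
  0b1010_1010_1010_1011,
  0b0101_0101_0101_0101,
  0b1010_1110_1010_0010
]
index = build_chunk_index(values, 16, 4)
hits  = search(index, 0b1010_1010_1010_1010, 16, 4, 3)
```

With four chunks the search is only guaranteed complete for distances up to 3; by the same argument, 16 chunks of 16 bits would cover distances up to 15 on the 256-bit data. The third value shares no chunk with the query, so it is never compared at all.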

Depending on how large a difference you care to match on, you could just enumerate every variant of the source binary value within that many bit flips and do a keyed (exact-match) search for each one to find the others that match.
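The permute-and-lookup approach could look roughly like this. It is a sketch under assumptions: an in-memory `Set` stands in for whatever keyed lookup the real store offers (for example an indexed equality query), the names `variants` and `keyed_search` are hypothetical, and the demo uses 8-bit values.

```ruby
require "set"

# Every value reachable from source by flipping at most max_flips of its bits.
def variants(source, bits, max_flips)
  (0..max_flips).flat_map do |k|
    (0...bits).to_a.combination(k).map do |positions|
      # Flip each chosen bit position with XOR.
      positions.reduce(source) { |v, pos| v ^ (1 << pos) }
    end
  end
end

# Probe the exact-match key set with every variant of the query.
def keyed_search(keys, query, bits, max_flips)
  variants(query, bits, max_flips).select { |v| keys.include?(v) }
end

# Toy data (assumption): four stored 8-bit keys.
keys  = Set.new(%w[00000000 00000001 00000111 11111111].map { |s| s.to_i(2) })
found = keyed_search(keys, "00000011".to_i(2), 8, 1)
```

In this toy run `found` contains the values for `00000001` and `00000111`, each one bit flip away from the query. The number of probes grows quickly with the allowed distance: for 256 bits it is 257 lookups at distance 1 and 32,897 at distance 2, so this is likely only practical when very small differences are enough.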
