hamming-distance

Efficiently find binary strings with low Hamming distance in large set

房东的猫 posted on 2019-11-26 19:19:30
Problem: Given a large (~100 million) list of unsigned 32-bit integers, an unsigned 32-bit integer input value, and a maximum Hamming distance, return all list members that are within the specified Hamming distance of the input value. The data structure that holds the list is open. Performance requirements dictate an in-memory solution; the cost of building the data structure is secondary, but a low query cost is critical. Example: For a maximum Hamming distance of 1 (values will typically be quite small) and input: 00001000100000000000000001111101 The values:
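One way to picture the query, as a rough sketch rather than a recommended answer: when the maximum distance is very small, every value within that distance of the input can be enumerated by flipping bits and then looked up in a hash set. The names `candidates_within` and `find_within` are hypothetical, the members below are made-up test values (the excerpt's own list is cut off), and storing the list as a Python `set` is an assumption that drops duplicate list entries.

```python
from itertools import combinations

WIDTH = 32  # the problem statement uses unsigned 32-bit integers


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def candidates_within(query: int, max_distance: int, width: int = WIDTH):
    """Yield every width-bit value within max_distance bit flips of query."""
    yield query
    for k in range(1, max_distance + 1):
        for bits in combinations(range(width), k):
            mask = 0
            for bit in bits:
                mask |= 1 << bit
            yield query ^ mask


def find_within(members: set, query: int, max_distance: int) -> list:
    """Return the members within max_distance of query, in ascending order."""
    return sorted(v for v in candidates_within(query, max_distance) if v in members)


query = int("00001000100000000000000001111101", 2)  # input from the example
# Hypothetical members at distance 0, 1, 2 and 1 from the query.
members = {query, query ^ 1, query ^ 0b11, query ^ (1 << 31)}
matches = find_within(members, query, 1)
```

For a maximum distance of 1 this performs 33 set lookups per query, and 529 for a distance of 2, regardless of how many values are stored. The candidate count grows quickly with the distance, so the sketch only fits the "quite small" distances mentioned in the question.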

Shortest path to transform one word into another

匆匆过客 posted on 2019-11-26 12:05:56
Question: For a Data Structures project, I must find the shortest path between two words (like "cat" and "dog"), changing only one letter at a time. We are given a Scrabble word list to use in finding our path. For example: cat -> bat -> bet -> bot -> bog -> dog I've solved the problem using a breadth-first search, but am seeking something better (I represented the dictionary with a trie). Please give me some ideas for a more efficient method (in terms of speed and memory). Something ridiculous
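For reference, the breadth-first search the asker mentions could look roughly like the sketch below. It is not the asker's code: it swaps the trie for wildcard buckets (for example `c*t`) so that one-letter neighbours are found by dictionary lookup, and the function name `shortest_word_path` and the six-word list are assumptions taken from the example path rather than a real Scrabble list.

```python
from collections import defaultdict, deque


def shortest_word_path(start: str, goal: str, words):
    """Return a shortest one-letter-at-a-time path as a list, or None."""
    # Only words of the same length can be reached by substitutions.
    pool = {w for w in words if len(w) == len(start)} | {start}

    # Group words by wildcard pattern: "cat" goes into "*at", "c*t", "ca*".
    buckets = defaultdict(list)
    for word in pool:
        for i in range(len(word)):
            buckets[word[:i] + "*" + word[i + 1:]].append(word)

    parent = {start: None}  # doubles as the visited set
    queue = deque([start])
    while queue:
        word = queue.popleft()
        if word == goal:
            path = []
            while word is not None:
                path.append(word)
                word = parent[word]
            return path[::-1]
        for i in range(len(word)):
            for neighbour in buckets[word[:i] + "*" + word[i + 1:]]:
                if neighbour not in parent:
                    parent[neighbour] = word
                    queue.append(neighbour)
    return None


# Hypothetical word list: just the words from the example path.
words = ["cat", "bat", "bet", "bot", "bog", "dog"]
path = shortest_word_path("cat", "dog", words)
```

With only these six words the search returns cat -> bat -> bot -> bog -> dog, because "bat" and "bot" already differ by a single letter.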

Hamming distance on binary strings in SQL

喜你入骨 posted on 2019-11-26 10:36:00
Question: I have a table in my DB where I store SHA256 hashes in a BINARY(32) column. I'm looking for a way to compute the Hamming distance of the entries in the column to a supplied value, i.e. something like: SELECT * FROM table ORDER BY HAMMINGDISTANCE(hash, UNHEX(<insert supplied sha256 hash here>)) ASC LIMIT 10 (in case you're wondering, the Hamming distance of strings A and B is defined as BIT_COUNT(A^B), where ^ is the bitwise XOR operator and BIT_COUNT returns the number of 1s in the binary
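The definition quoted in the question, BIT_COUNT(A^B), can be illustrated outside the database. The following is a minimal Python sketch of that definition for 32-byte values, not an answer to the SQL question itself; `hamming_distance_bytes` and `nearest` are hypothetical helper names, and the hashed inputs are made-up sample data.

```python
import hashlib


def hamming_distance_bytes(a: bytes, b: bytes) -> int:
    """XOR two equal-length byte strings and count the 1 bits in the result."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    xor = int.from_bytes(a, "big") ^ int.from_bytes(b, "big")
    return bin(xor).count("1")


def nearest(hashes, target: bytes, limit: int = 10):
    """Mimic ORDER BY distance ASC LIMIT n over an in-memory list."""
    return sorted(hashes, key=lambda h: hamming_distance_bytes(h, target))[:limit]


# Made-up sample data: SHA-256 digests are 32 bytes, like BINARY(32).
stored = [hashlib.sha256(s.encode()).digest() for s in ("a", "b", "c")]
target = hashlib.sha256(b"a").digest()
closest = nearest(stored, target, limit=2)
```

Sorting the whole list mirrors the full scan implied by the ORDER BY in the question; it shows what the query computes, not how to make it fast.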

Efficiently find binary strings with low Hamming distance in large set

喜欢而已 posted on 2019-11-26 08:54:58
Question: Problem: Given a large (~100 million) list of unsigned 32-bit integers, an unsigned 32-bit integer input value, and a maximum Hamming distance, return all list members that are within the specified Hamming distance of the input value. The data structure that holds the list is open. Performance requirements dictate an in-memory solution; the cost of building the data structure is secondary, but a low query cost is critical. Example: For a maximum Hamming distance of 1 (values