Hash function providing a unique uint from an integer coordinate pair

前端未结


The problem in general: I have a big 2d point space, sparsely populated with dots. Think of it as a big white canvas sprinkled with black dots. I have to it

11 answers

北恋

2020-12-13 10:41
Cantor's enumeration of pairs
```
   n = ((x + y)*(x + y + 1)/2) + y
```
might be interesting, as it's closest to your original canvaswidth * y + x but will work for any non-negative x or y. But for a real-world int32 hash, rather than a mapping of pairs of integers to integers, you're probably better off with a bit-manipulation function such as Bob Jenkins' mix, calling it with x, y and a salt.
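
As a minimal sketch of the pairing above (Python is used only for illustration; the names `cantor_pair` and `cantor_unpair` are hypothetical, and the inputs are assumed to be non-negative):

```python
import math


def cantor_pair(x: int, y: int) -> int:
    """Cantor's enumeration of pairs for non-negative integers x and y."""
    if x < 0 or y < 0:
        raise ValueError("cantor_pair expects non-negative coordinates")
    s = x + y
    # s * (s + 1) is always even, so the integer division is exact.
    return s * (s + 1) // 2 + y


def cantor_unpair(n: int) -> tuple:
    """Invert cantor_pair, recovering (x, y) from n."""
    # Largest s with s * (s + 1) / 2 <= n, using integer arithmetic only.
    s = (math.isqrt(8 * n + 1) - 1) // 2
    y = n - s * (s + 1) // 2
    return s - y, y
```

The value grows like (x + y)^2 / 2, so it stays below 2^32 only while x + y is at most 92680. With both coordinates at 65535 the result no longer fits in a 32-bit uint, which is one reason a bit-mixing hash may be the more practical choice.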

野的像风

2020-12-13 10:41

You can recursively divide your XY plane into cells, then divide these cells into sub-cells, etc.

Gustavo Niemeyer invented the Geohash geocoding system in 2008.

Amazon's open source Geo Library computes the hash for any longitude-latitude coordinate. The resulting Geohash value is a 63-bit number. The probability of collision depends on the hash's resolution: if two objects are closer than the intrinsic resolution, the calculated hash will be identical.
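
The recursive subdivision can be sketched for an integer grid by interleaving the bits of x and y, so that each pair of bits selects one of four sub-cells at one level. This is only an illustration of the idea, not the algorithm used by Geohash or by Amazon's library; the names `cell_hash` and `coarse_cell_hash` are hypothetical, and coordinates are assumed to be non-negative and smaller than 2^levels.

```python
def cell_hash(x: int, y: int, levels: int = 16) -> int:
    """Interleave the low `levels` bits of x and y (x in even bits, y in odd bits)."""
    code = 0
    for i in range(levels):
        code |= ((x >> i) & 1) << (2 * i)      # bit i of x selects left/right at level i
        code |= ((y >> i) & 1) << (2 * i + 1)  # bit i of y selects bottom/top at level i
    return code


def coarse_cell_hash(x: int, y: int, drop: int, levels: int = 16) -> int:
    """Hash at a coarser resolution by discarding the `drop` finest subdivision levels."""
    return cell_hash(x >> drop, y >> drop, levels - drop)
```

With 16 levels the result fits in 32 bits and is unique per grid point. Dropping levels shows the resolution effect described above: `coarse_cell_hash(4, 5, 1)` and `coarse_cell_hash(5, 4, 1)` are equal because both points fall in the same 2x2 cell.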

Read more:

https://en.wikipedia.org/wiki/Geohash
https://aws.amazon.com/fr/blogs/mobile/geo-library-for-amazon-dynamodb-part-1-table-structure/
https://github.com/awslabs/dynamodb-geo

萌比男神i

2020-12-13 10:44
You can do
```
a >= b ? a * a + a + b : a + b * b
```
taken from here.

That works for points with non-negative coordinates. If your coordinates can be negative too, then you will have to do:
```
A = a >= 0 ? 2 * a : -2 * a - 1;
B = b >= 0 ? 2 * b : -2 * b - 1;
A >= B ? A * A + A + B : A + B * B;
```
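
A small sketch of the two steps above, written in Python for illustration (the function names are hypothetical): negative values are first folded onto the non-negative integers, then the pairing formula is applied.

```python
def pair_nonnegative(a: int, b: int) -> int:
    """Pair two non-negative integers into one unique non-negative integer."""
    return a * a + a + b if a >= b else a + b * b


def fold_sign(n: int) -> int:
    """Map 0, -1, 1, -2, 2, ... to 0, 1, 2, 3, 4, ... so negatives become usable."""
    return 2 * n if n >= 0 else -2 * n - 1


def pair_signed(a: int, b: int) -> int:
    """Pair two signed integers by folding each one first."""
    return pair_nonnegative(fold_sign(a), fold_sign(b))
```

For inputs below 2^16 the pairing fills the range 0 to 2^32 - 1 with no gaps, so signed 16-bit coordinates (-32768 to 32767) map to distinct 32-bit unsigned values; larger inputs need a wider output type.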
But to restrict the output to uint you will have to keep an upper bound on your inputs, and if so, it turns out that you know the bounds. In other words, in programming it's impractical to write a function without having an idea of the integer types your inputs and output can have, and if so there will definitely be a lower bound and an upper bound for every integer type.
```
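public uint GetHashCode(int a, int b)
{
    // Both inputs must fit in 16 unsigned bits so the result fits in a uint.
    if (a > ushort.MaxValue || b > ushort.MaxValue ||
        a < ushort.MinValue || b < ushort.MinValue)
    {
        throw new ArgumentOutOfRangeException();
    }

    // Multiply by 65536 rather than short.MaxValue; with 32767 as the
    // multiplier, pairs such as (0, 32767) and (1, 0) would collide.
    return (uint)a * (ushort.MaxValue + 1u) + (uint)b; //very good space/speed efficiency
    //or whatever your function is.
}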
```
If you want the output to be strictly uint for an unknown range of inputs, then there will be a certain number of collisions depending on that range. What I would suggest is a function that is allowed to overflow, in an unchecked context. Emil's solution is great, in C#:
```
return unchecked((uint)((a & 0xffff) << 16 | (b & 0xffff))); 
```
See "Mapping two integers to one, in a unique and deterministic way" for a plethora of options.

甜味超标

2020-12-13 10:46
Like Emil, but handles 16-bit overflows in x in a way that produces fewer collisions, and takes fewer instructions to compute:
```
hash = ( y << 16 ) ^ x;
```
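
A rough Python sketch of this variant, assuming the result is truncated to 32 bits as it would be in a C-style uint (the name `xor_hash` and the explicit mask are illustrative additions, since Python integers do not overflow):

```python
MASK32 = 0xFFFFFFFF  # emulate 32-bit unsigned wrap-around


def xor_hash(x: int, y: int) -> int:
    """Shift y into the high half and XOR x in, keeping only the low 32 bits."""
    return ((y << 16) ^ x) & MASK32
```

Bits of x above the low 16 are not discarded; they spill into the half occupied by y. So `xor_hash(0x10001, 0)` differs from `xor_hash(1, 0)`, but it equals `xor_hash(1, 1)`. Whether that means fewer collisions in practice depends on how the coordinates are distributed.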

生来不讨喜

2020-12-13 10:47
Perhaps?
```
hash = ((y & 0xFFFF) << 16) | (x & 0xFFFF);
```
Works as long as x and y can be stored as 16-bit integers. No idea how many collisions this causes for larger integers, though. One idea might be to still use this scheme but combine it with a compression scheme, such as reducing the values modulo 2^16.
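
To make the behaviour for larger integers concrete, here is a small Python sketch of the same packing (the name `pack16` is hypothetical). For two's-complement values the `& 0xFFFF` mask is itself a reduction modulo 2^16, so coordinates that differ by a multiple of 65536 land on the same hash.

```python
def pack16(x: int, y: int) -> int:
    """Pack the low 16 bits of y and x into one 32-bit unsigned value."""
    return ((y & 0xFFFF) << 16) | (x & 0xFFFF)


def count_collisions(points) -> int:
    """Number of points whose hash was already produced by an earlier point."""
    points = list(points)
    return len(points) - len({pack16(x, y) for x, y in points})
```

For example, `count_collisions([(1, 0), (65537, 0), (1, 65536)])` reports 2, because all three points reduce to the same low 16 bits, while any set of points with coordinates in 0 to 65535 reports 0.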