geohash string length and accuracy

天大地大妈咪最大 submitted on 2019-11-28 17:07:06

I saw a lot of confusion around geohashing, so I am posting my understanding so far. The principle behind geohash is very simple, and you can create your own version. For instance, consider the following geo-point:

156.34234534,-23.343423345

In the above example, 156 is the whole-degree part of the first coordinate. The digits after the decimal point are decimal fractions of a degree (tenths, hundredths, thousandths and so on), not minutes and seconds, which are base-60 units.

If you remember school geography, the circumference of the earth at the equator is about 40,000 km, and there are 360 degrees around it. So at the widest point each degree of latitude or longitude spans about 111 km (40,000/360).

So if you encode the above coordinates as "156-23" (including the negative sign), this will give you a box of roughly 111 km x 111 km.

You can go on and increase the precision. Adding the first decimal digit (156.3-23.3) gives a box of roughly 11 km x 11 km, and adding the second gives roughly 1.1 km x 1.1 km.

Add the third decimal digit and you get a box of roughly 110 m x 110 m; each extra digit improves the precision by another factor of ten. Geohashing is just a way to represent the above figures in an encoded form. You can happily use the above format as well!
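The home-made scheme described above can be sketched in a few lines of Python. This is only an illustration: the function names `truncated_key` and `box_size_km` are made up here, the 40,000 km circumference is the round figure from the text, and truncating toward zero (to match "156-23") means the cells touching 0 degrees are twice as wide as the others.

```python
import math

# Round figure from the text: about 111 km per degree at the equator.
KM_PER_DEGREE = 40_000 / 360


def truncated_key(lon, lat, decimals):
    """Build a crude location key by cutting both coordinates to `decimals` digits.

    Hypothetical helper for illustration; truncates toward zero like the
    hand-written "156-23" example, so it is not a uniform grid around 0.
    """
    scale = 10 ** decimals
    lon_cell = math.trunc(lon * scale) / scale
    lat_cell = math.trunc(lat * scale) / scale
    return f"{lon_cell:.{decimals}f},{lat_cell:.{decimals}f}"


def box_size_km(decimals):
    """Approximate side length of a box at the equator, in kilometres."""
    return KM_PER_DEGREE / 10 ** decimals


for digits in range(4):
    key = truncated_key(156.34234534, -23.343423345, digits)
    print(key, f"~{box_size_km(digits):.3f} km per side")
```

A real geohash does the same kind of thing in base 2 instead of base 10, interleaving the longitude and latitude bits before encoding them as text.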

I was curious about this myself. If it's any good to anyone, I put together a spreadsheet here. I'm not 100% sure it's right, so feel free to comment if you find a problem.

Judging by the graph, using 6 to 10 characters gives an accuracy of roughly 1 km down to roughly 1 m at 60 degrees latitude.

Here are the formulas for height and width in degrees of a geohash of length n characters:

First define this function:

    parity(n) = 0 if n is even otherwise 1

Then

    height = 180 / 2^((5n - parity(n)) / 2) degrees

    width = 180 / 2^((5n + parity(n) - 2) / 2) degrees

Note that this is the height and width in degrees only. To convert this to metres requires that you know where on the earth the hash is.
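As a rough sketch, the formulas above translate to Python as follows. The function names are invented for this example (this is not the API of the Java library mentioned in this answer), and the conversion to metres assumes a spherical earth with a 40,000 km circumference and takes the width at one given latitude.

```python
import math


def geohash_cell_degrees(n):
    """Return (height, width) in degrees of a geohash cell of n characters."""
    parity = n % 2  # 0 if n is even, otherwise 1
    # Both exponents are always whole numbers, so integer division is exact.
    height = 180 / 2 ** ((5 * n - parity) // 2)
    width = 180 / 2 ** ((5 * n + parity - 2) // 2)
    return height, width


def geohash_cell_metres(n, latitude_deg):
    """Approximate (height, width) in metres at the given latitude.

    Assumption: spherical earth, 40,000 km circumference.
    """
    metres_per_degree = 40_000_000 / 360
    height_deg, width_deg = geohash_cell_degrees(n)
    height_m = height_deg * metres_per_degree
    # A degree of longitude shrinks with the cosine of the latitude.
    width_m = width_deg * metres_per_degree * math.cos(math.radians(latitude_deg))
    return height_m, width_m


for length in range(1, 13):
    h, w = geohash_cell_metres(length, 60.0)
    print(f"{length:2d} chars: {h:14.3f} m high x {w:14.3f} m wide at 60 degrees")
```

For example, `geohash_cell_degrees(2)` gives a cell 5.625 degrees high and 11.25 degrees wide.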

Code for this in Java is at http://github.com/davidmoten/geo.

Also, is there any direct way to calculate the distance between two geohashes? (One way is to decode them to lat/lng and then calculate the distance.)

That is what you should do. Think of the geohash as just another representation of a latitude and longitude, in the same way that a pair of printed decimal numbers is. If I gave you a pair of lat and lon strings, you would parse them to numbers (in your programming language of choice) and then do the math. It's no different with geohashes: decode to lat and lon, then do the math.
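A minimal sketch of "decode, then do the math" in Python might look like this. The decoder follows the standard geohash bit layout (base-32 characters, bits alternating between longitude and latitude, longitude first) and returns the centre of the cell. The haversine formula and the 6,371 km mean earth radius are choices made for this example, not something the answer prescribes, and the function names are hypothetical.

```python
import math

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def decode(geohash):
    """Return the (lat, lon) centre of the cell named by `geohash`."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    is_lon = True  # bits alternate, starting with longitude
    for char in geohash:
        value = BASE32.index(char)
        for shift in range(4, -1, -1):  # 5 bits per character, high bit first
            bit = (value >> shift) & 1
            if is_lon:
                mid = (lon_lo + lon_hi) / 2
                if bit:
                    lon_lo = mid
                else:
                    lon_hi = mid
            else:
                mid = (lat_lo + lat_hi) / 2
                if bit:
                    lat_lo = mid
                else:
                    lat_hi = mid
            is_lon = not is_lon
    return (lat_lo + lat_hi) / 2, (lon_lo + lon_hi) / 2


def haversine_m(lat1, lon1, lat2, lon2, radius_m=6_371_000):
    """Great-circle distance in metres on a sphere (assumed mean radius)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_m * math.asin(min(1.0, math.sqrt(a)))


def geohash_distance_m(a, b):
    """Distance between the centres of two geohash cells, in metres."""
    return haversine_m(*decode(a), *decode(b))


print(decode("ezs42"))
print(geohash_distance_m("ezs42", "ezs43"))
```

Note that the result is the distance between cell centres, so it is only as accurate as the cells are small.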

Be very careful when inferring closeness from the length of the common prefix of two geohashes. If there is a long common prefix, then the points are close, but the converse is not true: two points with no common prefix at all could be a millimeter apart, for example when they sit on opposite sides of the equator or the prime meridian.
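The pitfall is easy to reproduce. The sketch below uses a small hand-written encoder following the standard geohash bit layout (the function names are invented for this example) and encodes two points just north and just south of the equator. They are about 22 cm apart (2e-6 degrees of latitude), yet their geohashes share no prefix at all.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def encode(lat, lon, length=9):
    """Encode a latitude/longitude pair as a geohash of `length` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    is_lon = True  # bits alternate, starting with longitude
    value = 0      # bits collected for the current character
    bits = 0
    while len(chars) < length:
        if is_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                value = (value << 1) | 1
                lon_lo = mid
            else:
                value <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                value = (value << 1) | 1
                lat_lo = mid
            else:
                value <<= 1
                lat_hi = mid
        is_lon = not is_lon
        bits += 1
        if bits == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[value])
            value = 0
            bits = 0
    return "".join(chars)


def common_prefix_length(a, b):
    """Number of leading characters that two strings share."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


north = encode(0.000001, 0.000001)   # just north of the equator
south = encode(-0.000001, 0.000001)  # just south of the equator
print(north, south, common_prefix_length(north, south))
```

This is why prefix matching can be used to confirm that points are near each other but not to rule it out; a proximity search usually has to look at the neighbouring cells as well.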

Here is an equation (in pseudocode) that can approximate the optimal Geohash length for a latitude/longitude pair having a certain precision:

    geohash_length = FLOOR ( LOG_2(5000000/precision_in_meters) / 2.5 + 1 )
    if geohash_length > 12 then geohash_length = 12
    if geohash_length < 1 then geohash_length = 1
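For reference, the pseudocode could be written in Python roughly as below. The function name is made up for this example, and the check for non-positive input is an addition that the pseudocode does not include.

```python
import math


def geohash_length_for_precision(precision_in_meters):
    """Approximate geohash length for a given precision, clamped to 1..12."""
    if precision_in_meters <= 0:
        # Added guard: the logarithm is undefined for zero or negative input.
        raise ValueError("precision_in_meters must be positive")
    length = math.floor(math.log2(5_000_000 / precision_in_meters) / 2.5 + 1)
    return max(1, min(12, length))


for precision in (5_000_000, 1000, 1, 0.001):
    print(precision, "m ->", geohash_length_for_precision(precision), "characters")
```

With this formula a precision of 1000 m maps to 5 characters and a precision of 1 m maps to 9 characters.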

I've used it to create the optimal geohash from data received from the gpsd daemon, which also provides precision information via the epx and epy values.
