A simple question: which version of CityHash is behind BigQuery's HASH
function? Is it always the latest (v1.1 as of today), or a fixed version?
The CityHash used in BigQuery is the version from http://code.google.com/p/cityhash/. Looking at the history, it seems the hash value can change over time. This might be a good question for https://groups.google.com/forum/?fromgroups#!forum/cityhash-discuss
BigQuery should support a consistent hash. We do have support for SHA1, but right now the result is unusable because of encoding issues. You can, however, base64-encode the result yourself: SELECT TO_BASE64(SHA1(CONCAT('12345', 'foobar')))
Note that we will likely change SHA1 in the near future to automatically base64 encode the results. I've filed an internal bug to make this change.
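
If you want to check the value outside BigQuery, the TO_BASE64(SHA1(CONCAT(...))) query can be approximated client-side. This is only a rough sketch, and it rests on assumptions not confirmed in this thread: that SHA1 yields the raw 20-byte digest, that strings are hashed as UTF-8 bytes, and that TO_BASE64 uses the standard base64 alphabet with padding. The helper name sha1_base64 is made up for illustration.

```python
import base64
import hashlib


def sha1_base64(*parts: str) -> str:
    """Concatenate the parts, SHA-1 hash them, and base64-encode the digest.

    Assumption: mirrors TO_BASE64(SHA1(CONCAT(...))) only if BigQuery hashes
    the UTF-8 bytes and uses standard, padded base64.
    """
    joined = "".join(parts)                                  # CONCAT('12345', 'foobar')
    digest = hashlib.sha1(joined.encode("utf-8")).digest()   # raw 20-byte digest
    return base64.b64encode(digest).decode("ascii")          # printable 28-character string


if __name__ == "__main__":
    print(sha1_base64("12345", "foobar"))
```

Since SHA-1 is a fixed, standardized algorithm, its output for a given input does not depend on a library version, which is what makes it a candidate for the consistent hash asked about here. If the value printed locally differs from what BigQuery returns, one of the assumptions above does not hold.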