Hashing a String to a Numeric Value in PostgreSQL

情歌与酒 2020-12-07 23:16

I need to convert strings stored in my database to a numeric value. The result can be an integer (preferred) or a bigint. The conversion has to be done on the database side, in PL/pgSQL.

4 Answers
  • 2020-12-07 23:32

    You can create an MD5 hash value without problems:

    select md5('hello, world');
    

    This returns the hash as a 32-character hexadecimal string.

    Unfortunately there is no built-in function to convert hex to integer, but since you are doing this in PL/pgSQL anyway, this might help:

    https://stackoverflow.com/a/8316731/330315

  • 2020-12-07 23:37

    This is an implementation of Java's String.hashCode():

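    CREATE OR REPLACE FUNCTION hashCode(_string text) RETURNS INTEGER AS $$
    DECLARE
      -- Accumulate in a BIGINT so the multiplication cannot overflow.
      h_ BIGINT := 0;
    BEGIN
      FOR i IN 1 .. length(_string)
      LOOP
        -- ascii() returns the Unicode code point in a UTF8 database.
        -- Keep only the low 32 bits, as Java's int arithmetic does.
        h_ := (31 * h_ + ascii(substr(_string, i, 1))) & 4294967295;
      END LOOP;
      -- Reinterpret the unsigned 32-bit value as a signed integer.
      IF h_ >= 2147483648 THEN
        h_ := h_ - 4294967296;
      END IF;
      RETURN h_::integer;
    END;
    $$ LANGUAGE plpgsql IMMUTABLE STRICT;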
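
    As a quick check, assuming the function above has been created, it can be called like any other SQL function. For the plain-ASCII string 'hello', Java's "hello".hashCode() is 99162322, so the call below should return the same value:

    ```sql
    -- Usage sketch: hash a string literal.
    -- Working the recurrence h = 31 * h + code by hand for
    -- 'h', 'e', 'l', 'l', 'o' gives 99162322.
    select hashCode('hello');
    ```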
    
  • 2020-12-07 23:41

    Just keep the first 32 or 64 bits of the MD5 hash. Of course, this gives up the main property of MD5 (a negligible probability of collision), but you still get a wide dispersion of values, which is presumably good enough for your problem.

    SQL functions derived from the other answers:

    For bigint:

    create function h_bigint(text) returns bigint as $$
     select ('x'||substr(md5($1),1,16))::bit(64)::bigint;
    $$ language sql;
    

    For int:

    create function h_int(text) returns int as $$
     select ('x'||substr(md5($1),1,8))::bit(32)::int;
    $$ language sql;
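
    A short usage sketch, assuming both functions above have been created. The table name items and its text column name are hypothetical and only serve to illustrate a collision check on real data:

    ```sql
    -- Both functions can return negative numbers, because the top bit
    -- of the truncated hash ends up as the sign bit.
    select h_int('hello, world') as h32,
           h_bigint('hello, world') as h64;

    -- Hypothetical table "items" with a text column "name":
    -- list 32-bit hash values shared by more than one distinct string.
    select h_int(name) as h, count(distinct name) as strings
    from items
    group by h_int(name)
    having count(distinct name) > 1;
    ```

    If the second query returns rows, the 32-bit variant is too narrow for that data set and h_bigint is the safer choice.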
    
  • 2020-12-07 23:42

    Must it be an integer? The pgcrypto module provides a number of standard hash functions (MD5, SHA-1, etc.). They all return bytea. I suppose you could throw away some bits and convert the bytea to an integer.

    bigint is too small to store a full cryptographic hash. The largest non-bytea binary type PostgreSQL supports is uuid. You could cast the first 16 bytes of a digest to uuid like this:

    select ('{'||encode( substring(digest('foobar','sha256') from 1 for 16), 'hex')||'}')::uuid;
                     uuid                 
    --------------------------------------
     c3ab8ff1-3720-e8ad-9047-dd39466b3c89
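
    The digest() function is only available once the pgcrypto extension is installed. As a sketch of the "throw away some bits" idea mentioned above, the first 8 bytes of the digest can be turned into a bigint with the same hex-to-bit cast used for MD5 in the other answer. This is one possible approach, not the only one:

    ```sql
    -- Install pgcrypto once per database (requires sufficient privileges).
    create extension if not exists pgcrypto;

    -- Keep the first 8 bytes (64 bits) of the SHA-256 digest, render them
    -- as 16 hex digits, and reinterpret those bits as a signed bigint.
    select ('x' || encode(substring(digest('foobar', 'sha256') from 1 for 8), 'hex'))::bit(64)::bigint;
    ```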
    