What's the easiest way to represent a bytea as a single integer in PostgreSQL?

星月不相逢 2021-01-05 15:09

I have a bytea column that contains 14 bytes of data. The last 3 bytes of the 14 contain the CRC code of the data. I would like to extract the CRC as a single i

4 Answers
  • 2021-01-05 15:48

    Well, if we're going to do byte-by-byte operations, then bit shifting is probably more efficient than multiplication.

    Based on Clodoaldo Neto's answer I would then say:

    select (get_byte(arm_data, 11) << 16) |
           (get_byte(arm_data, 12) << 8) |
           (get_byte(arm_data, 13))
    from adsb_raw_message;
    

    Does everyone agree?

  • 2021-01-05 16:01

    If you want to store the CRC as a single integer in a separate column, I suggest converting it at insert or update time and persisting it alongside the bytea value.

    You can do this in your application/business layer or use an insert/update trigger to fill the CRC column.
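
    A minimal sketch of the trigger variant, assuming the table adsb_raw_message and the column arm_data from the other answers. The column name crc and the function/trigger name adsb_set_crc are hypothetical, chosen only for illustration:

    ```sql
    -- Assumption: crc is a new, hypothetical column that stores the extracted value.
    ALTER TABLE adsb_raw_message ADD COLUMN crc integer;

    CREATE FUNCTION adsb_set_crc() RETURNS trigger
    LANGUAGE plpgsql AS
    $$
    BEGIN
       -- Bytes 11..13 (zero-based) are the last 3 of the 14 bytes.
       -- get_byte() raises an error if arm_data is shorter than 14 bytes
       -- and returns NULL if arm_data is NULL.
       NEW.crc := (get_byte(NEW.arm_data, 11) << 16)
                | (get_byte(NEW.arm_data, 12) << 8)
                |  get_byte(NEW.arm_data, 13);
       RETURN NEW;
    END
    $$;

    -- Recompute crc on every insert and whenever arm_data is updated.
    CREATE TRIGGER adsb_set_crc
    BEFORE INSERT OR UPDATE OF arm_data ON adsb_raw_message
    FOR EACH ROW EXECUTE PROCEDURE adsb_set_crc();
    ```

    Rows that already exist are not touched by the trigger, so they would need a one-time UPDATE to fill the new column.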

  • 2021-01-05 16:05

    Another way is to extract the last 6 characters in hex representation, prepend an x and cast directly:

    db=# SELECT ('x' || right('\x00000000000001'::bytea::text, 6))::bit(24)::int;
     int4
    ------
        1
    

    This is a bit shorter than the get_byte() route, but it relies on an undocumented feature of PostgreSQL. To quote Tom Lane:

    This is relying on some undocumented behavior of the bit-type input converter, but I see no reason to expect that would break. A possibly bigger issue is that it requires PG >= 8.3 since there wasn't a text to bit cast before that.

    Details in this related answer:

    • Convert hex in text representation to decimal number

    This assumes that your setting of bytea_output is hex, which is the default since version 9.0. To be sure, you can test / set it for your session:

    SET bytea_output = 'hex';
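
    If you would rather not depend on the session's bytea_output setting, one option is encode(), which returns the hex digits (without the \x prefix) whatever that setting is. A minimal sketch, reusing the literal from the example above; it still relies on the same undocumented bit-type cast:

    ```sql
    -- encode(..., 'hex') yields '00000000000001'; the last 6 hex digits are the last 3 bytes.
    SELECT ('x' || right(encode('\x00000000000001'::bytea, 'hex'), 6))::bit(24)::int AS crc;
    -- returns 1
    ```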
    

    More here:

    • PostgreSQL 9.X bytea representation in 'hex' or 'escape' for thumbnail images

    Performance

    I ran a test (best of 10) on a table with 10k rows. get_byte() is actually a bit faster in Postgres 9.1:

    CREATE TEMP TABLE t (a bytea);
    INSERT INTO t
    SELECT (12345670000000 + generate_series(1,10000))::text::bytea;
    

    Bit shifting is about as fast as multiplying / adding:

    SELECT 
     ('x' || right(a::text, 6))::bit(24)::int                           -- 34.9 ms
    ,(get_byte(a, 11) << 16) + (get_byte(a, 12) << 8) + get_byte(a, 13) -- 27.0 ms
    ,(get_byte(a, 11) << 16) | (get_byte(a, 12) << 8) | get_byte(a, 13) -- 27.1 ms
    , get_byte(a, 11) * 65536 + get_byte(a, 12) * 256 + get_byte(a, 13) -- 27.1 ms
    FROM t;
    
  • 2021-01-05 16:09
    select get_byte(b, 11) * 65536 + get_byte(b, 12) * 256 + get_byte(b, 13)
    from (values ('12345678901234'::bytea)) s(b);
     ?column? 
    ----------
      3289908
    