Aggregating (x,y) coordinate point clouds in PostgreSQL

前端 未结 2 697
無奈伤痛
無奈伤痛 2020-12-01 20:42

I have a a PostgreSQL database table with the following simplified structure:

  • Device Id varchar
  • Pos_X (int)
  • Pos_Y (int)

Basica

相关标签:
2条回答
  • 2020-12-01 20:56

    Use the often overlooked built-in function width_bucket() in combination with your aggregation:

    If your coordinates run from, say, 0 to 2000 and you want to consolidate everything within squares of 5 to single points, I would lay out a grid of 10 (5*2) like this:

    SELECT device_id
         , width_bucket(pos_x, 0, 2000, 2000/10) * 10 AS pos_x
         , width_bucket(pos_y, 0, 2000, 2000/10) * 10 AS pos_y
         , count(*) AS ct -- or any other aggregate
    FROM   tbl
    GROUP  BY 1,2,3
    ORDER  BY 1,2,3;
    

    To minimize the error you could GROUP BY the grid as demonstrated, but save actual average coordinates:

    SELECT device_id
         , avg(pos_x)::int AS pos_x   -- save actual averages to minimize error
         , avg(pos_y)::int AS pos_y   -- cast if you need to
         , count(*)        AS ct      -- or any other aggregate
    FROM   tbl
    GROUP  BY
           device_id
         , width_bucket(pos_x, 0, 2000, 2000/10) * 10  -- aggregate by grid
         , width_bucket(pos_y, 0, 2000, 2000/10) * 10
    ORDER  BY 1,2,3;
    

    sqlfiddle demonstrating both alongside.

    Well, this particular case could be simpler:

    ...
    GROUP  BY
           device_id
         , (pos_x / 10) * 10          -- truncates last digit of an integer
         , (pos_y / 10) * 10
    ...
    

    But that's just because the demo grid size of 10 conveniently matches the decimal system. Try the same with a grid size of 17 or something ...


    Expand to timestamps

    You can expand this approach to cover date and timestamp values by converting them to unix epoch (number of seconds since '1970-1-1') with extract().

    SELECT extract(epoch FROM '2012-10-01 21:06:38+02'::timestamptz);
    

    When you are done, convert the result back to timestamp with time zone:

    SELECT timestamptz 'epoch' + 1349118398 * interval '1s';
    

    Or simply to_timestamp():

    SELECT to_timestamp(1349118398);
    
    0 讨论(0)
  • 2020-12-01 21:06
    select [some aggregates] group by (pos_x/5, pos_y/5); 
    

    Where instead of 5 you can have any number depending how much aggregation you need/

    0 讨论(0)
提交回复
热议问题