Aggregating (x,y) coordinate point clouds in PostgreSQL

無奈伤痛

I have a PostgreSQL database table with the following simplified structure:

Device_Id (varchar)
Pos_X (int)
Pos_Y (int)

Basica

2 Answers

再見小時候

2020-12-01 20:56
Use the often overlooked built-in function width_bucket() in combination with your aggregation:

If your coordinates run from, say, 0 to 2000 and you want to consolidate everything within squares of 5 into single points, I would lay out a grid with a cell size of 10 (5*2) like this:
```
SELECT device_id
     , width_bucket(pos_x, 0, 2000, 2000/10) * 10 AS pos_x
     , width_bucket(pos_y, 0, 2000, 2000/10) * 10 AS pos_y
     , count(*) AS ct -- or any other aggregate
FROM   tbl
GROUP  BY 1,2,3
ORDER  BY 1,2,3;
```
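To see how width_bucket() numbers the grid cells, here is a small sketch with a handful of made-up sample values (the column aliases `bucket` and `grid_x` are just illustrative names). It uses the same bounds as above, so each bucket is 10 wide:

```sql
-- Bucket numbers for a few sample values (values invented for illustration).
-- Range 0..2000 split into 2000/10 = 200 buckets, so each bucket is 10 wide.
SELECT x
     , width_bucket(x, 0, 2000, 2000/10)      AS bucket
     , width_bucket(x, 0, 2000, 2000/10) * 10 AS grid_x
FROM  (VALUES (-1), (0), (9), (10), (1999), (2000)) AS t(x)
ORDER  BY x;

--    x  | bucket | grid_x
--    -1 |      0 |      0
--     0 |      1 |     10
--     9 |      1 |     10
--    10 |      2 |     20
--  1999 |    200 |   2000
--  2000 |    201 |   2010
```

Note that bucket numbers start at 1, the upper bound is exclusive, and anything outside the range lands in bucket 0 or bucket count+1. So pick bounds that really cover your data, or out-of-range points get lumped together.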
To minimize the error you could GROUP BY the grid as demonstrated, but save actual average coordinates:
```
SELECT device_id
     , avg(pos_x)::int AS pos_x   -- save actual averages to minimize error
     , avg(pos_y)::int AS pos_y   -- cast if you need to
     , count(*)        AS ct      -- or any other aggregate
FROM   tbl
GROUP  BY
       device_id
     , width_bucket(pos_x, 0, 2000, 2000/10) * 10  -- aggregate by grid
     , width_bucket(pos_y, 0, 2000, 2000/10) * 10
ORDER  BY 1,2,3;
```
sqlfiddle demonstrating both alongside.
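If you want to try both queries locally, a minimal setup could look like the following. The sample rows are invented for illustration, and the table is created as a temporary table named `tbl` to match the queries above:

```sql
-- Hypothetical sample data; column names follow the question.
CREATE TEMP TABLE tbl (device_id varchar, pos_x int, pos_y int);
INSERT INTO tbl VALUES
   ('a',  1, 2)
 , ('a',  5, 8)   -- same 10x10 cell as the row above
 , ('a', 12, 3)   -- next cell in x direction
 , ('b',  1, 2);  -- other device, never merged with 'a'

-- Variant 1: report the grid position
SELECT device_id
     , width_bucket(pos_x, 0, 2000, 2000/10) * 10 AS pos_x
     , width_bucket(pos_y, 0, 2000, 2000/10) * 10 AS pos_y
     , count(*) AS ct
FROM   tbl
GROUP  BY 1,2,3
ORDER  BY 1,2,3;
-- a | 10 | 10 | 2
-- a | 20 | 10 | 1
-- b | 10 | 10 | 1

-- Variant 2: group by the grid, report average coordinates
SELECT device_id
     , avg(pos_x)::int AS pos_x
     , avg(pos_y)::int AS pos_y
     , count(*)        AS ct
FROM   tbl
GROUP  BY device_id
     , width_bucket(pos_x, 0, 2000, 2000/10)
     , width_bucket(pos_y, 0, 2000, 2000/10)
ORDER  BY 1,2,3;
-- a |  3 | 5 | 2
-- a | 12 | 3 | 1
-- b |  1 | 2 | 1
```

Both variants form the same groups. Only the reported coordinates differ: grid position in the first, mean of the actual points in the second.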

Well, this particular case could be simpler:
```
...
GROUP  BY
       device_id
     , (pos_x / 10) * 10          -- truncates last digit of an integer
     , (pos_y / 10) * 10
...
```
But that's just because the demo grid size of 10 conveniently matches the decimal system. Try the same with a grid size of 17 or something ...
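A quick sketch of what the integer-division variant returns, using a few made-up values (the alias `cell_x` is just an illustrative name):

```sql
-- Integer division in PostgreSQL truncates toward zero.
SELECT x
     , (x / 10) * 10 AS cell_x
FROM  (VALUES (-7), (0), (9), (10), (1234)) AS t(x)
ORDER  BY x;

--    x  | cell_x
--    -7 |      0
--     0 |      0
--     9 |      0
--    10 |     10
--  1234 |   1230
```

Two things to keep in mind. The cell is labelled by its lower edge here, while `width_bucket(...) * 10` labels it by its upper edge, so the groups are the same but the reported numbers are shifted by one cell. And because of the truncation toward zero, -7 ends up in the same cell as 0 to 9, which matters if your coordinates can be negative.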

Expand to timestamps

You can expand this approach to cover date and timestamp values by converting them to unix epoch (number of seconds since '1970-1-1') with extract().
```
SELECT extract(epoch FROM '2012-10-01 21:06:38+02'::timestamptz);
```
When you are done, convert the result back to timestamp with time zone:
```
SELECT timestamptz 'epoch' + 1349118398 * interval '1s';
```
Or simply to_timestamp():
```
SELECT to_timestamp(1349118398);
```
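Putting the pieces together, here is a minimal sketch that counts rows per 5-minute bin. It uses plain floor() arithmetic on the epoch value rather than width_bucket(), the bin width of 300 seconds is an arbitrary choice, and the sample timestamps and the alias `bin_start` are made up for illustration:

```sql
-- Count rows per 5-minute (300 s) bin.
SELECT to_timestamp(floor(extract(epoch FROM ts) / 300) * 300) AS bin_start
     , count(*) AS ct
FROM  (VALUES (timestamptz '2012-10-01 21:06:38+02')
            , (timestamptz '2012-10-01 21:08:10+02')
            , (timestamptz '2012-10-01 21:11:00+02')) AS t(ts)
GROUP  BY 1
ORDER  BY 1;

-- Two bins: 2012-10-01 19:05:00 UTC (ct = 2) and 19:10:00 UTC (ct = 1).
-- How bin_start is displayed depends on the session time zone.
```

Since epoch values count seconds from a fixed point in time, the bins are aligned to UTC, not to local wall-clock time. For bin widths that divide an hour evenly that makes no visible difference in most time zones.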
情深已故

2020-12-01 21:06
```
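SELECT pos_x/5 AS cell_x, pos_y/5 AS cell_y, count(*) AS ct  -- or any other aggregates
FROM   tbl GROUP BY pos_x/5, pos_y/5;  -- tbl: table name assumed, as in the answer above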
```
Instead of 5 you can use any number, depending on how much aggregation you need.