Question
I have a table that records an error type and the line number where each error occurred. (The process that produces it is irrelevant here.) I need to group by error type and show the start line and end line of each run of consecutive lines, giving one range per run. A gap in the line numbers must start a new range.
My table and queries were:
create table errors (
    err_type varchar(10),
    line     integer
);
insert into errors values
    ('type_A', 1), ('type_A', 2), ('type_A', 3),
    ('type_A', 6), ('type_A', 7),
    ('type_B', 9), ('type_B', 10),
    ('type_B', 12), ('type_B', 13), ('type_B', 14), ('type_B', 15),
    ('type_C', 21);
select * from errors;
My data:
err_type  line
--------------
type_A       1
type_A       2
type_A       3
type_A       6
type_A       7
type_B       9
type_B      10
type_B      12
type_B      13
type_B      14
type_B      15
type_C      21
I need a query to do this:
err_type  line_start  line_end
------------------------------
type_A             1         3
type_A             6         7
type_B             9        10
type_B            12        15
type_C            21        21
I'm using PostgreSQL, but Oracle has similar syntax for window functions (OVER with PARTITION BY).
Any suggestions?
Answer 1:
This is a gaps-and-islands problem. I think the simplest method is row_number() and group by:
select err_type, min(line) as line_start, max(line) as line_end
from (select e.*,
             row_number() over (partition by err_type order by line) as seqnum
      from errors e
     ) e
-- line - seqnum is constant within a run of consecutive lines
group by err_type, (line - seqnum)
order by err_type, min(line);
Here is a db<>fiddle.
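To see why grouping by line - seqnum works, here is a minimal Python sketch of the same idea. It is only an illustration: the function name ranges_by_row_number and the in-memory rows list are assumptions made for this example, and it assumes line numbers are distinct within each error type.

```python
from itertools import groupby

# Sample rows from the question: (err_type, line).
rows = [
    ("type_A", 1), ("type_A", 2), ("type_A", 3),
    ("type_A", 6), ("type_A", 7),
    ("type_B", 9), ("type_B", 10),
    ("type_B", 12), ("type_B", 13), ("type_B", 14), ("type_B", 15),
    ("type_C", 21),
]


def ranges_by_row_number(rows):
    """Group consecutive lines per err_type using the line - seqnum key."""
    result = []
    # Sorting puts each err_type together with its lines in ascending order.
    for err_type, part in groupby(sorted(rows), key=lambda r: r[0]):
        lines = [line for _, line in part]
        # seqnum mimics row_number() over (partition by err_type order by line).
        keyed = [(line - seqnum, line) for seqnum, line in enumerate(lines, start=1)]
        # Within a run of consecutive lines, line - seqnum does not change.
        for _, island in groupby(keyed, key=lambda k: k[0]):
            island_lines = [line for _, line in island]
            result.append((err_type, min(island_lines), max(island_lines)))
    return result


for row in ranges_by_row_number(rows):
    print(row)
```

For type_A the keys are 0, 0, 0, 2, 2, so lines 1 to 3 and lines 6 to 7 fall into separate groups.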
Answer 2:
You could build up a query like this:
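with base as (
    select errors.*,
           -- 0 when the line directly follows the previous one, 1 after a gap
           -- (the first row of a partition is compared against the default 1)
           sign(line - 1 - lag(line, 1, 1) over (
               partition by err_type
               order by line)) as is_start
    from errors
), parts as (
    select base.*,
           -- the running sum changes only at a gap, so it labels each range
           sum(is_start) over (
               partition by err_type
               order by line) as part
    from base
)
select err_type,
       min(line) as line_start,
       max(line) as line_end
from parts
group by err_type, part
order by err_type, part;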
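The flag-then-running-sum idea can be sketched in Python as follows. This is a simplified illustration rather than a translation of the query: it sets is_start to 1 at the first row of every range (including the first row of a partition) instead of using sign() with a lag default, and the name ranges_by_running_sum is hypothetical.

```python
# Sample rows from the question: (err_type, line).
rows = [
    ("type_A", 1), ("type_A", 2), ("type_A", 3),
    ("type_A", 6), ("type_A", 7),
    ("type_B", 9), ("type_B", 10),
    ("type_B", 12), ("type_B", 13), ("type_B", 14), ("type_B", 15),
    ("type_C", 21),
]


def ranges_by_running_sum(rows):
    """Flag the start of each range, then label ranges with a running sum."""
    islands = {}  # (err_type, part) -> [line_start, line_end]
    prev_type = prev_line = None
    part = 0
    for err_type, line in sorted(rows):
        if err_type != prev_type:
            part = 0  # the running sum restarts in each partition
        # is_start is 1 for the first row of a partition or after a gap.
        is_start = 1 if err_type != prev_type or line - prev_line > 1 else 0
        part += is_start
        span = islands.setdefault((err_type, part), [line, line])
        span[1] = line  # lines arrive in ascending order, so this is the max
        prev_type, prev_line = err_type, line
    # dicts keep insertion order, so results come out by err_type, then part.
    return [(t, lo, hi) for (t, _), (lo, hi) in islands.items()]


for row in ranges_by_running_sum(rows):
    print(row)
```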
Answer 3:
If you don't want to use window or aggregate functions, self-joins can find the start and end of each range:
WITH
table_min AS
(
    -- lines with no predecessor: the start of each range
    SELECT a.err_type, a.line
    FROM errors a
    LEFT JOIN errors b ON a.err_type = b.err_type AND a.line = b.line + 1
    WHERE b.err_type IS NULL
),
table_max AS
(
    -- lines with no successor: the end of each range
    SELECT a.err_type, a.line
    FROM errors a
    LEFT JOIN errors b ON a.err_type = b.err_type AND a.line + 1 = b.line
    WHERE b.err_type IS NULL
)
SELECT a.err_type, a.line AS line_start, b.line AS line_end
FROM table_min a
INNER JOIN table_max b ON a.err_type = b.err_type AND a.line <= b.line
-- keep only the nearest end: no other end may lie between this start and b.line
WHERE NOT EXISTS (
    SELECT 1
    FROM table_max c
    WHERE c.err_type = a.err_type
      AND c.line >= a.line
      AND c.line < b.line
)
ORDER BY a.err_type, a.line;
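The boundary idea behind this answer (a range starts at a line with no predecessor and ends at a line with no successor) can be sketched in Python. Pairing the i-th start with the i-th end after sorting is an assumption of this sketch, not the join logic of the query, and the name ranges_by_boundaries is hypothetical.

```python
# Sample rows from the question: (err_type, line).
rows = [
    ("type_A", 1), ("type_A", 2), ("type_A", 3),
    ("type_A", 6), ("type_A", 7),
    ("type_B", 9), ("type_B", 10),
    ("type_B", 12), ("type_B", 13), ("type_B", 14), ("type_B", 15),
    ("type_C", 21),
]


def ranges_by_boundaries(rows):
    """Pair each range start (no predecessor) with its end (no successor)."""
    present = set(rows)
    # A start has no row with the same err_type on line - 1.
    starts = sorted((t, n) for t, n in present if (t, n - 1) not in present)
    # An end has no row with the same err_type on line + 1.
    ends = sorted((t, n) for t, n in present if (t, n + 1) not in present)
    # Every range has exactly one start and one end, so after sorting
    # the i-th start belongs with the i-th end.
    return [(t, start, end) for (t, start), (_, end) in zip(starts, ends)]


for row in ranges_by_boundaries(rows):
    print(row)
```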
Source: https://stackoverflow.com/questions/55658675/min-and-max-values-grouping-by-consecutive-ranges