Question
Does anyone know how to replace the nulls in a column with the most recent string above them, so that each new string in turn replaces the null values below it? I have a column that looks like this:
Original Column:
PAST_DUE_COL
91 or more days pastdue
Null
Null
61-90 days past due
Null
Null
31-60 days past due
Null
0-30 days past due
Null
Null
Null
Expected Result Column:
PAST_DUE_COL
91 or more days past due
91 or more days past due
91 or more days past due
61-90 days past due
61-90 days past due
61-90 days past due
31-60 days past due
31-60 days past due
0-30 days past due
0-30 days past due
0-30 days past due
0-30 days past due
Essentially I want the first string in the column to replace all null values below it until the next string. Then that string will replace all nulls below it until the next string and so on.
Answer 1:
SQL Server did not support the ignore nulls option for window functions such as lead() and lag() until SQL Server 2022; that option would have been a nice fit for this question.
On earlier versions, we can work around this with a gaps-and-islands technique:
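    select
        t.*,
        -- each group holds exactly one non-null value, so max() returns it
        max(past_due_col) over(partition by grp) new_past_due_col
    from (
        select
            t.*,
            -- running count of non-null values: increments at each new string
            sum(case when past_due_col is null then 0 else 1 end)
                over(order by id) grp
        from mytable t
    ) t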
The subquery computes a window sum that increments every time a non-null value is found: this defines groups of rows, each made of one non-null value followed by the null values below it.
Then, the outer query uses a window max() to retrieve the (only) non-null value in each group.
This assumes that a column can be used to order the records (I called it id).
Demo on DB Fiddle:
ID | PAST_DUE_COL            | grp | new_past_due_col
-: | :---------------------- | --: | :----------------------
 1 | 91 or more days pastdue |   1 | 91 or more days pastdue
 2 | null                    |   1 | 91 or more days pastdue
 3 | null                    |   1 | 91 or more days pastdue
 4 | 61-90 days past due     |   2 | 61-90 days past due
 5 | null                    |   2 | 61-90 days past due
 6 | null                    |   2 | 61-90 days past due
 7 | 31-60 days past due     |   3 | 31-60 days past due
 8 | null                    |   3 | 31-60 days past due
 9 | 0-30 days past due      |   4 | 0-30 days past due
10 | null                    |   4 | 0-30 days past due
11 | null                    |   4 | 0-30 days past due
12 | null                    |   4 | 0-30 days past due
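To try the grouping query without a SQL Server instance, here is a minimal sketch that runs the same window-function logic through Python's built-in sqlite3 module. It is an illustration under assumptions, not part of the original answer: SQLite stands in for SQL Server (window functions need SQLite 3.25 or later), and the id values are hypothetical, added only to give the rows an order.

```python
import sqlite3

# Hypothetical stand-in data: the question's column plus an assumed `id`
# column that defines the row order.
rows = [
    (1, "91 or more days pastdue"), (2, None), (3, None),
    (4, "61-90 days past due"), (5, None), (6, None),
    (7, "31-60 days past due"), (8, None),
    (9, "0-30 days past due"), (10, None), (11, None), (12, None),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table mytable (id integer primary key, past_due_col text)")
conn.executemany("insert into mytable values (?, ?)", rows)

query = """
select t.id,
       t.grp,
       max(t.past_due_col) over (partition by t.grp) as new_past_due_col
from (
    select m.*,
           -- running count of non-null values defines the group number
           sum(case when m.past_due_col is null then 0 else 1 end)
               over (order by m.id) as grp
    from mytable m
) t
order by t.id
"""

filled = conn.execute(query).fetchall()
for row in filled:
    print(row)
```

Each printed tuple is (id, grp, new_past_due_col); the grp values should line up with the demo table above.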
Answer 2:
This is a variation on GMB's answer. It is just a bit simpler, because count(past_due_col) counts only the non-null values:
    select t.*,
           max(past_due_col) over (partition by grp) as new_past_due_col
    from (select t.*,
                 count(past_due_col) over (order by id) as grp
          from mytable t
         ) t;
Note that you need an ordering column of some sort for your question to even make sense.
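As a quick check that the count() form yields the same group numbers as the sum(case ...) form, the sketch below computes both side by side. It assumes SQLite (via Python's sqlite3, version 3.25 or later for window functions) as a stand-in for SQL Server, and the six sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table mytable (id integer primary key, past_due_col text)")
# Hypothetical sample rows: three strings with nulls between them.
conn.executemany(
    "insert into mytable values (?, ?)",
    [(1, "A"), (2, None), (3, None), (4, "B"), (5, None), (6, "C")],
)

pairs = conn.execute("""
    select count(past_due_col) over (order by id) as grp_count,
           sum(case when past_due_col is null then 0 else 1 end)
               over (order by id) as grp_sum
    from mytable
    order by id
""").fetchall()

print(pairs)
# -> [(1, 1), (1, 1), (1, 1), (2, 2), (2, 2), (3, 3)]
```

Both columns agree on every row, since an aggregate count over a column skips nulls.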
Another approach uses apply:
    select t.*, t2.past_due_col
    from mytable t outer apply
         (select top (1) t2.*
          from mytable t2
          where t2.id <= t.id and t2.past_due_col is not null
          order by t2.id desc
         ) t2;
Answer 3:
If you have an id column and lead/lag is not available you could use:
    -- <> '' skips NULLs as well as empty strings
    SELECT (select top 1 PAST_DUE_COL from MyTablename
            where id <= t.id and PAST_DUE_COL <> '' order by id desc) AS PAST_DUE_COL
    FROM MyTablename T
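The lookup approach (the most recent non-empty value at or before each row) can be sketched the same way. This is only an illustration under assumptions: it runs on SQLite through Python's sqlite3, so limit 1 replaces the T-SQL top 1, and the sample rows are hypothetical. One row holds an empty string to show that the <> '' filter skips it along with the nulls.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table MyTablename (id integer primary key, PAST_DUE_COL text)")
# Hypothetical sample rows: id 3 is an empty string rather than NULL.
conn.executemany(
    "insert into MyTablename values (?, ?)",
    [(1, "A"), (2, None), (3, ""), (4, "B"), (5, None)],
)

result = conn.execute("""
    select t.id,
           (select t2.PAST_DUE_COL
            from MyTablename t2
            where t2.id <= t.id and t2.PAST_DUE_COL <> ''
            order by t2.id desc
            limit 1) as filled
    from MyTablename t
    order by t.id
""").fetchall()

print(result)
# -> [(1, 'A'), (2, 'A'), (3, 'A'), (4, 'B'), (5, 'B')]
```

Note that this form runs one lookup per row, so on a large table an index on the ordering column is likely to matter.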
Source: https://stackoverflow.com/questions/60105702/how-to-make-lag-ignore-nulls-in-sql-server