问题
Below is an example of what I'm trying to achieve in a Redshift Database.
I have a variable current_value
and I want to create a new column value_desired
that is:
- the same as
current_value
if the previous row is null - equal to the last preceding non-null value if the previous row is non-null
It sounds like an easy task but I haven't found a way to do it yet.
row_numb current_value value_desired
1
2
3 47 47
4
5 45 45
6
7
8 42 42
9 41 42
10 40 42
11 39 42
12 38 42
13
14 36 36
15
16
17 33 33
18 32 33
I've tried with the LAG() function but I can only get the previous value (not the first in the "non-null" block), here is my take:
SELECT *
, CASE WHEN current_value is not null and LAG(current_value) is null THEN current_value
WHEN current_value is not null and LAG(current_value) is not null
THEN LAG(current_value)
ELSE NULL END AS value_desired
FROM test1
Any help is much appreciated, thanks.
回答1:
Here is the correct answer, which gives the right results. There are a few clever tricks here, i suggest you take a careful look through and let me know what needs clarifications
create test data, as per your question.
drop table if exists test_table ;
create table test_table (row_num int,current_value int);
insert into test_table(row_num, current_value)
values
(1,null),
(2,null),
(3,47),
(4,null),
(5,45),
(6,null),
(7,null),
(8 ,42),
(9 ,41),
(10,40 ),
(11,39 ),
(12,38 ),
(13,null),
(14,36),
(15,null),
(16,null),
(17 ,33),
(18,32 )
;
Then run this code
SELECT DISTINCT
j1.row_num,
CASE WHEN j1.current_value IS NULL
THEN NULL
ELSE
last_value(j2.current_value)
OVER (
PARTITION BY j1.row_num
ORDER BY j2.row_num
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) END AS value_desired
FROM test_table AS j1
LEFT JOIN (SELECT
row_num,
current_value,
lag(current_value, 1)
OVER (
ORDER BY row_num ) AS prev_cval
FROM test_table) AS j2
ON j1.row_num >= j2.row_num AND j2.current_value IS NOT NULL
AND j2.prev_cval IS NULL
ORDER BY j1.row_num;
回答2:
Use FIRST_VALUE() instead of LAG()
refer: FIRST_VALUE and LAST_VALUE Window Functions
回答3:
Our test data
DROP TABLE IF EXISTS test_table;
CREATE TABLE test_table (
row_num INT,
current_value INT
);
INSERT INTO test_table (row_num, current_value)
VALUES
(1, NULL),
(2, NULL),
(3, 47),
(4, NULL),
(5, 45),
(6, NULL),
(7, NULL),
(8, 42),
(9, 41),
(10, 40),
(11, 39),
(12, 38),
(13, NULL),
(14, 36),
(15, NULL),
(16, NULL),
(17, 33),
(18, 32);
What we know:
- when current_value is not null and preceding current_value is null, desired_value is equal current_value
- let's refer to this as first_desired_value, as per q_first_desired_value subquery below
- when that first_desired_value is not null, it is our desired_value
- we just need to propagate that first_desired_value to other rows, when current_value is not null
- when current_value is not null, desired value is the last first_desired_value from all preceding rows (beside the current row) and excluding NULL values that might have been added in previous frames
Given the above, the query would be as follows
WITH q_first_desired_value AS
(
SELECT
row_num,
current_value,
CASE WHEN LAG(current_value, 1)
OVER (
ORDER BY row_num ) IS NULL
THEN current_value
ELSE NULL END AS first_desired_value
FROM test_table
ORDER BY row_num
)
SELECT
row_num,
current_value,
CASE WHEN first_desired_value IS NOT NULL
THEN first_desired_value
WHEN current_value IS NOT NULL
THEN LAST_VALUE(first_desired_value) IGNORE NULLS
OVER (
ORDER BY row_num ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING ) END AS desired_value
FROM q_first_desired_value;
来源:https://stackoverflow.com/questions/47407785/lag-to-get-first-non-null-value-since-the-previous-null-value