generate_series() method fails in Redshift

时光说笑 2020-11-27 22:01

When I run the SQL Query:

 select generate_series(0,g)
 from ( select date(date1) - date(date2) as g from mytable ) ss;

It returns an error:

7 Answers
  • 2020-11-27 22:19

    You are not using PostgreSQL. You are using Amazon Redshift.

    Amazon Redshift does not support generate_series when used with Redshift tables. It says it right there in the error message.

    Either use real PostgreSQL, or if you need Redshift's features, you must also work within the limitations of Redshift.

    Your second example works because it does not use any Redshift tables.

  • 2020-11-27 22:27

    The generate_series() function is not fully supported by Redshift. See the "Unsupported PostgreSQL functions" section of the developer guide.

    In the specific examples, the second query is executed entirely on the leader node as it does not need to scan any actual table data, while the first is trying to select data and as such would be executed on the compute node(s).
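
    A rough illustration of that distinction, as a sketch only: `mytable` is the table from the question, the literal range 0 to 5 is an arbitrary choice, and the exact behaviour may differ between Redshift versions.

    ```sql
    -- No user table is referenced, so the leader node can answer this alone.
    SELECT generate_series(0, 5);

    -- This reads from a Redshift table, so it has to run on the compute nodes,
    -- where generate_series() is not supported; expect this form to be rejected.
    SELECT generate_series(0, ss.g)
    FROM (SELECT date(date1) - date(date2) AS g FROM mytable) ss;
    ```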

    UPDATE:

    generate_series now works in Redshift, at least for queries like the following one that do not read from any Redshift table:

    SELECT CURRENT_DATE::TIMESTAMP  - (i * interval '1 day') as date_datetime 
    FROM generate_series(1,31) i 
    ORDER BY 1
    

    This generates one timestamp per day for the previous 31 days, from yesterday back to 31 days ago.

  • 2020-11-27 22:31

    You will need to use functions that are supported by the leader node. The trick is to apply the row_number() function to any table that has at least as many rows as the series you need. Let's say that we want to generate a date series from 10 days ago up to now:

       SELECT DATEADD('day', -n, (CURRENT_DATE+1)) AS generated_date
       FROM (SELECT ROW_NUMBER() OVER () AS n FROM my_table LIMIT 10) n
       ORDER BY generated_date DESC
    

    And we get:

    generated_date
    2020-06-24 00:00:00
    2020-06-23 00:00:00
    2020-06-22 00:00:00
    2020-06-21 00:00:00
    2020-06-20 00:00:00
    2020-06-19 00:00:00
    2020-06-18 00:00:00
    2020-06-17 00:00:00
    2020-06-16 00:00:00
    2020-06-15 00:00:00
    
  • 2020-11-27 22:33

    This works here (pg-9.3.3). Maybe your issue is just the result of a Redshift "feature"?

    CREATE TABLE mytable
            ( date1 timestamp
            , date2 timestamp
            );
    INSERT INTO mytable(date1,date2) VALUES
    ( '2014-03-30 12:00:00' , '2014-04-01 12:00:00' );
    
    SELECT  generate_series(0, ss.g) FROM
       ( SELECT date(date2) - date(date1) AS g
         FROM mytable
       ) ss ;
    
  • 2020-11-27 22:35

    Why it's not working was explained above. Still, the question "what can we do about this?" is open.

    If you develop a BI system on any platform (with generators supported or not), it is very handy to have dimension tables with sequences of numbers and dates. How can you create one in Redshift?

    1. In Postgres, produce the necessary sequence using generate_series()
    2. Export it to CSV
    3. Create a table with the same schema in Redshift
    4. Import the CSV from step 2 into Redshift
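
    The four steps above could look roughly like this. It is a sketch rather than a tested recipe: the file path, the S3 bucket, and the IAM role ARN are hypothetical placeholders, and it assumes the CSV is uploaded to Amazon S3 between steps 2 and 4, because Redshift's COPY loads from sources such as S3 rather than from a local file.

    ```sql
    -- Steps 1 and 2, run in PostgreSQL: generate one row per day and export it.
    -- COPY ... TO writes on the database server; psql's \copy is the
    -- client-side alternative. '/tmp/calendar.csv' is a hypothetical path.
    COPY (
        SELECT row_number() OVER (ORDER BY d) AS id,
               d::date                        AS date
        FROM generate_series('2017-01-01'::timestamp,
                             '2020-01-01'::timestamp,
                             interval '1 day') AS g(d)
    ) TO '/tmp/calendar.csv' WITH (FORMAT csv);

    -- Step 3, run in Redshift: create a table with the same columns.
    CREATE TABLE calendar (
        id   INTEGER NOT NULL,
        date DATE    NOT NULL
    );

    -- Step 4, run in Redshift after uploading the file to S3.
    -- Bucket name and role ARN below are placeholders, not real resources.
    COPY calendar
    FROM 's3://my-bucket/calendar.csv'
    IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
    FORMAT AS CSV;
    ```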

    Imagine you have created a very simple table called calendar:

     id, date
     1, 2017-01-01
     2, 2017-01-02
     ..., ...
     xxx, 2020-01-01
    

    So your query will look like this:

    SELECT t.id, t.date_1, t.date_2, c.id as date_id, c.date
    FROM mytable t
    JOIN calendar c
    ON c.date BETWEEN t.date_1::date AND t.date_2::date
    ORDER BY 1,4
    

    In the calendar table you can also store the first date of each week, month, and quarter, as well as weekday names (Mon, Tue, etc.), which makes such a table very effective for time-based aggregations.
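
    As an illustration, such extra columns could be derived in PostgreSQL while generating the calendar, before it is exported. The column names `week_start`, `month_start`, `quarter_start`, and `weekday` are made up for this sketch and are not part of the table shown earlier.

    ```sql
    -- Run in PostgreSQL. date_trunc('week', ...) returns the Monday of the week.
    SELECT row_number() OVER (ORDER BY d)  AS id,
           d::date                         AS date,
           date_trunc('week', d)::date     AS week_start,
           date_trunc('month', d)::date    AS month_start,
           date_trunc('quarter', d)::date  AS quarter_start,
           to_char(d, 'Dy')                AS weekday   -- 'Mon', 'Tue', ...
    FROM generate_series('2017-01-01'::timestamp,
                         '2020-01-01'::timestamp,
                         interval '1 day') AS g(d);
    ```

    With columns like these, a weekly or monthly report becomes a plain GROUP BY on `week_start` or `month_start` after joining to the calendar.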

  • 2020-11-27 22:37

    You are correct that this does not work on Redshift.

    You could use something like this, which builds the numbers from a cross join of digits:

    with ten_numbers as (
        select 0 as num union select 1 union select 2 union select 3 union select 4
        union select 5 union select 6 union select 7 union select 8 union select 9
    )
    ,generated_numbers AS
    (
        -- four digits combine into 0..9999; subtracting 5000 shifts that to -5000..4999
        SELECT (1000*t1.num) + (100*t2.num) + (10*t3.num) + t4.num-5000 as gen_num
        FROM ten_numbers AS t1
          JOIN ten_numbers AS t2 ON 1 = 1
          JOIN ten_numbers AS t3 ON 1 = 1
          JOIN ten_numbers AS t4 ON 1 = 1
    )
    select  gen_num from generated_numbers
    where gen_num between -10 and 0
    order by 1;
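
    To connect this back to the question, the same digit cross join could stand in for generate_series(0, g) per row of `mytable`. This is a sketch under assumptions: the CTE and alias names are arbitrary, `date1` is not earlier than `date2`, and the difference stays below 10000 days.

    ```sql
    WITH digits AS (
        SELECT 0 AS d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3
        UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6
        UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
    ),
    numbers AS (
        -- every integer from 0 to 9999
        SELECT 1000*a.d + 100*b.d + 10*c.d + e.d AS n
        FROM digits a
        CROSS JOIN digits b
        CROSS JOIN digits c
        CROSS JOIN digits e
    )
    -- one output row per value 0..g, where g = date(date1) - date(date2)
    SELECT t.date1, t.date2, nums.n
    FROM mytable t
    JOIN numbers nums
      ON nums.n BETWEEN 0 AND date(t.date1) - date(t.date2)
    ORDER BY 1, 2, 3;
    ```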
    