AWS Redshift Pivot Table all Dimensions

闹比i 2021-01-25 16:53

I am following this method to pivot a large table in Redshift:

Pivot a table with Amazon RedShift / PostgreSQL

However, I have a large number of groups to pivot, i.e.

1 Answer
  • 2021-01-25 17:27

    If you want to pivot on an arbitrary number of groups, you can combine the groups into a JSON string and then extract the groups you are interested in with the Redshift JSON functions. You probably do not want to do this for very large data sets.

    Here is the basic idea based on the sample data in the question linked above:

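    -- Outer select: pull individual metrics back out of the JSON string.
    select DimensionA, DimensionB,
        json_extract_path_text(json_pivot, 'm1') as m1,
        json_extract_path_text(json_pivot, 'm2') as m2
    from (
        -- Inner select: one JSON object string per dimension pair.
        -- Assumes names and values contain no characters that need JSON escaping.
        select DimensionA, DimensionB,
            '{' || listagg(quote_ident(MetricName) || ':' || quote_ident(MetricValue), ',')
                   within group (order by MetricName) || '}' as json_pivot
        from to_pivot
        group by DimensionA, DimensionB
    ) as pivoted;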
    

    In practice you would not want to run it like that. The inner select is what you would use to generate your "pivoted" table, and the outer select shows how to reference specific group values.
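
    To make the two steps easier to see, here is a rough pure-Python model of the same idea. It is not Redshift code: the rows are made-up sample data, and `build_json_pivot` and `extract` are hypothetical helpers that stand in for the inner and outer selects. A missing metric is represented as `None` here, which says nothing about what Redshift itself returns.

```python
import json
from collections import defaultdict

# Hypothetical long-format rows: (DimensionA, DimensionB, MetricName, MetricValue)
rows = [
    ("dimA1", "dimB1", "m1", "v1"),
    ("dimA1", "dimB1", "m2", "v2"),
    ("dimA1", "dimB2", "m1", "v3"),
]


def build_json_pivot(rows):
    """Model of the inner select: one JSON object string per dimension pair."""
    grouped = defaultdict(dict)
    for dim_a, dim_b, name, value in rows:
        grouped[(dim_a, dim_b)][name] = value
    # sort_keys mirrors "within group (order by MetricName)"
    return {dims: json.dumps(metrics, sort_keys=True) for dims, metrics in grouped.items()}


def extract(json_pivot, key):
    """Model of the outer select: look up one metric, None if it is absent."""
    return json.loads(json_pivot).get(key)


pivoted = build_json_pivot(rows)
result = [
    (dim_a, dim_b, extract(js, "m1"), extract(js, "m2"))
    for (dim_a, dim_b), js in sorted(pivoted.items())
]
for line in result:
    print(line)
```

    The point of the split is that `pivoted` is built once, while the list of metric names in the final step can change without regrouping the data.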

    This does not account for duplicate group records for the same dimension combination, like the following:

    DimensionA  DimensionB  MetricName  MetricValue
    ----------  ----------  ----------  -----------
    dimA1       dimB2       m1          v13
    dimA1       dimB2       m1          v23
    

    If that can occur in your data, you will have to decide how to handle it. I am not sure how the query behaves as implemented; my guess is that the first occurrence would be extracted.
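
    One way to avoid relying on that guess is to collapse duplicates explicitly before aggregating. The sketch below is a pure-Python illustration of that idea using the two rows above; `dedupe` and its `policy` argument are hypothetical names, and the same rule would still have to be written in SQL for Redshift.

```python
# The duplicate rows from the example above:
# (DimensionA, DimensionB, MetricName, MetricValue)
rows = [
    ("dimA1", "dimB2", "m1", "v13"),
    ("dimA1", "dimB2", "m1", "v23"),
]


def dedupe(rows, policy="first"):
    """Keep one value per (DimensionA, DimensionB, MetricName).

    policy "first" keeps the earliest row, "last" keeps the latest,
    and "error" raises as soon as a duplicate is seen.
    """
    kept = {}
    for dim_a, dim_b, name, value in rows:
        key = (dim_a, dim_b, name)
        if key in kept:
            if policy == "error":
                raise ValueError(f"duplicate metric for {key}")
            if policy == "first":
                continue  # ignore later rows for the same key
        kept[key] = value
    return [key + (value,) for key, value in kept.items()]


print(dedupe(rows))          # keeps v13
print(dedupe(rows, "last"))  # keeps v23
```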

    This could probably also be done with a combination of LISTAGG and REGEXP_SUBSTR, using two custom delimiters.
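
    A rough model of the two-delimiter variant, again in plain Python rather than Redshift SQL. The delimiters `|` and `=` are assumptions and must not appear in any metric name or value; `aggregate` and `extract` are hypothetical stand-ins for the LISTAGG step and the regex lookup.

```python
import re

PAIR_SEP = "|"  # assumed delimiter between name/value pairs
KV_SEP = "="    # assumed delimiter between a name and its value


def aggregate(metrics):
    """Model of the LISTAGG step: join sorted name/value pairs into one string."""
    return PAIR_SEP.join(f"{name}{KV_SEP}{value}" for name, value in sorted(metrics))


def extract(agg, name):
    """Model of the regex lookup: return the value stored for `name`, or None."""
    pattern = (
        r"(?:^|" + re.escape(PAIR_SEP) + r")"    # start of string or a pair boundary
        + re.escape(name) + re.escape(KV_SEP)    # the exact metric name, then '='
        + r"([^" + re.escape(PAIR_SEP) + r"]*)"  # capture up to the next pair boundary
    )
    match = re.search(pattern, agg)
    return match.group(1) if match else None


agg = aggregate([("m2", "v2"), ("m1", "v1"), ("m10", "v10")])
print(agg)
print(extract(agg, "m1"), extract(agg, "m10"), extract(agg, "m3"))
```

    Anchoring the name on a pair boundary and on the key/value delimiter is what keeps `m1` from matching inside `m10`.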

    Using varchar(max) for the JSON column type gives 65535 bytes, which should be room for a couple thousand categories.
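
    As a back-of-the-envelope check on that estimate, the arithmetic can be sketched like this. The 12-character name and value lengths are assumptions picked for illustration, and the calculation assumes single-byte characters; `max_entries` is a hypothetical helper.

```python
VARCHAR_MAX_BYTES = 65535  # size of varchar(max) as stated above


def max_entries(name_len, value_len, limit=VARCHAR_MAX_BYTES):
    """Upper bound on "name":"value" pairs that fit in `limit` bytes.

    Each pair costs name_len + value_len + 5 bytes (four quotes and a colon),
    pairs are joined by n - 1 commas, and the object adds two braces:
        total = 1 + n * (name_len + value_len + 6)
    Assumes single-byte (ASCII) characters.
    """
    per_entry = name_len + value_len + 6
    return (limit - 1) // per_entry


print(max_entries(12, 12))  # 2184
```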

    Explained slightly differently here.
