Dynamically generate columns for crosstab in PostgreSQL

慢半拍i 2020-11-27 05:12

I am trying to create crosstab queries in PostgreSQL such that it automatically generates the crosstab columns instead of hardcoding it. I have wri

4 Answers
  • 2020-11-27 05:31

    @erwin-brandstetter: The return type of the function isn't an issue if you're always returning a JSON type with the converted results.

    Here is the function I came up with:

    CREATE OR REPLACE FUNCTION report.test(
        i_start_date TIMESTAMPTZ,
        i_end_date TIMESTAMPTZ,
        i_interval INT
        ) RETURNS TABLE (
        tab JSON
        ) AS $ab$
    DECLARE
        _key_id TEXT;
        _text_op TEXT := '';
    BEGIN
        -- Collect the distinct category names, sorted so that the column list
        -- lines up with the category order of the crosstab source query below
        FOR _key_id IN
        SELECT DISTINCT at_name
          FROM report.company_data_date cd 
          JOIN report.company_data_amount cda ON cd.id = cda.company_data_date_id 
          JOIN report.amount_types at ON cda.amount_type_id  = at.id 
         WHERE date_start BETWEEN i_start_date AND i_end_date
           AND interval_type_id = i_interval
         ORDER BY 1
        LOOP
        -- build the column definition list: one NUMERIC(20,2) column per category
            IF _text_op <> '' THEN
                _text_op := _text_op || ', ';
            END IF;
            -- quote_ident() keeps names with spaces or upper case valid as identifiers
            _text_op := _text_op || quote_ident(_key_id) || ' NUMERIC(20,2)';
        END LOOP;
        -- build the crosstab query with the parameter filters inlined, then run it
        RETURN QUERY
        EXECUTE '
            SELECT array_to_json(array_agg(row_to_json(t)))
              FROM (
            SELECT * FROM crosstab(''SELECT date_start, at.at_name,  cda.amount ct 
              FROM report.company_data_date cd 
              JOIN report.company_data_amount cda ON cd.id = cda.company_data_date_id 
              JOIN report.amount_types at ON cda.amount_type_id  = at.id 
             WHERE date_start between $$' || i_start_date::TEXT || '$$ AND $$' || i_end_date::TEXT || '$$ 
               AND interval_type_id = ' || i_interval::TEXT || ' ORDER BY 1, 2'') 
                AS ct (date_start timestamptz, ' || _text_op || ')
                 ) t;';
    END;
    $ab$ LANGUAGE plpgsql;
    
    

    So, when you run it, you get the dynamic results in JSON, and you don't need to know how many values were pivoted:

    select * from report.test(now()- '1 week'::interval, now(), 1);
                                                                                                                         tab                                                                                                                      
    ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
     [{"date_start":"2015-07-27T08:40:01.277556-04:00","burn_rate":0.00,"monthly_revenue":5800.00,"cash_balance":0.00},{"date_start":"2015-07-27T08:50:02.458868-04:00","burn_rate":34000.00,"monthly_revenue":15800.00,"cash_balance":24000.00}]
    (1 row)
    

    Edit: If you have mixed datatypes in your crosstab, you can add logic to look up the type of each column with something like this:

      SELECT a.attname AS column_name, format_type(a.atttypid, a.atttypmod) AS data_type 
        FROM pg_attribute a 
        JOIN pg_class b ON (a.attrelid = b.oid) 
        JOIN pg_catalog.pg_namespace n ON n.oid = b.relnamespace 
       WHERE n.nspname = $$schema_name$$ AND b.relname = $$table_name$$
         AND a.attnum > 0 AND NOT a.attisdropped;
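
    One way such a lookup could feed the column definition list is to aggregate the names and types into a single string. This is only a sketch: schema_name.table_name is the same placeholder as in the query above, and the column alias column_definitions is made up for illustration.

    ```sql
    -- Sketch: build a "name type, name type, ..." list from the system catalog.
    -- 'schema_name.table_name' is a placeholder, not a real table.
    SELECT string_agg(
             format('%I %s', a.attname, format_type(a.atttypid, a.atttypmod)),
             ', ' ORDER BY a.attnum
           ) AS column_definitions
      FROM pg_attribute a
     WHERE a.attrelid = 'schema_name.table_name'::regclass
       AND a.attnum > 0          -- skip system columns
       AND NOT a.attisdropped;   -- skip dropped columns
    ```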
    
  • 2020-11-27 05:41

    You can use the provided C function crosstab_hash for this.

    The manual is not very clear in this respect. It's mentioned at the end of the chapter on crosstab() with two parameters:

    You can create predefined functions to avoid having to write out the result column names and types in each query. See the examples in the previous section. The underlying C function for this form of crosstab is named crosstab_hash.

    For your example:

    CREATE OR REPLACE FUNCTION f_cross_test_db(text, text)
      RETURNS TABLE (kernel_id int, key1 int, key2 int, key3 int)
      AS '$libdir/tablefunc','crosstab_hash' LANGUAGE C STABLE STRICT;
    

    Call:

    SELECT * FROM f_cross_test_db(
          'SELECT kernel_id, key, value FROM test_db ORDER BY 1,2'
         ,'SELECT DISTINCT key FROM test_db ORDER BY 1');
    

    Note that you need to create a distinct crosstab_hash function for every crosstab function with a different return type.
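
    To illustrate that point, a second wrapper for a result with text values might be declared as below. The function name f_cross_test_db_text and the text column types are hypothetical; only the library path and the C symbol crosstab_hash come from the definition above. Creating LANGUAGE C functions normally requires superuser privileges.

    ```sql
    -- Hypothetical second wrapper: same C function, different result row type.
    CREATE OR REPLACE FUNCTION f_cross_test_db_text(text, text)
      RETURNS TABLE (kernel_id int, key1 text, key2 text, key3 text)
      AS '$libdir/tablefunc', 'crosstab_hash' LANGUAGE C STABLE STRICT;
    ```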

    Related:

    • PostgreSQL row to columns

    Your function to generate the column list is rather convoluted, and its result is incorrect (int is missing after kernel_id). It can be replaced with this SQL query:

    SELECT 'kernel_id int, '
           || string_agg(DISTINCT key::text, ' int, '  ORDER BY key::text)
           || ' int, DUMMY text'
    FROM   test_db;
    

    And it cannot be used dynamically anyway.

  • 2020-11-27 05:49

    The approach described here worked well for me. Instead of retrieving the pivot table directly, the easier approach is to let the function generate a SQL query string, and then execute that string dynamically on demand.
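
    A minimal sketch of that idea, reusing the test_db table (kernel_id, key, value) from another answer on this page and assuming int values. The function name f_crosstab_sql is hypothetical, and this is not necessarily how the approach referred to above is written:

    ```sql
    -- Returns the text of a crosstab query instead of running it.
    CREATE OR REPLACE FUNCTION f_crosstab_sql()  -- hypothetical name
      RETURNS text
      LANGUAGE sql STABLE AS
    $func$
    SELECT format(
             'SELECT * FROM crosstab(%L, %L) AS ct (kernel_id int, %s)',
             -- source query: row name, category, value
             'SELECT kernel_id, key, value FROM test_db ORDER BY 1, 2',
             -- category query: defines the order of the output columns
             'SELECT DISTINCT key FROM test_db ORDER BY 1',
             -- column definition list, in the same order as the category query
             string_agg(format('%I int', key), ', ' ORDER BY key)
           )
    FROM  (SELECT DISTINCT key FROM test_db) AS k;
    $func$;
    ```

    In psql the returned string can be run directly with \gexec, for example SELECT f_crosstab_sql() \gexec. If test_db is empty the generated column list is incomplete and the resulting query fails, so a real version should guard against that case.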

  • 2020-11-27 05:52

    I realise this is an older post, but I struggled for a little while with the same issue.

    My problem statement: I had a table with multiple values in a field and wanted to create a crosstab query with 40+ column headings per row.

    My solution was to create a function that loops through the table column to collect the values I wanted to use as column headings in the crosstab query.

    Within this function I could then create the crosstab query. In my use case I stored the crosstab result in a separate table.

    E.g.

    CREATE OR REPLACE FUNCTION field_values_ct ()
     RETURNS VOID AS $$
    DECLARE
        rec RECORD;
        str text;
    BEGIN
        str := '"Issue ID" text';
        -- loop over the distinct field names to build the column definition list
        FOR rec IN SELECT DISTINCT field_name
            FROM issue_fields
            ORDER BY field_name
        LOOP
            -- quote_ident() double-quotes the heading where needed
            str := str || ', ' || quote_ident(rec.field_name) || ' text';
        END LOOP;
    
        -- (re)create the result table from the crosstab output
        EXECUTE 'CREATE EXTENSION IF NOT EXISTS tablefunc;
        DROP TABLE IF EXISTS temp_issue_fields;
        CREATE TABLE temp_issue_fields AS
        SELECT *
        FROM crosstab(''select issue_id, field_name, field_value from issue_fields order by 1'',
                     ''SELECT DISTINCT field_name FROM issue_fields ORDER BY 1'')
             AS final_result ('|| str ||')';
    END;
    $$ LANGUAGE plpgsql;
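
    For illustration, the function could be exercised roughly as follows. The sample rows and the text column types of issue_fields are assumptions, since the actual table definition is not shown here.

    ```sql
    -- Hypothetical sample data; the column types are assumptions.
    CREATE TABLE issue_fields (
        issue_id    text,
        field_name  text,
        field_value text
    );

    INSERT INTO issue_fields VALUES
        ('A-1', 'Priority', 'High'),
        ('A-1', 'Status',   'Open'),
        ('A-2', 'Status',   'Closed');

    SELECT field_values_ct();          -- (re)builds temp_issue_fields
    SELECT * FROM temp_issue_fields;   -- one row per issue, one column per field_name
    ```

    Because the result table is dropped and recreated on every call, the function has to be run again whenever new field names appear in issue_fields.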
    