Refactor a PL/pgSQL function to return the output of various SELECT queries

执笔经年
执笔经年 2020-11-22 00:05

I wrote a function that outputs a well-formed PostgreSQL SELECT query as text. Now I don't want to output text anymore, but actually run the generated <

4 answers
  • 2020-11-22 00:07
    # copy paste me into bash shell directly
    clear; IFS='' read -r -d '' sql_code << 'EOF_SQL_CODE'
    CREATE OR REPLACE FUNCTION func_get_all_users_roles()
      -- define the return type of the result set as table
      -- those datatypes must match the ones in the src
      RETURNS TABLE (
                     id           bigint
                   , email        varchar(200)
                   , password     varchar(200)
                   , roles        varchar(100)) AS
    $func$
    BEGIN
       RETURN QUERY 
       -- start the select clause
       SELECT users.id, users.email, users.password, roles.name as roles
       FROM user_roles
       LEFT JOIN roles ON (roles.guid = user_roles.roles_guid)
       LEFT JOIN users ON (users.guid = user_roles.users_guid)
       -- stop the select clause
    ;
    END
    $func$  LANGUAGE plpgsql;
    EOF_SQL_CODE
    # create the function
    psql -d db_name -c "$sql_code"; 
    
    # call the function 
    psql -d db_name -c "select * from func_get_all_users_roles() "
    
  • 2020-11-22 00:16

    You'll probably want to return a cursor. Try something like this (I haven't tried it):

    CREATE OR REPLACE FUNCTION data_of(integer)
      RETURNS refcursor AS
    $BODY$
    DECLARE
          --Declaring variables
          ref refcursor;
    BEGIN
          -- `sensors` and `type` must be declared and hold valid values (as in the question)
          -- a query built as a string needs OPEN ... FOR EXECUTE
          OPEN ref FOR EXECUTE 'SELECT Datahora,' || sensors ||
          ' FROM ' || type ||
          ' WHERE nomepcd=' || $1 ||' ORDER BY Datahora';
          RETURN ref;
    END;
    $BODY$
    LANGUAGE plpgsql VOLATILE;
    ALTER FUNCTION data_of(integer) OWNER TO postgres;
    
  • 2020-11-22 00:20

    I'm sorry to say, but your question is very unclear. However, below you'll find a self-contained example of how to create and use a function that returns a cursor variable. Hope it helps!

    begin;
    
    create table test (id serial, data1 text, data2 text);
    
    insert into test(data1, data2) values('one', 'un');
    insert into test(data1, data2) values('two', 'deux');
    insert into test(data1, data2) values('three', 'trois');
    
    create function generate_query(query_name refcursor, columns text[])
    returns refcursor 
    as $$
    begin
      open query_name for execute 
        'select id, ' || array_to_string(columns, ',') || ' from test order by id';
      return query_name;
    end;
    $$ language plpgsql;
    
    select generate_query('english', array['data1']);
    fetch all in english;
    
    select generate_query('french', array['data2']);
    fetch all in french;
    move absolute 0 from french; -- do it again !
    fetch all in french;
    
    select generate_query('all_langs', array['data1','data2']);
    fetch all in all_langs;
    
    -- this will raise an error at run time, as there is no data3 column in the test table
    select generate_query('broken', array['data3']);
    
    rollback;
    
  • 2020-11-22 00:22

    Dynamic SQL and RETURN type

    (I saved the best for last, keep reading!)
    You want to execute dynamic SQL. In principle, that's simple in plpgsql with the help of EXECUTE. You don't need a cursor. In fact, most of the time you are better off without explicit cursors.
    Find examples on SO with a search.
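
    To illustrate the point about EXECUTE, here is a minimal sketch of dynamic SQL without a cursor. The table demo_exec and its columns are hypothetical names made up for this illustration; they are not from the question.

    ```sql
    -- Hypothetical demo table, not from the question
    CREATE TEMP TABLE demo_exec (id int, val text);
    INSERT INTO demo_exec VALUES (1, 'a'), (2, 'b'), (2, 'c');

    DO
    $do$
    DECLARE
       _tbl text := 'demo_exec';  -- table name only known at run time
       _ct  bigint;
    BEGIN
       -- %I quotes the identifier; the value travels separately via USING
       EXECUTE format('SELECT count(*) FROM %I WHERE id = $1', _tbl)
       INTO  _ct
       USING 2;

       RAISE NOTICE 'rows with id = 2: %', _ct;
    END
    $do$;
    ```

    The result lands in a plain variable via INTO, so no cursor has to be opened, fetched from or closed.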

    The problem you run into: you want to return records of yet undefined type. A function needs to declare the return type with the RETURNS clause (or with OUT or INOUT parameters). In your case you would have to fall back to anonymous records, because number, names and types of returned columns vary. Like:

    CREATE FUNCTION data_of(integer)
      RETURNS SETOF record AS ...
    

    However, this is not particularly useful. This way you'd have to provide a column definition list with every call of the function. Like:

    SELECT * FROM data_of(17)
    AS foo (column_name1 integer
          , column_name2 text
          , column_name3 real);
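
    For a complete picture, here is a small self-contained sketch of the SETOF record approach. The table demo_rec and the function demo_record_of are hypothetical stand-ins, not objects from the question.

    ```sql
    -- Hypothetical demo table
    CREATE TEMP TABLE demo_rec (id int, label text, reading real);
    INSERT INTO demo_rec VALUES (17, 'north', 1.5), (18, 'south', 2.5);

    CREATE OR REPLACE FUNCTION demo_record_of(_id integer)
      RETURNS SETOF record
      LANGUAGE plpgsql AS
    $func$
    BEGIN
       RETURN QUERY EXECUTE
          'SELECT id, label, reading FROM demo_rec WHERE id = $1'
       USING _id;
    END
    $func$;

    -- The caller has to spell out names and types of the returned columns
    SELECT *
    FROM   demo_record_of(17) AS t (id integer, label text, reading real);
    ```

    The types in the column definition list have to match what the query actually returns, otherwise the call raises an error.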
    

    But how would you even do this, when you don't know the columns beforehand?
    You could resort to less structured document data types like json, jsonb, hstore or xml:

    • How to store a data table in database?
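
    A rough sketch of the document-type route, assuming PostgreSQL 9.5 or later for to_jsonb(). The table demo_doc and the function demo_json_of are hypothetical names for illustration only.

    ```sql
    -- Hypothetical demo table
    CREATE TEMP TABLE demo_doc (id int, datahora timestamp, temperature float8, rainfall float8);
    INSERT INTO demo_doc VALUES
       (17, '2020-01-01 00:00', 21.5, 0.0)
     , (17, '2020-01-01 01:00', 20.9, 1.2);

    CREATE OR REPLACE FUNCTION demo_json_of(_tbl text, _id integer)
      RETURNS SETOF jsonb
      LANGUAGE plpgsql AS
    $func$
    BEGIN
       -- each row becomes one jsonb document, whatever columns the table has
       RETURN QUERY EXECUTE format(
          'SELECT to_jsonb(t) FROM %I t WHERE id = $1 ORDER BY datahora', _tbl)
       USING _id;
    END
    $func$;

    SELECT * FROM demo_json_of('demo_doc', 17);
    ```

    The return type is fixed (one jsonb value per row), but you lose individually typed and named columns, which is the trade-off mentioned above.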

    But for the purpose of this question let's assume you want to return individual, correctly typed and named columns as much as possible.

    Simple solution with fixed return type

    The column datahora seems to be a given. I'll assume data type timestamp and that there are always two more columns with varying name and data type.

    Names we'll abandon in favor of generic names in the return type.
    Types we'll abandon, too, and cast all to text since every data type can be cast to text.

    CREATE OR REPLACE FUNCTION data_of(_id integer)
      RETURNS TABLE (datahora timestamp, col2 text, col3 text)
      LANGUAGE plpgsql AS
    $func$
    DECLARE
       _sensors text := 'col1::text, col2::text';  -- cast each col to text
       _type    text := 'foo';
    BEGIN
       RETURN QUERY EXECUTE '
          SELECT datahora, ' || _sensors || '
          FROM   ' || quote_ident(_type) || '
          WHERE  id = $1
          ORDER  BY datahora'
       USING  _id;
    
    END
    $func$;
    

    How does this work?

    • The variables _sensors and _type could be input parameters instead.

    • Note the RETURNS TABLE clause.

    • Note the use of RETURN QUERY EXECUTE. That is one of the more elegant ways to return rows from a dynamic query.

    • I use a name for the function parameter, just to make the USING clause of RETURN QUERY EXECUTE less confusing. $1 in the SQL-string does not refer to the function parameter but to the value passed with the USING clause. (Both happen to be $1 in their respective scope in this simple example.)

    • Note the example value for _sensors: each column is cast to type text.

    • This kind of code is very vulnerable to SQL injection. I use quote_ident() to protect against it. Lumping together a couple of column names in the variable _sensors prevents the use of quote_ident() (and is typically a bad idea!). Ensure that no bad stuff can be in there some other way, for instance by individually running the column names through quote_ident() instead. A VARIADIC parameter comes to mind ...
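
    One way the VARIADIC idea could look, as a sketch only. It assumes PostgreSQL 9.4 or later (for WITH ORDINALITY), and the table demo_var, the function demo_cols_of and the parameter _cols are hypothetical names. The caller must pass exactly two column names to match the fixed return type.

    ```sql
    -- Hypothetical demo table
    CREATE TEMP TABLE demo_var (id int, datahora timestamp, temperature float8, station text);
    INSERT INTO demo_var VALUES (17, '2020-01-01 00:00', 21.5, 'A1');

    CREATE OR REPLACE FUNCTION demo_cols_of(_id integer, VARIADIC _cols text[])
      RETURNS TABLE (datahora timestamp, col2 text, col3 text)
      LANGUAGE plpgsql AS
    $func$
    DECLARE
       _list text;
    BEGIN
       -- quote each name on its own, then cast each column to text
       SELECT string_agg(quote_ident(c) || '::text', ', ' ORDER BY ord)
       INTO   _list
       FROM   unnest(_cols) WITH ORDINALITY AS u(c, ord);

       RETURN QUERY EXECUTE
          'SELECT datahora, ' || _list || ' FROM demo_var WHERE id = $1 ORDER BY datahora'
       USING _id;
    END
    $func$;

    SELECT * FROM demo_cols_of(17, 'temperature', 'station');
    ```

    Because every name passes through quote_ident() separately, a value like 'x; DROP TABLE demo_var' ends up as a single (nonexistent) quoted identifier and the query fails instead of running injected code.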

    Simpler with PostgreSQL 9.1+

    With version 9.1 or later you can use format() to further simplify:

    RETURN QUERY EXECUTE format('
       SELECT datahora, %s  -- identifier passed as unescaped string
       FROM   %I            -- assuming the name is provided by user
       WHERE  id = $1
       ORDER  BY datahora'
      ,_sensors, _type)
    USING  _id;
    

    Again, escaping each column name individually would be the clean way.
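
    To see what the format() specifiers used above actually do, a quick query like this can help (the input strings are arbitrary examples):

    ```sql
    SELECT format('%s', 'my col; DROP TABLE x') AS as_string      -- inserted verbatim, unsafe for user input
         , format('%I', 'my col')               AS as_identifier  -- double-quoted because of the space
         , format('%I', 'datahora')             AS plain_ident    -- legal lower-case name stays unquoted
         , format('%L', 'O''Reilly')            AS as_literal;    -- quoted and escaped as a string literal
    ```

    %s is only safe for strings you built yourself; anything coming from a user should go through %I (identifiers), %L (literals), or better, be passed as a value with USING.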

    Variable number of columns sharing the same type

    After your question updates it looks like your return type has

    • a variable number of columns
    • but all columns of the same type double precision (alias float8)

    As we have to define the RETURN type of a function I resort to an ARRAY type in this case, which can hold a variable number of values. Additionally, I return an array with column names, so you could parse the names out of the result, too:

    CREATE OR REPLACE FUNCTION data_of(_id integer)
      RETURNS TABLE (datahora timestamp, names text[], values float8[] ) AS
    $func$
    DECLARE
       _sensors text := 'col1, col2, col3';  -- plain list of column names
       _type    text := 'foo';
    BEGIN
       RETURN QUERY EXECUTE format('
          SELECT datahora
               , string_to_array($1, ', ')  -- AS names; the delimiter is required
               , ARRAY[%s]                  -- AS values
          FROM   %I
          WHERE  id = $2
          ORDER  BY datahora'
        , _sensors, _type)
       USING  _sensors, _id;
    END
    $func$  LANGUAGE plpgsql;
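
    To turn the two parallel arrays back into name/value pairs on the calling side, something along these lines should work in PostgreSQL 9.4 or later (multi-argument unnest). The VALUES row is a hand-written stand-in for one row of the function result, and the aliases vals, col_name and val are my own choice.

    ```sql
    SELECT t.datahora, u.col_name, u.val
    FROM  (
       VALUES (timestamp '2020-01-01 00:00'
             , ARRAY['col1', 'col2', 'col3']
             , ARRAY[1.5, 2.5, 3.5]::float8[])
       ) AS t (datahora, names, vals)   -- stand-in for one row returned by the function
    CROSS  JOIN LATERAL unnest(t.names, t.vals) AS u (col_name, val);
    ```

    unnest() with several arrays walks them in parallel, so each name is paired with the value at the same position.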
    

    Various complete table types

    If you are actually trying to return all columns of a table (for instance one of the tables at the linked page), then use this simple, very powerful solution with a polymorphic type:

    CREATE OR REPLACE FUNCTION data_of(_tbl_type anyelement, _id int)
      RETURNS SETOF anyelement AS
    $func$
    BEGIN
       RETURN QUERY EXECUTE format('
          SELECT *
          FROM   %s  -- pg_typeof returns regtype, quoted automatically
          WHERE  id = $1
          ORDER  BY datahora'
        , pg_typeof(_tbl_type))
       USING  _id;
    END
    $func$ LANGUAGE plpgsql;
    

    Call (important!):

    SELECT * FROM data_of(NULL::pcdmet, 17);
    

    Replace pcdmet in the call with any other table name.

    How does this work?

    • anyelement is a pseudo data type, a polymorphic type, a placeholder for any non-array data type. All occurrences of anyelement in the function evaluate to the same type provided at run time. By supplying a value of a defined type as argument to the function, we implicitly define the return type.

    • PostgreSQL automatically defines a row type (a composite data type) for every table created, so there is a well defined type for every table. This includes temporary tables, which is convenient for ad-hoc use.

    • Any type can be NULL. So we hand in a NULL value, cast to the table type: NULL::pcdmet.

    • Now the function returns a well-defined row type and we can use SELECT * FROM data_of(...) to decompose the row and get individual columns.

    • pg_typeof(_tbl_type) returns the name of the table as object identifier type regtype. When automatically converted to text, identifiers are automatically double-quoted and schema-qualified if needed. Therefore, SQL injection is not possible. This can even deal with schema-qualified table names where quote_ident() would fail.
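
    Here is a self-contained sketch to try the polymorphic approach end to end. The tables demo_pcdmet and demo_pcdagro and the function demo_rows_of are hypothetical; their columns are invented for the demo and only id and datahora mirror the example above.

    ```sql
    -- Two hypothetical tables with different column sets
    CREATE TEMP TABLE demo_pcdmet  (id int, datahora timestamp, temperature float8);
    CREATE TEMP TABLE demo_pcdagro (id int, datahora timestamp, soil_moisture float8, note text);

    INSERT INTO demo_pcdmet  VALUES (17, '2020-01-01 00:00', 21.5);
    INSERT INTO demo_pcdagro VALUES (17, '2020-01-01 00:00', 0.31, 'dry');

    CREATE OR REPLACE FUNCTION demo_rows_of(_tbl_type anyelement, _id int)
      RETURNS SETOF anyelement
      LANGUAGE plpgsql AS
    $func$
    BEGIN
       -- the row type of the argument doubles as the table name
       RETURN QUERY EXECUTE format(
          'SELECT * FROM %s WHERE id = $1 ORDER BY datahora'
        , pg_typeof(_tbl_type))
       USING _id;
    END
    $func$;

    -- Same function, different result columns depending on the type handed in
    SELECT * FROM demo_rows_of(NULL::demo_pcdmet, 17);
    SELECT * FROM demo_rows_of(NULL::demo_pcdagro, 17);
    ```

    The first call returns three columns, the second four, each with the names and types of the respective table.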
