Is it possible to have an SQL query that uses AGG functions in this way?

孤城傲影 2021-01-20 22:53

Assuming I have the following aggregate functions:

  • AGG1
  • AGG2
  • AGG3
  • AGG4

Is it possible to write valid SQL (in a db agnostic

3 Answers
  • 2021-01-20 23:25

    The only way to aggregate over columns without using GROUP BY is to use window functions. You left out details of your problem, but the following might be what you are looking for:

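    SELECT *
    FROM (
        SELECT [COL1, COL2 ....],
               AGG1(param1) over (partition by some_grouping_column) as agg1,
               AGG2(param2) over (partition by some_grouping_column) as agg2,
               -- agg3 and agg4 must be computed here so the outer WHERE can see them;
               -- param3 and param4 are placeholders, like param1 and param2
               AGG3(param3) over (partition by some_grouping_column) as agg3,
               AGG4(param4) over (partition by some_grouping_column) as agg4,
               -- an explicit ORDER BY makes the numbering deterministic;
               -- rn is assigned before the agg3/agg4 filter below is applied
               row_number() over (order by col1) as rn
        FROM [SOME TABLES]
        WHERE [SOME CRITERIA]
        -- no ORDER BY here: it is not guaranteed to survive a derived table,
        -- and SQL Server rejects it outright
    )  t
    WHERE agg3 > -1
      AND agg4 < 123
      AND rn <= 10
    ORDER BY col1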
    

    This is standard ANSI SQL and works on most databases, including PostgreSQL (since 8.4).

    Note that you do not need to use the same grouping column for both aggregates in the partition by clause.

    If you want to stick with ANSI SQL, then you should use the row_number() function to limit the result. If you run this only on PostgreSQL (or another DBMS that supports LIMIT in some form), move the LIMIT clause into the derived table (the inner query).
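To make the window-function approach concrete, here is a minimal runnable sketch using Python's built-in sqlite3 module. Everything in it is an assumption made for illustration: the `sales` table, its columns, and the use of SUM and AVG as stand-ins for AGG3 and AGG4. It also assumes the bundled SQLite is 3.25 or newer, the first release with window functions.

```python
import sqlite3

# Hypothetical data: a sales table with one detail row per sale.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("north", 10), ("north", 30), ("south", 5), ("south", 7), ("west", 100)],
)

# SUM and AVG stand in for the aggregates; the derived table exposes them
# as ordinary columns so the outer WHERE can filter on them.
QUERY = """
SELECT id, region, amount, region_total, region_avg
FROM (
    SELECT id, region, amount,
           SUM(amount) OVER (PARTITION BY region) AS region_total,
           AVG(amount) OVER (PARTITION BY region) AS region_avg,
           ROW_NUMBER() OVER (ORDER BY id) AS rn
    FROM sales
) t
WHERE region_total > 10
  AND region_avg < 50
  AND rn <= 3
ORDER BY id
"""


def run_window_query(connection):
    """Return detail rows alongside their per-region aggregates."""
    return connection.execute(QUERY).fetchall()


rows = run_window_query(conn)
for row in rows:
    print(row)
```

Each returned row keeps its own `id` and `amount` next to the totals for its region, which is what a plain GROUP BY cannot give you.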

  • 2021-01-20 23:30

    That should work from a high-level perspective, except that you'd need COL1, COL2, etc. in a GROUP BY clause, or else they won't be valid in the SELECT list. Having AGG1, etc. in the SELECT list and not in the HAVING clause is not a problem.

    As for being db agnostic, you're going to have to tweak the syntax no matter what you do (LIMIT, for example, is written differently in PostgreSQL, SQL Server, and Oracle, to name the ones I know off the top of my head), but you could build logic to construct the statements properly for each, provided your high-level representation is solid.
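One way to picture "logic to construct the statements properly for each" is a small builder that emits the row-limiting syntax per dialect. This is only a sketch: the function name `top_n_query` and the dialect labels are made up for illustration, the Oracle form assumes 12c or later (older releases need a ROWNUM filter instead), and the column, table, and ordering arguments are assumed to be trusted text written by the developer, not user input.

```python
def top_n_query(dialect, columns, table, order_by, n):
    """Build a 'first n rows' query for one of three dialects (hypothetical helper)."""
    n = int(n)  # refuse anything that is not a whole number
    if dialect == "postgresql":
        # PostgreSQL puts LIMIT at the end of the statement.
        return f"SELECT {columns} FROM {table} ORDER BY {order_by} LIMIT {n}"
    if dialect == "sqlserver":
        # SQL Server puts TOP right after SELECT.
        return f"SELECT TOP ({n}) {columns} FROM {table} ORDER BY {order_by}"
    if dialect == "oracle":
        # Oracle 12c and later accept the standard FETCH FIRST clause.
        return f"SELECT {columns} FROM {table} ORDER BY {order_by} FETCH FIRST {n} ROWS ONLY"
    raise ValueError(f"unsupported dialect: {dialect!r}")


for name in ("postgresql", "sqlserver", "oracle"):
    print(top_n_query(name, "col1, col2", "some_table", "col1", 10))
```

The high-level representation (columns, table, ordering, row count) stays the same; only the final rendering step differs per database.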

  • 2021-01-20 23:51

    For PostgreSQL releases before 10, the major version includes the first digit after the dot, so "PostgreSQL (v9.x)" is not specific enough. As @kekekela said, there is no (cheap) completely db agnostic way. Even between PostgreSQL 9.0 and 9.1 there is an important syntactical difference.

    If you had only the aggregated values AGG1(param1), AGG2(param2), you would get away without providing an explicit GROUP BY clause. Since you mix aggregated and non-aggregated columns, you have to provide a GROUP BY clause with all non-aggregated columns that appear in the SELECT list. That's true for any version of PostgreSQL. Read about GROUP BY and HAVING in the manual.
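The two cases can be sketched with Python's built-in sqlite3 module. The `sales` table and the choice of SUM, MAX, COUNT and MIN are assumptions for illustration only. Note that SQLite is more lenient than PostgreSQL here: it accepts a bare non-aggregated column without GROUP BY, so this sketch shows only the forms that are valid on both, not the error PostgreSQL would raise.

```python
import sqlite3

# Hypothetical data set.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 10), ("north", 30), ("south", 5), ("south", 7), ("west", 100)],
)

# Case 1: only aggregates in the SELECT list.
# No GROUP BY is needed; the whole table forms a single group.
overall = conn.execute("SELECT SUM(amount), MAX(amount) FROM sales").fetchone()

# Case 2: a plain column next to aggregates.
# The plain column goes into GROUP BY, and HAVING may use aggregates
# (COUNT, MIN) that do not appear in the SELECT list at all.
per_region = conn.execute(
    """
    SELECT region, SUM(amount) AS total, MAX(amount) AS largest
    FROM sales
    GROUP BY region
    HAVING COUNT(*) > 1 AND MIN(amount) < 10
    ORDER BY region
    """
).fetchall()

print(overall)
print(per_region)
```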

    Starting with version 9.1, however, once you list a primary key in the GROUP BY you can skip additional columns for this table and still use them in the SELECT list. The release notes for version 9.1 tell us:

    Allow non-GROUP BY columns in the query target list when the primary key is specified in the GROUP BY clause (Peter Eisentraut)

    Concerning parameters

    Do you intend to feed a constant value to an aggregate function? What's the point? The docs tell us:

    An aggregate function computes a single result from multiple input rows.

    Or do you want those parameters to be column names? That kind of dynamic SQL works as long as the statement text is generated before it is sent to the database. It does not work for prepared statements or for plain SQL or PL/pgSQL functions, because identifiers cannot be passed as parameters. You have to use EXECUTE in a plpgsql function for that purpose.

    As a safeguard against SQL injection, use the USING $1, $2 syntax for values and quote_ident() for your column or table names.
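The same split between values and identifiers applies when the statement is built in client code instead of plpgsql. The sketch below is a client-side analogue using Python's built-in sqlite3 module, not the PostgreSQL mechanism itself: `?` placeholders play the role of USING $1, $2, and the hypothetical helper `quote_identifier` plays the role of quote_ident(). The table, the column names, and the extra check against the table's real columns are all assumptions added for illustration.

```python
import sqlite3


def quote_identifier(name):
    """Quote an SQL identifier by doubling embedded double quotes (hypothetical helper)."""
    return '"' + name.replace('"', '""') + '"'


def sum_column_above(connection, table, column, minimum):
    """Sum `column` of `table` over the rows where it exceeds `minimum`."""
    # Identifiers cannot be bound as parameters, so the column name is
    # checked against the table's real columns and then quoted into the text.
    info = connection.execute(f"PRAGMA table_info({quote_identifier(table)})")
    known = {row[1] for row in info}  # row[1] is the column name
    if column not in known:
        raise ValueError(f"unknown column: {column!r}")
    col = quote_identifier(column)
    sql = f"SELECT SUM({col}) FROM {quote_identifier(table)} WHERE {col} > ?"
    # The value travels separately and is never spliced into the SQL string.
    return connection.execute(sql, (minimum,)).fetchone()[0]


# Hypothetical data set.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 10), ("north", 30), ("south", 5), ("south", 7), ("west", 100)],
)

total = sum_column_above(conn, "sales", "amount", 6)
print(total)
```

A column name that is not part of the table, including one carrying an injection attempt, is rejected before any statement is built.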
