PostgreSQL - fetch the row which has the Max value for a column

感动是毒 2020-11-28 18:38

I'm dealing with a Postgres table (called "lives") that contains records with columns for time_stamp, usr_id, transaction_id, and lives_remaining. I need a query that wil

9 answers
  • 2020-11-28 19:32

    Actually, there's a hacky solution to this problem. Say you want to select the biggest tree of each forest in a region.

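    -- Assumes the size is a column on tree (tree.size).
    -- DESC puts the biggest tree first, so element [1] is the biggest.
    SELECT forest.id, (array_agg(tree.id ORDER BY tree.size DESC))[1] AS biggest_tree
    FROM tree JOIN forest ON (tree.forest = forest.id)
    GROUP BY forest.id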
    

    When you group trees by forest, each group is an unsorted list of trees, and you need to find the biggest one. So sort each group by size, largest first, and take the first element of the resulting array. It may seem inefficient, but with millions of rows it can be considerably faster than solutions that rely on JOINs and WHERE conditions.

    BTW, note that ORDER BY inside array_agg was introduced in PostgreSQL 9.0.

  • 2020-11-28 19:33

    Here's another method, which happens to use no correlated subqueries or GROUP BY. I'm not an expert in PostgreSQL performance tuning, so I suggest you try both this and the solutions given by other folks to see which works better for you.

    SELECT l1.*
    FROM lives l1 LEFT OUTER JOIN lives l2
      ON (l1.usr_id = l2.usr_id AND (l1.time_stamp < l2.time_stamp 
       OR (l1.time_stamp = l2.time_stamp AND l1.trans_id < l2.trans_id)))
    WHERE l2.usr_id IS NULL
    ORDER BY l1.usr_id;
    

    I am assuming that trans_id is unique at least over any given value of time_stamp.
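
    To see how the self-join picks one row per user, here is a minimal sketch that runs the same join on a handful of made-up rows. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (the join itself is plain SQL); the sample data and the function name are hypothetical, and the table has only the four columns named in the question.

```python
import sqlite3

# Hypothetical sample rows: (time_stamp, usr_id, trans_id, lives_remaining).
SAMPLE_ROWS = [
    (10, 1, 101, 5),
    (20, 1, 102, 4),
    (20, 1, 103, 3),  # same time_stamp as trans_id 102; the higher trans_id wins
    (15, 2, 201, 9),
    (12, 2, 202, 8),
]

# A row l1 survives only if no row l2 of the same user is "later" than it,
# i.e. the LEFT OUTER JOIN found no match and l2's columns are all NULL.
ANTI_JOIN_QUERY = """
SELECT l1.usr_id, l1.time_stamp, l1.trans_id, l1.lives_remaining
FROM lives l1 LEFT OUTER JOIN lives l2
  ON (l1.usr_id = l2.usr_id AND (l1.time_stamp < l2.time_stamp
   OR (l1.time_stamp = l2.time_stamp AND l1.trans_id < l2.trans_id)))
WHERE l2.usr_id IS NULL
ORDER BY l1.usr_id
"""


def latest_rows_anti_join():
    """Load the sample rows into an in-memory table and run the anti-join."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE lives ("
        "time_stamp INTEGER, usr_id INTEGER, "
        "trans_id INTEGER, lives_remaining INTEGER)"
    )
    conn.executemany("INSERT INTO lives VALUES (?, ?, ?, ?)", SAMPLE_ROWS)
    rows = conn.execute(ANTI_JOIN_QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in latest_rows_anti_join():
        print(row)
```

    With this data, user 1 has two rows sharing the latest time_stamp, so the trans_id comparison in the ON clause is what decides between them.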

  • 2020-11-28 19:37

    I like the style of Mike Woodhouse's answer on the other page you mentioned. It's especially concise when the thing being maximised over is just a single column, in which case the subquery can just use MAX(some_col) and GROUP BY the other columns. In your case you have a two-part quantity to maximise, but you can still do so by using ORDER BY plus LIMIT 1 instead (as done by Quassnoi):

    SELECT * 
    FROM lives o  -- "outer" is a reserved word in PostgreSQL, so use another alias
    WHERE (usr_id, time_stamp, trans_id) IN (
        SELECT usr_id, time_stamp, trans_id
        FROM lives sq
        WHERE sq.usr_id = o.usr_id
        ORDER BY time_stamp DESC, trans_id DESC  -- latest row first
        LIMIT 1
    )
    

    I find using the row-constructor syntax WHERE (a, b, c) IN (subquery) nice because it cuts down on the amount of verbiage needed.
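
    As a rough check of the row-constructor form, the sketch below runs it on a few made-up rows using Python's built-in sqlite3 module as a stand-in for PostgreSQL (SQLite accepts row values from version 3.15). The sample data and function name are hypothetical. The outer table is aliased o because OUTER is a reserved word in PostgreSQL, and the subquery sorts in descending order so that LIMIT 1 keeps the latest row for each user.

```python
import sqlite3

# Hypothetical sample rows: (time_stamp, usr_id, trans_id, lives_remaining).
DEMO_ROWS = [
    (10, 1, 101, 5),
    (20, 1, 102, 4),
    (20, 1, 103, 3),  # same time_stamp as trans_id 102; the higher trans_id wins
    (15, 2, 201, 9),
    (12, 2, 202, 8),
]

# For each outer row, the correlated subquery returns that user's single
# latest (usr_id, time_stamp, trans_id) triple; the row is kept if it matches.
ROW_CONSTRUCTOR_QUERY = """
SELECT o.usr_id, o.time_stamp, o.trans_id, o.lives_remaining
FROM lives o
WHERE (o.usr_id, o.time_stamp, o.trans_id) IN (
    SELECT sq.usr_id, sq.time_stamp, sq.trans_id
    FROM lives sq
    WHERE sq.usr_id = o.usr_id
    ORDER BY sq.time_stamp DESC, sq.trans_id DESC
    LIMIT 1
)
ORDER BY o.usr_id
"""


def latest_rows_row_constructor():
    """Load the sample rows into an in-memory table and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE lives ("
        "time_stamp INTEGER, usr_id INTEGER, "
        "trans_id INTEGER, lives_remaining INTEGER)"
    )
    conn.executemany("INSERT INTO lives VALUES (?, ?, ?, ?)", DEMO_ROWS)
    rows = conn.execute(ROW_CONSTRUCTOR_QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in latest_rows_row_constructor():
        print(row)
```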
