This question is about the functionality of first_value(), using another function or workaround.
It is also about "little gain in performance" in big tables. To use eg
If you really don't care which member of the set is picked, and if you don't need to compute additional aggregates (like count), there is a fast and simple alternative with DISTINCT ON (x) without ORDER BY:
SELECT DISTINCT ON (x) x, y, z FROM t;
x, y and z are from the same row, but the row is an arbitrary pick from each set of rows with the same x.
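If you do care which row is picked, a minimal sketch of the deterministic variant follows. It assumes the same table t and that "first" means the smallest y per x; that ordering is my assumption, not something stated in the question. The leading ORDER BY expressions must match the DISTINCT ON expressions.

```sql
-- Assumption: "first" row per x = the one with the smallest y.
-- Without the ORDER BY, the row picked per x is arbitrary.
SELECT DISTINCT ON (x)
       x, y, z
FROM   t
ORDER  BY x, y;
```

This costs a sort (or a matching index scan), which is why the unordered form above is the cheaper choice when any row will do.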
If you need a count anyway, your options with regard to performance are limited since the whole table has to be read in either case. Still, you can combine it with window functions in the same SELECT:
SELECT DISTINCT ON (x) x, y, z, count(*) OVER (PARTITION BY x) AS x_count FROM t;
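Consider the sequence of events in a SELECT query: window functions are evaluated before DISTINCT ON is applied, so the count covers all rows of each partition, not just the row that survives.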
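To see that ordering in action, here is a small sketch with made-up sample data (the temp table and its rows are hypothetical, only there to make the query runnable on its own):

```sql
-- Hypothetical sample data, for illustration only
CREATE TEMP TABLE t (x int, y int, z int);
INSERT INTO t VALUES (1, 10, 100), (1, 11, 101), (2, 20, 200);

-- count(*) OVER (...) is computed first, over all rows per x;
-- DISTINCT ON then reduces each partition to a single row.
SELECT DISTINCT ON (x)
       x, y, z
     , count(*) OVER (PARTITION BY x) AS x_count
FROM   t;
-- x_count is 2 for x = 1 and 1 for x = 2;
-- which (y, z) pair is shown for x = 1 is arbitrary.
```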
Depending on requirements, there may be faster ways to get counts:
In combination with GROUP BY, the only realistic option I see to gain some performance is the first_last_agg extension. But don't expect much.
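A rough sketch of what that could look like, assuming the extension is installed on the server and provides first() and last() aggregates as its documentation describes (check the version you install; I am not asserting exact signatures here):

```sql
-- Assumption: first_last_agg is available and exposes first() / last().
CREATE EXTENSION IF NOT EXISTS first_last_agg;

SELECT x
     , first(y) AS first_y   -- arbitrary "first" unless input order is defined
     , count(*) AS x_count
FROM   t
GROUP  BY x;
```

Without an ORDER BY inside the aggregate call, "first" just means whichever row the aggregate sees first, which matches the "don't care which member" case above.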
For other use cases without count (including the simple case at the top), there are faster solutions, depending on your exact use case, in particular to get the "first" or "last" value of each set: emulate a loose index scan (as @Mihai commented).
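A minimal sketch of such an emulation with a recursive CTE, assuming the table t from above, a btree index on t (x), and x defined NOT NULL (all three are my assumptions). It pays off when there are few distinct x and many rows per x:

```sql
-- Assumption: an index on t (x) exists, e.g.
-- CREATE INDEX t_x_idx ON t (x);

WITH RECURSIVE cte AS (
   (  -- anchor: one row for the smallest x
   SELECT x, y, z
   FROM   t
   ORDER  BY x
   LIMIT  1
   )
   UNION ALL
   SELECT l.*
   FROM   cte c
   CROSS  JOIN LATERAL (
      -- step: jump to the next greater x via the index
      SELECT t.x, t.y, t.z
      FROM   t
      WHERE  t.x > c.x
      ORDER  BY t.x
      LIMIT  1
      ) l
   )
SELECT x, y, z
FROM   cte;
-- Recursion ends when no greater x is found.
-- Rows with x IS NULL are never reached by the step query.
```

To get a defined "first" row per x rather than an arbitrary one, extend both ORDER BY clauses (for example ORDER BY x, y) and use a matching multicolumn index.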