问题
I have a table called profile
, and I want to order them by which ones are the most filled out. Each of the columns is either a JSONB column or a TEXT column. I don't need this to a great degree of certainty, so typically I've ordered as follow:
SELECT * FROM profile ORDER BY LENGTH(CONCAT(profile.*)) DESC;
However, this is slow, and so I want to create an index. However, this does not work:
CREATE INDEX index_name ON profile (LENGTH(CONCAT(*))
Nor does
CREATE INDEX index_name ON profile (LENGTH(CONCAT(CAST(* AS TEXT))))
Can't say I'm surprised. What is the right way to declare this index?
回答1:
To measure the size of the row in text representation you can just cast the whole row to text, which is much faster than concatenating individual columns:
SELECT length(profile::text) FROM profile;
But there are 3 (or 4) issues with this expression in an index:
The syntax shorthand
profile::text
is not accepted inCREATE INDEX
, you need to add extra parentheses or default to the standard syntaxcast(profile AS text)
Still the same problem that @jjanes already discussed: only
IMMUTABLE
functions are allowed in index expressions and casting a row type totext
does not pass this requirement. You could build a fakeIMMUTABLE
wrapper function, like Jeff outlined.There is an inherent ambiguity (that applies to Jeff's answer as well!): if you have a column name that's the same as the table name (which is a common case) you cannot reference the row type in
CREATE INDEX
since the identifier always resolves to the column name first.Minor difference to your original: This adds column separators, row decorators and possibly escape characters to the
text
representation. Shouldn't matter much to your use case.
However, I would suggest a more radical alternative as crude indicator for the size of a row: pg_column_size(). Even shorter and faster and avoids issues 1, 3 and 4:
SELECT pg_column_size(profile) FROM profile;
Issue 2 remains, though: pg_column_size()
is also only STABLE
. You can create a simple and cheap SQL wrapper function:
CREATE OR REPLACE FUNCTION pg_column_size(profile)
RETURNS int LANGUAGE sql IMMUTABLE AS
'SELECT pg_catalog.pg_column_size($1)';
and then proceed like @jjanes outlined. More details:
- Does PostgreSQL support "accent insensitive" collations?
Note that I created the function with the row type profile
as parameter. Postgres allows function overloading, which is why we can use the same function name. Now, when we feed the matching row type to pg_column_size()
our custom function matches more closely according to function type resolution rules and is picked instead of the polymorphic system function. Alternatively, use a separate name and possibly make the function polymorphic as well ...
Related:
- Is there a way to disable function overloading in Postgres
回答2:
You can declare a function which is falsely marked "immutable" and build an index on that.
CREATE OR REPLACE FUNCTION len_immut(record)
RETURNS int
LANGUAGE plperl
IMMUTABLE
AS $function$
## This function lies about its immutability.
## Use it with care. It is useful for indexing
## entire table rows.
return length(join ",", values %{$_[0]});
$function$
and then
create index on profile (len_immut(profile));
SELECT * FROM profile ORDER BY len_immut(profile) DESC;
Since the function is falsely labelled as immutable
, the index may become out of date if you do things like add or drop columns on the table, or change the types of columns.
来源:https://stackoverflow.com/questions/34143116/postgresql-create-index-on-length-of-all-table-fields