I have a table with 4 array columns.. the results are like:
ids signed_ids new_ids new_ids_signed
{1,2,3} | {2,1,3} | {4,5,6} | {6,5,4}
<
The additional module intarray provides operators for arrays of integer
, which are typically (much) faster. Install once per database with (in Postgres 9.1 or later):
CREATE EXTENSION intarray;
Then you can:
SELECT uniq(sort(ids)) = uniq(sort(signed_ids));
Or:
SELECT ids @> signed_ids AND ids <@ signed_ids;
Bold emphasis on functions and operators from intarray.
In the second example, operator resolution arrives at the specialized intarray operators if left and right argument are type integer[]
.
Both expressions will ignore order and duplicity of elements. Further reading in the helpful manual here.
intarray
operators only work for arrays of integer
(int4
), not bigint
(int8
) or smallint
(int2
) or any other data type.
Unlike the default generic operators, intarray
operators do not accept NULL values in arrays. NULL in any involved array raises an exception. If you need to work with NULL values, you can default to the standard, generic operators by schema-qualifying the operator with the OPERATOR
construct:
SELECT ARRAY[1,4,null,3]::int[] OPERATOR(pg_catalog.@>) ARRAY[3,1]::int[]
The generic operators can't use indexes with an intarray
operator class and vice versa.
Related:
The simplest thing to do is sort them and compare them sorted. See sorting arrays in PostgreSQL.
Given sample data:
CREATE TABLE aa(ids integer[], signed_ids integer[]);
INSERT INTO aa(ids, signed_ids) VALUES (ARRAY[1,2,3], ARRAY[2,1,3]);
the best thing to do is to if the array entries are always integers is to use the intarray extension, as Erwin explains in his answer. It's a lot faster than any pure-SQL formulation.
Otherwise, for a general version that works for any data type, define an array_sort(anyarray)
:
CREATE OR REPLACE FUNCTION array_sort(anyarray) RETURNS anyarray AS $$
SELECT array_agg(x order by x) FROM unnest($1) x;
$$ LANGUAGE 'SQL';
and use it sort and compare the sorted arrays:
SELECT array_sort(ids) = array_sort(signed_ids) FROM aa;
There's an important caveat:
SELECT array_sort( ARRAY[1,2,2,4,4] ) = array_sort( ARRAY[1,2,4] );
will be false. This may or may not be what you want, depending on your intentions.
Alternately, define a function array_compare_as_set
:
CREATE OR REPLACE FUNCTION array_compare_as_set(anyarray,anyarray) RETURNS boolean AS $$
SELECT CASE
WHEN array_dims($1) <> array_dims($2) THEN
'f'
WHEN array_length($1,1) <> array_length($2,1) THEN
'f'
ELSE
NOT EXISTS (
SELECT 1
FROM unnest($1) a
FULL JOIN unnest($2) b ON (a=b)
WHERE a IS NULL or b IS NULL
)
END
$$ LANGUAGE 'SQL' IMMUTABLE;
and then:
SELECT array_compare_as_set(ids, signed_ids) FROM aa;
This is subtly different from comparing two array_sort
ed values. array_compare_as_set
will eliminate duplicates, making array_compare_as_set(ARRAY[1,2,3,3],ARRAY[1,2,3])
true, whereas array_sort(ARRAY[1,2,3,3]) = array_sort(ARRAY[1,2,3])
will be false.
Both of these approaches will have pretty bad performance. Consider ensuring that you always store your arrays sorted in the first place.
You can use contained by operator:
(array1 <@ array2 and array1 @> array2)
If your arrays have no duplicates and are of the same dimension:
@>
array_length
where the length must match the size you want on both sidesselect (string_agg(a,',' order by a) = string_agg(b,',' order by b)) from (select unnest(array[1,2,3,2])::text as a,unnest(array[2,2,3,1])::text as b) A