Eliminate duplicate array values in Postgres

孤独总比滥情好 2020-12-01 01:27

I have an array of type bigint, how can I remove the duplicate values in that array?

Ex: array[1234, 5343, 6353, 1234, 1234]

I shou

8 Answers
  • 2020-12-01 01:32

    Where are the standard libraries for this kind of array_X utility?

    I searched and found a few, but nothing standard:

    • postgres.cz/wiki/Array_based_functions: good reference!

    • JDBurnZ/postgresql-anyarray, good initiative but needs some collaboration to enhance.

    • wiki.postgresql.org/Snippets, a stalled initiative, but it is the "official wiki"; needs some collaboration to enhance.

    • MADlib: good, but it is an elephant, not a "pure SQL snippets lib".


    Simplest and perhaps fastest array_distinct() snippet-lib function

    Here is the simplest and perhaps fastest implementation of array_unique() or array_distinct():

    CREATE FUNCTION array_distinct(anyarray) RETURNS anyarray AS $f$
      SELECT array_agg(DISTINCT x) FROM unnest($1) t(x);
    $f$ LANGUAGE SQL IMMUTABLE;
    

    NOTE: it works as expected with any datatype, except with arrays of arrays:

    SELECT  array_distinct( array[3,3,8,2,6,6,2,3,4,1,1,6,2,2,3,99] ), 
            array_distinct( array['3','3','hello','hello','bye'] ), 
            array_distinct( array[array[3,3],array[3,3],array[3,3],array[5,6]] );
     -- "{1,2,3,4,6,8,99}",  "{3,bye,hello}",  "{3,5,6}"
    

    The "side effect" is that nested arrays are flattened into a single set of elements before deduplication.

    PS: it works fine with JSONB arrays:

    SELECT array_distinct( array['[3,3]'::JSONB, '[3,3]'::JSONB, '[5,6]'::JSONB] );
     -- "{"[3, 3]","[5, 6]"}"
    

    Edit: a more complex but useful variant, with a "drop nulls" parameter:

    CREATE FUNCTION array_distinct(
          anyarray, -- input array 
          boolean DEFAULT false -- flag to ignore nulls
    ) RETURNS anyarray AS $f$
          SELECT array_agg(DISTINCT x) 
          FROM unnest($1) t(x) 
          WHERE CASE WHEN $2 THEN x IS NOT NULL ELSE true END;
    $f$ LANGUAGE SQL IMMUTABLE;
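
    A quick usage sketch for the two-argument version above. It assumes this is the only array_distinct installed; if the one-argument version from earlier also exists, a one-argument call may be rejected as ambiguous, so drop that one first.

    ```sql
    -- Flag omitted (defaults to false): NULL is kept as one distinct element
    SELECT array_distinct(ARRAY[1, NULL, 1, 2, NULL]);

    -- Flag set to true: NULLs are filtered out before deduplicating
    SELECT array_distinct(ARRAY[1, NULL, 1, 2, NULL], true);
    -- {1,2}
    ```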
    
  • 2020-12-01 01:35

    The sort(int[]) and uniq(int[]) functions are provided by the intarray contrib module.

    To enable its use, you must install the module.
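
    A minimal sketch of that approach, assuming you are allowed to create extensions (CREATE EXTENSION exists from PostgreSQL 9.1 on; older releases load the contrib SQL script instead). Note that uniq() only removes adjacent duplicates, so the array is sorted first. Also, intarray works on integer[] rather than bigint[], so it only fits the question if the values fit in a 4-byte integer.

    ```sql
    CREATE EXTENSION IF NOT EXISTS intarray;

    -- sort() groups equal values together, uniq() then drops the adjacent repeats
    SELECT uniq(sort(ARRAY[1234, 5343, 6353, 1234, 1234]));
    -- {1234,5343,6353}
    ```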

    If you don't want to use the intarray contrib module, or if you have to remove duplicates from arrays of a different type, there are two other ways.

    If you have at least PostgreSQL 8.4, you can take advantage of the unnest(anyarray) function:

    SELECT ARRAY(SELECT DISTINCT UNNEST('{1,2,3,2,1}'::int[]) ORDER BY 1);
     ?column? 
    ----------
     {1,2,3}
    (1 row)
    

    Alternatively, you could create your own function to do this:

    CREATE OR REPLACE FUNCTION array_sort_unique (ANYARRAY) RETURNS ANYARRAY
    LANGUAGE SQL
    AS $body$
      SELECT ARRAY(
        SELECT DISTINCT $1[s.i]
        FROM generate_series(array_lower($1,1), array_upper($1,1)) AS s(i)
        ORDER BY 1
      );
    $body$;
    

    Here is a sample invocation:

    SELECT array_sort_unique('{1,2,3,2,1}'::int[]);
     array_sort_unique 
    -------------------
     {1,2,3}
    (1 row)
    
  • 2020-12-01 01:35

    For people like me who still have to deal with Postgres 8.2, this recursive function eliminates duplicates without changing the order of the remaining elements:

    CREATE OR REPLACE FUNCTION my_array_uniq(bigint[])
      RETURNS bigint[] AS
    $BODY$
    DECLARE
        n integer;
    BEGIN
    
        -- number of elements in the array
        n := array_upper($1, 1);
    
        IF n > 1 THEN
            -- test if the last item belongs to the rest of the array
            IF ($1)[1:n-1] @> ($1)[n:n] THEN
                -- returns the result of the same function on the rest of the array
                return my_array_uniq($1[1:n-1]);
            ELSE
                -- returns the result of the same function on the rest of the array plus the last element               
                return my_array_uniq($1[1:n-1]) || $1[n:n];
            END IF;
        ELSE
            -- if array has only one item, returns the array
            return $1;
        END IF;
    END;
    $BODY$
      LANGUAGE plpgsql IMMUTABLE;
    

    For example:

    select my_array_uniq(array[3,3,8,2,6,6,2,3,4,1,1,6,2,2,3,99]);
    

    returns:

    {3,8,2,6,4,1,99}
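
    On PostgreSQL 9.4 or later, the same order-preserving result can be obtained without recursion. This is a different technique from the function above, sketched with unnest ... WITH ORDINALITY. The names x, ord, first_pos and deduped are only illustrative.

    ```sql
    SELECT array_agg(x ORDER BY first_pos) AS deduped
    FROM (
        -- remember the position of the first occurrence of each value
        SELECT x, min(ord) AS first_pos
        FROM unnest(ARRAY[3,3,8,2,6,6,2,3,4,1,1,6,2,2,3,99]::bigint[])
             WITH ORDINALITY AS t(x, ord)
        GROUP BY x
    ) AS firsts;
    -- {3,8,2,6,4,1,99}
    ```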
    
  • 2020-12-01 01:41

    I did this in a single query:

    SELECT (select array_agg(distinct val) from ( select unnest(:array_column) as val ) as u ) FROM :your_table;
    
    
  • 2020-12-01 01:42

    I faced the same problem, but in my case the array is created with the array_agg function. Fortunately, it can aggregate DISTINCT values directly:

      array_agg(DISTINCT value)
    

    This works for me.
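
    For instance, when the array is built per group, the duplicates can be dropped at aggregation time. This is a sketch with a hypothetical orders(customer_id, product_id) table, inlined as a VALUES list so it runs on its own:

    ```sql
    SELECT customer_id,
           array_agg(DISTINCT product_id) AS product_ids
    FROM (VALUES (1, 1234), (1, 5343), (1, 1234), (2, 6353), (2, 6353))
         AS orders(customer_id, product_id)
    GROUP BY customer_id
    ORDER BY customer_id;
    -- 1 | {1234,5343}
    -- 2 | {6353}
    ```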

  • 2020-12-01 01:43

    I have assembled a set of stored procedures (functions), called anyarray, to make up for PostgreSQL's lack of array-handling functions. They are designed to work with arrays of any data type, not just integers as intarray does: https://www.github.com/JDBurnZ/anyarray

    In your case, all you'd really need is anyarray_uniq.sql. Copy & paste the contents of that file into a PostgreSQL query and execute it to add the function. If you need array sorting as well, also add anyarray_sort.sql.

    From there, you can perform a simple query as follows:

    SELECT ANYARRAY_UNIQ(ARRAY[1234,5343,6353,1234,1234])

    Returns something similar to: ARRAY[1234, 6353, 5343]

    Or if you require sorting:

    SELECT ANYARRAY_SORT(ANYARRAY_UNIQ(ARRAY[1234,5343,6353,1234,1234]))

    Returns exactly: ARRAY[1234, 5343, 6353]
