How to add a sort key to an existing table in AWS Redshift

孤街浪徒 2021-01-01 09:24

In AWS Redshift, I want to add a sort key to a table that is already created. Is there any command which can add a column and use it as sort key?

7 answers
  • 2021-01-01 09:59

    To add to Yaniv's answer, the ideal way to do this is probably the CREATE TABLE AS command, which lets you specify the distkey and sortkey explicitly. For example:

    CREATE TABLE test_table_with_dist 
    distkey(field) 
    sortkey(sortfield) 
    AS 
    select * from test_table
    

    Additional examples:

    http://docs.aws.amazon.com/redshift/latest/dg/r_CTAS_examples.html

    EDIT

    I've noticed that this method doesn't preserve encoding. Redshift only automatically encodes during a copy statement. If this is a persistent table you should redefine the table and specify the encoding.

    create table test_table_with_dist(
        field1 varchar encode raw distkey,
        field2 timestamp encode delta sortkey);
    
    insert into test_table_with_dist select * from test_table;
    

    You can figure out which encoding to use by running analyze compression test_table;
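
    To check what the new table actually ended up with, one option is the PG_TABLE_DEF catalog view. This is only a sketch: it assumes the table is called test_table_with_dist as above and sits in a schema on your search_path, since PG_TABLE_DEF only lists tables in schemas on the search path.

    -- List each column with its encoding and key flags.
    -- "column" is quoted because it is a reserved word.
    select "column", type, encoding, distkey, sortkey
    from pg_table_def
    where tablename = 'test_table_with_dist';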

  • 2021-01-01 10:01

    Catching this question a bit late.
    I find that using WHERE 1=1 is the best way to create a table and replicate data into it in Redshift, e.g.: CREATE TABLE NEWTABLE AS SELECT * FROM OLDTABLE WHERE 1=1;

    You can then drop OLDTABLE after verifying that the data has been copied.

    (If you replace 1=1 with 1=2, it copies only the structure, which is good for creating staging tables.)
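
    As a sketch of the staging-table variant (newtable_staging and newtable_staging_like are made-up names for illustration):

    -- The predicate is never true, so only the column names and types are copied, no rows.
    CREATE TABLE newtable_staging AS SELECT * FROM oldtable WHERE 1=2;

    -- Alternative: LIKE also inherits the distkey, sortkey and column encodings of the parent table.
    CREATE TABLE newtable_staging_like (LIKE oldtable);

    The LIKE form may be the better fit when the staging table should keep the same keys and compression as the original.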

  • 2021-01-01 10:15

    AWS now allows you to add both sortkeys and distkeys without having to recreate tables:

    To add a sortkey (or alter a sortkey):

    ALTER TABLE data.engagements_bot_free_raw ALTER SORTKEY (id)

    To alter a distkey or add a distkey:

    ALTER TABLE data.engagements_bot_free_raw ALTER DISTKEY id

    Interestingly, the parentheses are mandatory on SORTKEY, but not on DISTKEY.

    You still cannot change the encoding of a table in place; that still requires one of the solutions that recreate the table.
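
    For a multi-column sort key, and to confirm the change afterwards, something like the following should work. It is a sketch only: created_at is a hypothetical second column, and SVV_TABLE_INFO returns nothing for empty tables.

    -- Compound sort key over two columns (created_at is an assumed column name).
    ALTER TABLE data.engagements_bot_free_raw ALTER COMPOUND SORTKEY (id, created_at);

    -- Inspect the distribution style and the first sort key column of the table.
    SELECT "table", diststyle, sortkey1, sortkey_num
    FROM svv_table_info
    WHERE "schema" = 'data' AND "table" = 'engagements_bot_free_raw';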

  • 2021-01-01 10:15

    I followed this approach to add sort columns to my table table_transactions. It is more or less the same approach, only with fewer commands.

    alter table table_transactions rename to table_transactions_backup;
    create table table_transactions compound sortkey(key1, key2, key3, key4) as select * from table_transactions_backup;
    drop table table_transactions_backup;
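
    A slightly more defensive variant of the same swap, as a sketch: do the rename and the CTAS in one transaction, compare row counts, and only then drop the backup. key1 to key4 are the placeholder columns from above, and this assumes these statements are allowed inside a transaction block on your cluster.

    begin;
    alter table table_transactions rename to table_transactions_backup;
    create table table_transactions compound sortkey(key1, key2, key3, key4) as select * from table_transactions_backup;
    commit;

    -- The two counts should match before the backup is dropped.
    select (select count(*) from table_transactions) as new_rows,
           (select count(*) from table_transactions_backup) as old_rows;

    drop table table_transactions_backup;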
    
  • 2021-01-01 10:20

    As Yaniv Kessler mentioned, it's not possible to add or change the distkey and sortkey after creating a table; you have to recreate the table and copy all the data into the new one. You can use the following SQL pattern to recreate a table with a new design.

    ALTER TABLE test_table RENAME TO old_test_table;
    CREATE TABLE new_test_table([new table columns]);
    INSERT INTO new_test_table (SELECT * FROM old_test_table);
    ALTER TABLE new_test_table RENAME TO test_table;
    DROP TABLE old_test_table;
    

    In my experience, this SQL is useful not only for changing the distkey and sortkey, but also for setting the encoding (compression) type.
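
    To make the [new table columns] placeholder concrete, here is one possible shape of the CREATE TABLE step. The column names, types and encodings are hypothetical and only illustrate where each attribute goes; pick real encodings from the output of analyze compression.

    -- Hypothetical columns; distkey, sortkey and per-column encoding are set in one place.
    CREATE TABLE new_test_table (
        id         BIGINT       ENCODE raw,
        name       VARCHAR(256) ENCODE lzo,
        created_at TIMESTAMP    ENCODE raw
    )
    DISTKEY (id)
    SORTKEY (created_at);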

  • 2021-01-01 10:22

    UPDATE:

    Amazon Redshift now enables users to add and change sort keys of existing Redshift tables without having to re-create the table. The new capability simplifies user experience in maintaining the optimal sort order in Redshift to achieve high performance as their query patterns evolve and do it without interrupting the access to the tables.

    source: https://aws.amazon.com/about-aws/whats-new/2019/11/amazon-redshift-supports-changing-table-sort-keys-dynamically/

    At the moment I think it's not possible (hopefully that will change in the future). In the past, when I ran into this kind of situation, I created a new table and copied the data from the old one into it.

    from http://docs.aws.amazon.com/redshift/latest/dg/r_ALTER_TABLE.html:

    ADD [ COLUMN ] column_name: Adds a column with the specified name to the table. You can add only one column in each ALTER TABLE statement.

    You cannot add a column that is the distribution key (DISTKEY) or a sort key (SORTKEY) of the table.

    You cannot use an ALTER TABLE ADD COLUMN command to modify the following table and column attributes:

    UNIQUE

    PRIMARY KEY

    REFERENCES (foreign key)

    IDENTITY

    The maximum column name length is 127 characters; longer names are truncated to 127 characters. The maximum number of columns you can define in a single table is 1,600.
