Redshift. Convert comma delimited values into rows

北恋 2020-12-01 06:25

I am wondering how to convert comma-delimited values into rows in Redshift. I am afraid that my own solution isn't optimal. Please advise. I have a table where one of the columns contains comma-delimited values that I need to split into one row per value.

8 Answers
  • 2020-12-01 06:51

    Another idea is to transform your CSV string into JSON first, followed by JSON extract, along the following lines:

    ... '["' || replace( user_action, '.', '", "' ) || '"]' AS replaced

    ... JSON_EXTRACT_ARRAY_ELEMENT_TEXT(replaced, numbers.i) AS parsed_action

    Where "numbers" is the table from the first answer. The advantage of this approach is the ability to use built-in JSON functionality.

  • 2020-12-01 06:51

    Late to the party, but I got something working (albeit very slowly):

    -- Build a numbers CTE big enough to cover the longest JSON array,
    -- then join it back to extract one array element per row.
    with nums as (
      select n::int as n
      from (select row_number() over (order by true) as n
            from table_with_enough_rows_to_cover_range) seq
      cross join
           (select max(json_array_length(json_column)) as max_num
            from table_with_json_column) lens
      where n <= max_num + 1
    )
    select *, json_extract_array_element_text(json_column, nums.n - 1) as parsed_json
    from nums, table_with_json_column
    where json_extract_array_element_text(json_column, nums.n - 1) != ''
      and nums.n <= json_array_length(json_column)


    Thanks to Bob Baxley's answer for the inspiration.

  • 2020-12-01 06:52

    You can try the COPY command to load your file into Redshift tables:

    copy table_name
    from 's3://mybucket/myfolder/my.csv'
    CREDENTIALS 'aws_access_key_id=my_aws_acc_key;aws_secret_access_key=my_aws_sec_key'
    delimiter ','

    You can use the delimiter ',' option.

    For more details on COPY command options, see

    http://docs.aws.amazon.com/redshift/latest/dg/r_COPY.html

  • 2020-12-01 07:02

    Just an improvement on the answer above: https://stackoverflow.com/a/31998832/1265306

    Generate the numbers table using the following SQL (from https://discourse.looker.com/t/generating-a-numbers-table-in-mysql-and-redshift/482):

    SELECT 
      p0.n 
      + p1.n*2 
      + p2.n * POWER(2,2) 
      + p3.n * POWER(2,3)
      + p4.n * POWER(2,4)
      + p5.n * POWER(2,5)
      + p6.n * POWER(2,6)
      + p7.n * POWER(2,7) 
      as number  
    INTO numbers
    FROM  
      (SELECT 0 as n UNION SELECT 1) p0,  
      (SELECT 0 as n UNION SELECT 1) p1,  
      (SELECT 0 as n UNION SELECT 1) p2, 
      (SELECT 0 as n UNION SELECT 1) p3,
      (SELECT 0 as n UNION SELECT 1) p4,
      (SELECT 0 as n UNION SELECT 1) p5,
      (SELECT 0 as n UNION SELECT 1) p6,
      (SELECT 0 as n UNION SELECT 1) p7
    ORDER BY 1
    LIMIT 100
    

    "ORDER BY" is there only in case you want paste it without the INTO clause and see the results

  • 2020-12-01 07:03

    Here's my equally-terrible answer.

    I have a users table, and then an events table with a column that is just a comma-delimited string of the users at said event, e.g.

    event_id | user_ids
    1        | 5,18,25,99,105
    

    In this case, I used LIKE with wildcards to build a new table that represents each event-user edge.

    SELECT e.event_id, u.id AS user_id
    FROM events e
    -- wrap both sides in delimiters so that id 5 does not also match 15, 25, or 105
    LEFT JOIN users u ON ',' || e.user_ids || ',' LIKE '%,' || u.id || ',%'
    

    It's not pretty, but I throw it in a WITH clause so that I don't have to run it more than once per query. I'll likely just build an ETL to create that table every night anyway.

    Also, this only works if you have a second table that has one row per unique possibility. If not, you could use LISTAGG to get a single cell with all your values, export that to a CSV, and re-upload it as a table to help.
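    For example, a sketch of that LISTAGG step (some_table and user_id are hypothetical names, not from the original answer):

    -- Hedged sketch: collapse distinct ids into one comma-delimited cell,
    -- which can then be exported and re-loaded as a lookup table.
    SELECT LISTAGG(DISTINCT user_id, ',') WITHIN GROUP (ORDER BY user_id) AS all_ids
    FROM some_table;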

    Like I said: a terrible, no-good solution.

  • 2020-12-01 07:05

    Create a stored procedure that parses the string dynamically and populates a temp table, then select from the temp table.

    Here is the code:

    CREATE OR REPLACE PROCEDURE public.sp_string_split( "string" character varying )
    AS $$
    DECLARE
      cnt INTEGER := 1;
      -- number of delimiters; the input has no_of_parts + 1 parts
      no_of_parts INTEGER := (SELECT REGEXP_COUNT("string", ','));
      sql VARCHAR(MAX) := '';
      item character varying := '';
    BEGIN

      -- Create the temp table that will hold one row per part
      sql := 'CREATE TEMPORARY TABLE IF NOT EXISTS split_table (part VARCHAR(255))';
      RAISE NOTICE 'executing sql %', sql;
      EXECUTE sql;

      <<simple_loop_exit_continue>>
      LOOP
        -- extract the cnt-th comma-delimited part and insert it
        item := (SELECT split_part("string", ',', cnt));
        RAISE NOTICE 'item %', item;
        sql := 'INSERT INTO split_table SELECT ''' || item || '''';
        EXECUTE sql;
        cnt := cnt + 1;
        EXIT simple_loop_exit_continue WHEN (cnt >= no_of_parts + 2);
      END LOOP;

    END;
    $$ LANGUAGE plpgsql;
    
    
    

    Usage example:

    call public.sp_string_split('john,smith,jones');
    select * from split_table;
    
    