Redshift. Convert comma delimited values into rows

北恋 2020-12-01 06:25

I am wondering how to convert comma-delimited values into rows in Redshift. I am afraid that my own solution isn't optimal. Please advise. I have a table with one of the colu

8 answers
  • 2020-12-01 07:07

    A slight improvement over the existing answer is to use a second "numbers" table that enumerates all of the possible list lengths and then use a cross join to make the query more compact.

    Redshift does not have a straightforward method for creating a numbers table that I am aware of, but we can use a bit of a hack from https://www.periscope.io/blog/generate-series-in-redshift-and-mysql.html to create one using row numbers.

    Specifically, if we assume the number of rows in cmd_logs is at least the maximum number of items in the user_action column (the maximum number of commas plus one), we can create a numbers table by counting rows. To start, let's assume there are at most 99 commas in the user_action column, so at most 100 items:

    select 
      (row_number() over (order by true))::int as n
    into numbers
    from cmd_logs
    limit 100;
    

    If we want to get fancy, we can compute the number of commas from the cmd_logs table to create a more precise set of rows in numbers:

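    select
      n::int
    into numbers
    from
      -- one candidate number per row of cmd_logs
      (select 
          row_number() over (order by true) as n
       from cmd_logs) as row_nums
    cross join
      -- the longest list has (most commas + 1) items
      (select 
          max(regexp_count(user_action, '[,]')) as max_num 
       from cmd_logs) as max_commas
    where
      n <= max_num + 1;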
    

    Once there is a numbers table, we can do:

    select
      user_id, 
      user_name, 
      split_part(user_action,',',n) as parsed_action 
    from
      cmd_logs
    cross join
      numbers
    where
      split_part(user_action,',',n) is not null
      and split_part(user_action,',',n) != '';
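
    As a possible refinement, the cross join can be replaced by a join that only pairs each row with the positions it actually has, so the empty parts are never generated. This is only a sketch: it assumes the numbers table (column n) built above, the aliases c and nums are mine, and a list with a trailing comma or an empty item such as 'a,,b' would still produce an empty parsed_action.

```sql
-- Sketch only: relies on the numbers table (column n) created earlier.
select
  c.user_id,
  c.user_name,
  split_part(c.user_action, ',', nums.n) as parsed_action
from
  cmd_logs c
join
  numbers nums
  -- a list with k commas has k + 1 items, so keep only n = 1 .. k + 1
  on nums.n <= regexp_count(c.user_action, ',') + 1
where
  -- skip rows whose list is empty; NULL lists already drop out of the join
  c.user_action <> '';
```

    The trade-off is one regexp_count call per joined pair instead of two split_part comparisons in the where clause; whether that is faster on a given cluster is something to measure rather than assume.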
    
  • 2020-12-01 07:11

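    You can get the expected result with the following query. I'm using "UNION ALL" to turn each position in the list into its own set of rows. As written it handles up to three values per row; add one more branch for each additional position.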

    select user_id, user_name, split_part(user_action,',',1) as parsed_action from cmd_logs
    union all
    select user_id, user_name, split_part(user_action,',',2) as parsed_action from cmd_logs
    union all
    select user_id, user_name, split_part(user_action,',',3) as parsed_action from cmd_logs
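
    One thing to watch: split_part returns an empty string when the requested position is past the end of the list, so rows with fewer than three values come back with empty parsed_action entries. A minimal sketch of filtering those out, assuming the same cmd_logs columns (the alias parts is mine):

```sql
-- Sketch only: the same three-position UNION ALL, with padding rows removed.
select user_id, user_name, parsed_action
from (
  select user_id, user_name, split_part(user_action, ',', 1) as parsed_action from cmd_logs
  union all
  select user_id, user_name, split_part(user_action, ',', 2) as parsed_action from cmd_logs
  union all
  select user_id, user_name, split_part(user_action, ',', 3) as parsed_action from cmd_logs
) as parts
-- positions past the end of a list come back as '', so drop them here
where parsed_action <> '';
```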
    