Question
I have the following data set:
color_code  fav_color_code  color_code_name  fav_color_name
1|2         5               blue|white       black
3|4         7|9             green|red        pink|yellow
I need to pair the first value of color_code with the first value of color_code_name, the second value of color_code with the second value of color_code_name, and so on. The same applies to fav_color_code and fav_color_name. Expected result:
code  color
1     blue
2     white
5     black
3     green
4     red
7     pink
9     yellow
I am using the code below, but it produces a cross join because I don't have an id to join on. This code works when I map two columns, but not with multiple columns. Could someone help me get the expected result?
SELECT
    t1.code AS code,
    t2.color AS color
FROM
    (
        SELECT
            c.value :: varchar AS code,
            row_number() OVER(
                ORDER BY
                    code
            ) AS rownum
        FROM
            table,
            lateral flatten (
                input => split(color_code, '|')
            ) c
        UNION
        SELECT
            d.value :: varchar AS code,
            row_number() OVER(
                ORDER BY
                    code
            ) AS rownum
        FROM
            table,
            lateral flatten (
                input => split(fav_color_code, '|')
            ) d
    ) t1
    JOIN (
        SELECT
            f.value :: varchar AS color,
            row_number() OVER(
                ORDER BY
                    color
            ) AS rownum
        FROM
            table,
            lateral flatten (
                input => split(color_code_name, '|')
            ) f
        UNION
        SELECT
            g.value :: varchar AS color,
            row_number() OVER(
                ORDER BY
                    color
            ) AS rownum
        FROM
            table,
            lateral flatten (
                input => split(fav_color_name, '|')
            ) g
    ) t2 ON (t1.rownum = t2.rownum)
ORDER BY
    t1.code
Thanks much!!
Answer 1:
You can split this into several steps, because doing it all in a single query gets messy.
Original data:
+--------------------+------------------------+-------------------------+------------------------+
| colors.color_code  | colors.fav_color_code  | colors.color_code_name  | colors.fav_color_name  |
+--------------------+------------------------+-------------------------+------------------------+
| 1|2                | 5                      | blue|white              | black                  |
| 3|4                | 7|9                    | green|red               | pink|yellow            |
+--------------------+------------------------+-------------------------+------------------------+
First, we create a temp table of color ids: concatenate the code columns, split the result into an array, then explode the array and attach a row number to each element.
CREATE TABLE tc1 AS
SELECT ROW_NUMBER() OVER() AS rownum, CAST(color_id AS INT) as color_id
FROM colors
LATERAL VIEW EXPLODE(SPLIT(CONCAT(color_code,'|', fav_color_code),'\\|')) a1 AS color_id;
Next, we create a second temp table of color names in the same way: concatenate the color name columns, split the result into an array, then explode the array and attach a row number to each element.
CREATE TABLE tc2 AS
SELECT ROW_NUMBER() OVER() AS rownum, color_name
FROM colors
LATERAL VIEW EXPLODE(SPLIT(CONCAT(color_code_name,'|', fav_color_name),'\\|')) a1 AS color_name;
Finally, we join the two temp tables on rownum:
SELECT color_id, color_name
FROM tc1
JOIN tc2 ON(tc1.rownum = tc2.rownum)
ORDER BY color_id;
Expected output:
+-----------+-------------+
| color_id  | color_name  |
+-----------+-------------+
| 1         | blue        |
| 2         | white       |
| 3         | green       |
| 4         | red         |
| 5         | black       |
| 7         | pink        |
| 9         | yellow      |
+-----------+-------------+
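To make the pairing logic easier to check outside the database, here is a minimal Python sketch of the same idea: concatenate the code columns and the name columns, split on the delimiter, and match elements by position. The ROWS list and the pair_codes_with_names function are hypothetical names used only for this illustration. Unlike the row-number join, the sketch pairs values inside each row, so it does not depend on both temp tables being exploded in the same order (a ROW_NUMBER() OVER() with no ORDER BY does not, as far as I know, guarantee any particular order).

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory copy of the `colors` table from the example.
ROWS: List[Dict[str, str]] = [
    {"color_code": "1|2", "fav_color_code": "5",
     "color_code_name": "blue|white", "fav_color_name": "black"},
    {"color_code": "3|4", "fav_color_code": "7|9",
     "color_code_name": "green|red", "fav_color_name": "pink|yellow"},
]


def pair_codes_with_names(rows: List[Dict[str, str]],
                          sep: str = "|") -> List[Tuple[int, str]]:
    """Pair each code with the name at the same position, row by row."""
    pairs: List[Tuple[int, str]] = []
    for row in rows:
        # Same idea as CONCAT(...) followed by SPLIT(...) in the SQL steps.
        codes = sep.join([row["color_code"], row["fav_color_code"]]).split(sep)
        names = sep.join([row["color_code_name"], row["fav_color_name"]]).split(sep)
        if len(codes) != len(names):
            raise ValueError(f"code/name count mismatch in row: {row}")
        # zip() matches elements by position, which is the job rownum does in the join.
        pairs.extend((int(code), name) for code, name in zip(codes, names))
    # Equivalent of ORDER BY color_id.
    return sorted(pairs)


if __name__ == "__main__":
    for color_id, color_name in pair_codes_with_names(ROWS):
        print(color_id, color_name)
```

The length check is an extra safeguard I added: if a row has more codes than names (or the reverse), a positional join would silently pair the wrong values, so the sketch raises instead.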
Source: https://stackoverflow.com/questions/61657212/join-columns-separated-by-delimiter-in-same-table