Combine duplicate rows in column as comma separated values - Google Query

Posted by Deadly on 2021-01-01 07:15:02

Question


If I have two columns, ID and Name, where the ID column contains duplicates, and I want to group by ID to get unique IDs with the Name column as a comma-separated list, is this possible in Google Query?

| ID   | Name |
===============
| 1001 | abc  |
---------------
| 1001 | def  |
---------------
| 1002 | kjg  |
---------------
| 1003 | aof  |
---------------
| 1003 | lmi  |
---------------
| 1004 | xyz  |
---------------

into

| ID   | Name      |
====================
| 1001 | abc, def  |
--------------------
| 1002 | kjg       |
--------------------
| 1003 | aof, lmi  |
--------------------
| 1004 | xyz       |
--------------------

Answer 1:


Try:

=ARRAYFORMULA({QUERY(QUERY({A2:B, B2:B}, 
 "select Col1,max(Col2) 
  where Col1 is not null 
  group by Col1 
  pivot Col3"), 
 "select Col1 
  offset 1", 0), REGEXREPLACE(TRIM(
 TRANSPOSE(QUERY(TRANSPOSE(QUERY(QUERY({A2:B&",", B2:B}, 
 "select max(Col2) 
  where Col1 is not null 
    and Col2 <> ',' 
  group by Col1 
  pivot Col3"), 
 "offset 1", 0)),,999^9))), ",$", )})

However, this may not work for massive datasets because of the limitations of TRIM (needed to remove the extra spaces) and REGEXREPLACE (needed to remove the trailing comma). Without those two functions, the formula can handle anything, at the cost of leaving the extra spaces and the trailing comma in the output:

=ARRAYFORMULA({QUERY(QUERY({A2:B, B2:B}, 
 "select Col1,max(Col2) 
  where Col1 is not null 
  group by Col1 
  pivot Col3"), 
 "select Col1 
  offset 1", 0), 
 TRANSPOSE(QUERY(TRANSPOSE(QUERY(QUERY({A2:B&",", B2:B}, 
 "select max(Col2) 
  where Col1 is not null 
    and Col2 <> ',' 
  group by Col1 
  pivot Col3"), 
 "offset 1", 0)),,999^9))})

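The cleanup that TRIM and REGEXREPLACE perform in the first formula can be modeled outside Sheets. The following is only a rough Python sketch, assuming each row arrives as a list of cells that are either a name with a comma appended or empty; the function name `clean_joined_row` is hypothetical and not part of the original answer.

```python
import re

def clean_joined_row(cells):
    """Model the TRIM + REGEXREPLACE cleanup on one space-joined row."""
    # The QUERY header trick joins a row's cells with single spaces,
    # so empty pivot cells leave runs of spaces behind.
    joined = " ".join(cells)
    # TRIM: collapse repeated spaces and strip leading/trailing ones.
    trimmed = re.sub(r" +", " ", joined).strip(" ")
    # REGEXREPLACE(..., ",$", ): drop the single trailing comma.
    return re.sub(r",$", "", trimmed)

# Empty strings stand for pivot columns that hold no value for this ID.
print(clean_joined_row(["abc,", "", "def,", ""]))  # abc, def
```

Skipping both steps, as the second formula does, leaves the raw joined string with its extra spaces and trailing comma.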


Answer 2:


I looked through the QUERY specification and could not find a solution, so I wrote some formulas that do the job (I found the task interesting).

D2 contains =unique(a2:a)

E2 contains =join(", ",transpose(filter($B$2:$B,$A$2:$A=D2))) and is copied down.

I had to copy the formula down, which is far from elegant, but I hope you find it helpful.

Reference

  • UNIQUE
  • JOIN
  • TRANSPOSE
  • FILTER

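The logic of the two helper formulas (UNIQUE for the IDs, then FILTER and JOIN per ID) can be sketched in Python to make each step explicit. This is an illustration only, assuming the sheet data is available as a list of (ID, Name) pairs; `combine_by_id` is a hypothetical name, not something from the answer.

```python
def combine_by_id(rows):
    """Group (id, name) pairs by id, joining the names with ', '."""
    # UNIQUE: distinct IDs in first-seen order (dicts keep insertion order).
    unique_ids = list(dict.fromkeys(row_id for row_id, _ in rows))
    result = []
    for uid in unique_ids:
        # FILTER: keep the names whose ID matches the current unique ID.
        names = [name for row_id, name in rows if row_id == uid]
        # JOIN: glue the matching names together with ", ".
        result.append((uid, ", ".join(names)))
    return result

rows = [(1001, "abc"), (1001, "def"), (1002, "kjg"),
        (1003, "aof"), (1003, "lmi"), (1004, "xyz")]
for uid, names in combine_by_id(rows):
    print(uid, names)
```

The loop over `unique_ids` plays the role of copying the E2 formula down: one filter-and-join per unique ID.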


Answer 3:


Here is an answer using QUERY.

=ARRAYFORMULA(REGEXREPLACE(TRIM(SPLIT(TRANSPOSE(SPLIT(
 CONCATENATE(TRANSPOSE(QUERY({"♦"&A2:A&"♠", B2:B&", "}, 
 "select max(Col2) where Col2 is not null group by Col2 pivot Col1", 0))), 
 "♦")), "♠")), ",$", ))

This comes directly from this question. Player0 has answers with amazing formulas that can reorganise data in a huge variety of ways.




Answer 4:


If you can live with a trailing comma in the output, you can try:

=ARRAYFORMULA({QUERY(QUERY({A2:B, B2:B}, 
 "select Col1,max(Col3) 
  where Col1 is not null 
    and Col3 <> ',' 
  group by Col1 
  pivot Col2"),
 "select Col1 offset 1", 0), 
 TRANSPOSE(QUERY(TRANSPOSE(IFERROR(VLOOKUP(QUERY(QUERY({A2:B, B2:B}, 
 "select Col1,max(Col3) 
  where Col1 is not null 
    and Col3 <> ',' 
  group by Col1 
  pivot Col2"),
 "select Col1 offset 1", 0), 
 QUERY(QUERY({A2:B, B2:B&","}, 
 "select Col1,max(Col3) 
  where Col1 is not null 
    and Col3 <> ',' 
  group by Col1 
  pivot Col2"),
 "offset 1", 0), 
 SPLIT(TRANSPOSE(QUERY(TRANSPOSE(IF(QUERY(QUERY({A2:B, B2:B&","}, 
 "select max(Col3) 
  where Col1 is not null 
    and Col3 <> ',' 
  group by Col1    
  pivot Col2"),
 "offset 1", 0)="",,COLUMN(B2:XXX)&",")),,999^99)), ","), 0))),,999^99))})

(This was never tested on an ultra-massive dataset, but in theory it should handle anything too.)



Source: https://stackoverflow.com/questions/65169246/combine-duplicate-rows-in-column-as-comma-separated-values-google-query
