Remove duplicates in string

清歌不尽 2021-01-14 06:36

I have the following data set

df <- data.frame(
    path = c("a,b,a", 
        "(direct) / (none),   (direct) / (none), google / cpc,    google / cpc"


        
3 Answers
  •  慢半拍i (original poster)
    2021-01-14 06:51

    You were almost there. The only change needed is to split on ",\\s*" instead of just ",". With a plain "," split, calling unique won't produce the wanted output, because otherwise identical strings can differ in the number of leading blank spaces. If you consume that whitespace as part of the split, the issue goes away.
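To see why the separator matters, here is a minimal base R sketch that uses the second string from the question. The variable names `s`, `naive`, and `trimmed` are illustrative only and do not come from the original post.

```r
# Second example string from the question, with irregular spacing after the commas
s <- "(direct) / (none),   (direct) / (none), google / cpc,    google / cpc"

# Splitting on "," alone keeps the leading blanks, so unique() sees four distinct strings
naive <- unique(strsplit(s, split = ",")[[1]])

# Splitting on ",\\s*" consumes the blanks together with the comma
trimmed <- unique(strsplit(s, split = ",\\s*")[[1]])

length(naive)                    # 4
paste(trimmed, collapse = ", ")  # "(direct) / (none), google / cpc"
```

The pattern only removes whitespace that follows a comma, so a string that starts or ends with blanks would still need `trimws()` or a similar step.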

    On another note, since you used setDT(df), I guess you are using data.table. If so, you should use proper data.table syntax to avoid copies:

    # split on commas plus any following whitespace, drop duplicates, rejoin
    df[, path := sapply(
        strsplit(as.character(path), split = ",\\s*"),
        function(x) paste(unique(x), collapse = ", "))]
    

    will modify the path column by reference.
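As a rough end-to-end sketch, the update can be tried on a small table. The data in the question is truncated, so this example assumes a data frame holding only the two strings that are visible there; `before` and `after` are illustrative names. It uses `address()` from data.table to check that the table is not copied.

```r
library(data.table)

# Assumption: only the two strings visible in the question are used here
df <- data.frame(
  path = c("a,b,a",
           "(direct) / (none),   (direct) / (none), google / cpc,    google / cpc"),
  stringsAsFactors = FALSE
)
setDT(df)  # convert to a data.table in place

before <- address(df)  # memory address of the table before the update

# := replaces the column inside the existing table instead of building a new one
df[, path := sapply(
  strsplit(path, split = ",\\s*"),
  function(x) paste(unique(x), collapse = ", "))]

after <- address(df)

identical(before, after)  # compares the addresses taken before and after the update
df$path
```

With these two inputs, `df$path` should contain "a, b" and "(direct) / (none), google / cpc".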
