How to transform a key/value string into distinct rows?

不思量自难忘° 2021-01-23 01:10

I have an R dataset with key/value strings that looks like this:

quest<-data.frame(city=c("Atlanta","New York","Atlanta","Tampa"), key_value=c("rev=


        
2 Answers
  • 2021-01-23 01:30

    We can use tidyverse. With separate_rows, split 'key_value' at ; and expand the rows, then separate the column into two columns ('key' and 'value') at =. Expand the rows again at | (separate_rows), then, grouped by 'city' and 'key', add a sequence number (row_number()) and spread to 'wide' format.

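    library(tidyverse)
    separate_rows(quest, key_value, sep=";") %>%                     # one row per key=value pair
         separate(key_value, into = c("key", "value"), sep="=") %>%  # split each pair into key and value
         separate_rows(value, sep="[|]", convert = TRUE) %>%         # one row per |-separated value
         group_by(city, key) %>% 
         mutate(rn = row_number()) %>%                               # sequence number within city and key
         spread(key, value) %>%                                      # keys become columns
         select(-rn)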
    # A tibble: 7 x 4
    # Groups:   city [3]
    #      city   qty   rev   zip
    #*   <fctr> <dbl> <dbl> <dbl>
    #1  Atlanta     1  63.0 45987
    #2  Atlanta     1  12.0 74268
    #3 New York     1  10.6 12686
    #4 New York     2  34.0 12694
    #5    Tampa     1   3.0 33684
    #6    Tampa     6  24.0 36842
    #7    Tampa     3   8.0 30254
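
As a variation on the pipeline above, here is a minimal sketch that uses pivot_wider, the newer tidyr replacement for spread. The sample strings in quest are an assumption (the data in the question is cut off), chosen only to match the key=value;key=value format with |-separated values, and the id helper column is hypothetical: it keeps the two Atlanta rows apart by source row instead of relying on the city name.

```r
library(tidyr)
library(dplyr)

# Hypothetical sample data: the question's data frame is truncated, so these
# strings are an assumption that follows the format described in the thread.
quest <- data.frame(
  city = c("Atlanta", "New York", "Atlanta", "Tampa"),
  key_value = c(
    "rev=63;qty=1;zip=45987",
    "rev=10.60|34;qty=1|2;zip=12686|12694",
    "rev=12;qty=1;zip=74268",
    "rev=3|24|8;qty=1|6|3;zip=33684|36842|30254"
  ),
  stringsAsFactors = FALSE
)

wide <- quest %>%
  mutate(id = row_number()) %>%                           # remember the source row
  separate_rows(key_value, sep = ";") %>%                 # one row per key=value pair
  separate(key_value, into = c("key", "value"), sep = "=") %>%
  separate_rows(value, sep = "[|]", convert = TRUE) %>%   # one row per value
  group_by(id, key) %>%
  mutate(rn = row_number()) %>%                           # position of the value within its key
  ungroup() %>%
  pivot_wider(names_from = key, values_from = value) %>%  # keys become columns
  select(-id, -rn)

print(wide)
```

Grouping by the source row rather than by city should give the same rows here, but it does not depend on how often a city name repeats.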
    
  • 2021-01-23 01:39

    Split at ;, then at = and |, and bind the pieces into a matrix whose column names are the keys (the first element of each piece). Then repeat each row of the original data frame once for every row found for it, and combine. No columns are converted to numeric here; they are left as character.

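    a <- strsplit(as.character(quest$key_value), ";")   # one vector of key=value pairs per row
    a <- lapply(a, function(x) {
        x <- do.call(cbind, strsplit(x, "[=|]"))        # one column per key: name first, then values
        colnames(x) <- x[1,]                            # the first row holds the key names
        x[-1,,drop=FALSE]                               # keep only the value rows
    })
    # repeat each original row once per value row, dropping key_value
    b <- quest[rep(seq_along(a), sapply(a, nrow)), colnames(quest) != "key_value", drop=FALSE]
    out <- cbind(b, do.call(rbind, a), stringsAsFactors=FALSE)
    rownames(out) <- NULL
    out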
    ##       city   rev qty   zip
    ## 1  Atlanta    63   1 45987
    ## 2 New York 10.60   1 12686
    ## 3 New York    34   2 12694
    ## 4  Atlanta    12   1 74268
    ## 5    Tampa     3   1 33684
    ## 6    Tampa    24   6 36842
    ## 7    Tampa     8   3 30254
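
Since the result above is left as character, here is a minimal sketch of one way to convert the columns afterwards with base R's type.convert. The small out data frame below is a hypothetical stand-in for the result shown above, not the full output.

```r
# Hypothetical stand-in for the character result `out` shown above.
out <- data.frame(
  city = c("Atlanta", "New York", "New York"),
  rev  = c("63", "10.60", "34"),
  qty  = c("1", "1", "2"),
  zip  = c("45987", "12686", "12694"),
  stringsAsFactors = FALSE
)

# type.convert() chooses logical, integer, numeric or character per column;
# as.is = TRUE keeps non-numeric columns such as city as character.
out[] <- lapply(out, type.convert, as.is = TRUE)
str(out)
```

If ZIP codes can start with 0, it may be safer to leave zip as character, because converting to integer drops leading zeros.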
    