How can I transform an array of characters with a few lines of code to a data.frame?

Submitted by ℡╲_俬逩灬. on 2019-12-23 19:19:57

Question


I have the following character vector:
my_list <- c("Jan-01--Dec-31|00:00--24:00", "Jan-01--Jun-30|12:00--18:00", "Jul-06--Dec-31|09:00--19:00")

What is the shortest code which results in:

  x1     x2     x3
1 Jan-01 Jan-01 Jul-06
2 Dec-31 Jun-30 Dec-31

and

  x1    x2    x3
1 00:00 12:00 09:00
2 24:00 18:00 19:00

At the moment I have the (not very nice) code

df <- as.data.frame(strsplit(my_list, split = "|", fixed = T),
                    stringsAsFactors = F)
date_list <- strsplit(as.character(df[1, ]), split = "--", fixed = T)
date_df <- as.data.frame(date_list, col.names = c(1:length(date_list)),
                         stringsAsFactors = F)
time_list <- strsplit(as.character(df[2, ]), split = "--", fixed = T)
time_df <- as.data.frame(time_list, col.names = c(1:length(date_list)),
                         stringsAsFactors = F)

The best thing I have up to now is

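# Take the part before "|" (the date range) from every element.
date_list <- sapply(strsplit(my_list, split = "|", fixed = TRUE), "[", 1)
# Split each date range on "--" and transpose so each input string becomes a column.
date_df <- t(data.frame(x1 = sapply(strsplit(date_list, split = "--", fixed = TRUE), "[", 1),
                        x2 = sapply(strsplit(date_list, split = "--", fixed = TRUE), "[", 2),
                        stringsAsFactors = FALSE))
# and similarly for time_list and time_df.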

Is there something more elegant?


Answer 1:


tstrsplit from the data.table package and str_split_fixed from stringr are both useful for getting correctly shaped data when splitting a vector of strings. The former returns the transpose of the split result, which lets you extract the dates and times separately without an apply call; the latter splits each string into a matrix with a fixed number of columns:

library(data.table); library(stringr)
lapply(tstrsplit(my_list, "\\|"), function(s) t(str_split_fixed(s, "--", 2)))

#[[1]]
#     [,1]     [,2]     [,3]    
#[1,] "Jan-01" "Jan-01" "Jul-06"
#[2,] "Dec-31" "Jun-30" "Dec-31"

#[[2]]
#     [,1]    [,2]    [,3]   
#[1,] "00:00" "12:00" "09:00"
#[2,] "24:00" "18:00" "19:00"
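
For readers without data.table or stringr, the "transpose, then split" idea can be sketched in base R. This is only an approximation of what tstrsplit does conceptually, not its implementation, and the names parts, transposed and result are made up for this illustration:

```r
my_list <- c("Jan-01--Dec-31|00:00--24:00",
             "Jan-01--Jun-30|12:00--18:00",
             "Jul-06--Dec-31|09:00--19:00")

# Split every string at the literal "|": one list element per input string.
parts <- strsplit(my_list, "|", fixed = TRUE)

# Transpose by hand: element i collects the i-th piece of every string,
# so transposed[[1]] holds the date ranges and transposed[[2]] the time ranges.
transposed <- lapply(1:2, function(i) vapply(parts, "[", character(1), i))

# Split each range at "--" and bind the pieces column-wise (one column per input).
result <- lapply(transposed,
                 function(s) do.call(cbind, strsplit(s, "--", fixed = TRUE)))
```

Here result[[1]] is the 2 x 3 date matrix and result[[2]] the 2 x 3 time matrix; wrapping either in as.data.frame gives a data frame.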



Answer 2:


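# sapply simplifies the nested split into a 2 x 3 list-matrix:
# row 1 holds the date pieces, row 2 the time pieces, one column per input string.
my_results <- sapply(strsplit(my_list, "|", fixed = TRUE), function(x) strsplit(x, "--", fixed = TRUE))
my_dates <- t(Reduce("rbind", my_results[1, ]))
my_times <- t(Reduce("rbind", my_results[2, ]))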



Answer 3:


strsplit accepts a regular expression, so a single pattern can split on both delimiters in one pass. You can then use lapply (or sapply) to pick out the pieces and finish with setNames.

 setNames( data.frame(lapply( strsplit( my_list, split="\\-\\-|\\|"),  "[", 1:2) ), paste0("x",1:3) )

      x1     x2     x3
1 Jan-01 Jan-01 Jul-06
2 Dec-31 Jun-30 Dec-31

Obviously the times could be handled by substituting 3:4 for 1:2 in the code above.
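
A minimal sketch of that substitution, wrapping the one-liner in a small helper so dates and times share the same code. The helper name pick is hypothetical and not part of the original answer, and the sketch assumes every string splits into exactly four pieces:

```r
my_list <- c("Jan-01--Dec-31|00:00--24:00",
             "Jan-01--Jun-30|12:00--18:00",
             "Jul-06--Dec-31|09:00--19:00")

# One pass: split on either "--" or a literal "|", giving 4 pieces per string.
parts <- strsplit(my_list, split = "--|\\|")

# Hypothetical helper: take pieces `idx` from every element and
# name the resulting columns x1, x2, ... (one column per input string).
pick <- function(parts, idx) {
  setNames(data.frame(lapply(parts, "[", idx), stringsAsFactors = FALSE),
           paste0("x", seq_along(parts)))
}

date_df <- pick(parts, 1:2)  # pieces 1 and 2 are the start and end dates
time_df <- pick(parts, 3:4)  # pieces 3 and 4 are the start and end times
```

Using seq_along(parts) instead of a hard-coded 1:3 keeps the column names in step with the length of the input.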




Answer 4:


One more alternative using stringr:

library(stringr)
a <- t(str_split_fixed(my_list, "\\||--", 4))

#     [,1]     [,2]     [,3]    
#[1,] "Jan-01" "Jan-01" "Jul-06"
#[2,] "Dec-31" "Jun-30" "Dec-31"
#[3,] "00:00"  "12:00"  "09:00" 
#[4,] "24:00"  "18:00"  "19:00" 

To get the final output, use data.frame(a[1:2,]) and data.frame(a[3:4,]).

Update (the same code with a single input string):

my_list <- "Jan-01--Dec-31|00:00--24:00"
a <- t(str_split_fixed(my_list, "\\||--", 4))

     [,1]    
[1,] "Jan-01"
[2,] "Dec-31"
[3,] "00:00" 
[4,] "24:00"

data.frame(a[1:2,])

  a.1.2...
1   Jan-01
2   Dec-31

data.frame(a[3:4,])

  a.3.4...
1    00:00
2    24:00
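
The awkward column names in the single-string case arise because a[1:2,] drops to a plain vector when the matrix has only one column. A hedged sketch of one way around this uses drop = FALSE and explicit column names. It builds the 4-row matrix with base R strsplit as a stand-in for t(str_split_fixed(...)) so that it has no package dependency; split_schedule is a hypothetical name, and the code assumes every string yields exactly four pieces:

```r
# Hypothetical helper: returns the date and time data frames for any input length.
split_schedule <- function(x) {
  # 4 x length(x) character matrix: rows are start date, end date, start time, end time.
  a <- do.call(cbind, strsplit(x, "--|\\|"))
  cols <- paste0("x", seq_len(ncol(a)))
  list(
    # drop = FALSE keeps the matrix shape even when there is only one column.
    dates = setNames(as.data.frame(a[1:2, , drop = FALSE], stringsAsFactors = FALSE), cols),
    times = setNames(as.data.frame(a[3:4, , drop = FALSE], stringsAsFactors = FALSE), cols)
  )
}

single <- split_schedule("Jan-01--Dec-31|00:00--24:00")
several <- split_schedule(c("Jan-01--Dec-31|00:00--24:00",
                            "Jul-06--Dec-31|09:00--19:00"))
```

With this, single$dates has one column named x1 rather than a name derived from the indexing expression.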



Answer 5:


Here is a base R option

lst <- strsplit(scan(text=my_list, sep="|", what ="", quiet=TRUE), "--")
do.call(cbind, lst[c(TRUE, FALSE)])
#     [,1]     [,2]     [,3]    
#[1,] "Jan-01" "Jan-01" "Jul-06"
#[2,] "Dec-31" "Jun-30" "Dec-31"

do.call(cbind, lst[c(FALSE, TRUE)])
#     [,1]    [,2]    [,3]   
#[1,] "00:00" "12:00" "09:00"
#[2,] "24:00" "18:00" "19:00"

Or, as a single base R expression:

lapply(split(scan(text=my_list, sep="|", what ="", quiet=TRUE), 1:2), 
                      function(x) do.call(cbind, strsplit(x, "--")))
#$`1`
#     [,1]     [,2]     [,3]    
#[1,] "Jan-01" "Jan-01" "Jul-06"
#[2,] "Dec-31" "Jun-30" "Dec-31"

#$`2`
#     [,1]    [,2]    [,3]   
#[1,] "00:00" "12:00" "09:00"
#[2,] "24:00" "18:00" "19:00"


Source: https://stackoverflow.com/questions/38797026/how-can-i-transform-an-array-of-characters-with-a-few-lines-of-code-to-a-data-fr
