Transforming a min() and a max() value into a range of values

|▌冷眼眸甩不掉的悲伤 提交于 2020-01-06 18:34:28

问题


I would like stretch some of my records that were flattened

I have a table like this

Store     Min(Date)     Max (Date)     Status

NYC1       1/1/2013      2/1/2013      Open
NYC1       2/2/2013      2/3/2013      Closed for Inspection
Boston1    1/1/2013      2/5/2013      Open

and I would like to stretch it into the form

Store       Date        Status

 NYC1       1/1/2013     Open
 NYC1       1/2/2013     Open
 .....
 NYC1       2/2/2013     Closed for Inspection
 NYC1       2/3/2013     Closed for Inspection
 ....
 Boston1    1/1/2013     Open

I know I can always write loops for this, but before attempting so, I would like to ask if there's any quick and dirty way of doing it?


回答1:


Here's one approach:

Read in your data and convert your dates to actual date variables:

mydf <- read.table(header = TRUE, stringsAsFactors=FALSE, 
text = "Store     Min(Date)     Max(Date)     Status
NYC1       1/1/2013      2/1/2013      Open
NYC1       2/2/2013      2/3/2013      'Closed for Inspection'
Boston1    1/1/2013      2/5/2013      Open")

names(mydf) <- c("store", "min.date", "max.date", "status")
mydf$min.date <- as.Date(mydf$min.date, format = "%m/%d/%Y")
mydf$max.date <- as.Date(mydf$max.date, format = "%m/%d/%Y")
mydf
#     store   min.date   max.date                status
# 1    NYC1 2013-01-01 2013-02-01                  Open
# 2    NYC1 2013-02-02 2013-02-03 Closed for Inspection
# 3 Boston1 2013-01-01 2013-02-05                  Open

Calculate the difference in days between "min.date" and "max.date"

Use that information to "expand" your data.frame and to generate the date sequence between "min.date" and "max.date". Also, subset the data.frame to return just the "store", "date" (our new variable), and "status" variables.

SEQ <- mydf$max.date - mydf$min.date + 1
mydf2 <- mydf[rep(row.names(mydf), SEQ), ]
mydf2$date <- mydf2$min.date + sequence(SEQ)-1

mydf2 <- mydf2[c("store", "date", "status")]

Here's a sample of the output.

head(mydf2)
#     store       date status
# 1    NYC1 2013-01-01   Open
# 1.1  NYC1 2013-01-02   Open
# 1.2  NYC1 2013-01-03   Open
# 1.3  NYC1 2013-01-04   Open
# 1.4  NYC1 2013-01-05   Open
# 1.5  NYC1 2013-01-06   Open
tail(mydf2)
#        store       date status
# 3.30 Boston1 2013-01-31   Open
# 3.31 Boston1 2013-02-01   Open
# 3.32 Boston1 2013-02-02   Open
# 3.33 Boston1 2013-02-03   Open
# 3.34 Boston1 2013-02-04   Open
# 3.35 Boston1 2013-02-05   Open

You can use by to verify we did everything correctly:

> with(mydf2, by(date, list(store, status), FUN = range))
: Boston1
: Closed for Inspection
NULL
----------------------------------------------------------------- 
: NYC1
: Closed for Inspection
[1] "2013-02-02" "2013-02-03"
----------------------------------------------------------------- 
: Boston1
: Open
[1] "2013-01-01" "2013-02-05"
----------------------------------------------------------------- 
: NYC1
: Open
[1] "2013-01-01" "2013-02-01"



回答2:


Using data.table for syntax elegance (and assuming the preprocessing by @Ananda

mydf <- read.table(header = TRUE, stringsAsFactors=FALSE, 
text = "Store     Min(Date)     Max(Date)     Status
NYC1       1/1/2013      2/1/2013      Open
NYC1       2/2/2013      2/3/2013      'Closed for Inspection'
Boston1    1/1/2013      2/5/2013      Open")

names(mydf) <- c("store", "min.date", "max.date", "status")
mydf$min.date <- as.Date(mydf$min.date, format = "%m/%d/%Y")
mydf$max.date <- as.Date(mydf$max.date, format = "%m/%d/%Y")

library(data.table)
DT <- data.table(mydf)
DT[, list(dates = seq(min.date,max.date, by = 1)) , by = list(store,status)]



回答3:


Green Demon

Given your question has the reshape package tag, the easiest thing I can think of is to simply use the melt function. Let's call your data.frame 'foo'. The code below should give you what you want.

library(reshape)
foo.melt<-melt(foo, id.vars=c('Store','Status'))

Note that this will create an additional column 'variable' with min.date and max.date.

Cheers,

Danny



来源:https://stackoverflow.com/questions/16153920/transforming-a-min-and-a-max-value-into-a-range-of-values

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!