Question
I'm trying to fill the gaps in a fully outer-joined table with the nearest preceding value in each column.
My data frame looks like this (no row has NA on both sides, and the table is sorted by date):
date X Y
2012-07-05 00:01:19 0.0122 NA
2012-07-05 03:19:34 0.0121 NA
2012-07-05 03:19:56 0.0121 0.027
2012-07-05 03:20:31 0.0121 NA
2012-07-05 04:19:56 0.0121 0.028
2012-07-05 04:20:31 0.0121 NA
2012-07-05 04:20:50 0.0121 NA
2012-07-05 04:22:29 0.0121 0.027
2012-07-05 04:24:37 0.0121 NA
2012-07-05 20:48:45 0.0121 NA
2012-07-05 23:02:34 NA 0.029
2012-07-05 23:30:45 NA 0.029
With this, I'm looking to:
- Leave rows that have no missing data as they are.
- If one side is missing (NA), fill it with the value from the "nearest preceding" row that has a valid value on that side.
As a result, I would like the table to look like this:
date X Y
2012-07-05 00:01:19 0.0122 NA
2012-07-05 03:19:34 0.0121 NA
2012-07-05 03:19:56 0.0121 0.027
2012-07-05 03:20:31 0.0121 0.027
2012-07-05 04:19:56 0.0121 0.028
2012-07-05 04:20:31 0.0121 0.028
2012-07-05 04:20:50 0.0121 0.028
2012-07-05 04:22:29 0.0121 0.027
2012-07-05 04:24:37 0.0121 0.027
2012-07-05 20:48:45 0.0121 0.027
2012-07-05 23:02:34 0.0121 0.029
2012-07-05 23:30:45 0.0121 0.029
What kind of R commands can I use to achieve this?
Answer 1:
Use na.locf from the zoo package:
dat <- read.table(text="2012-07-05 00:01:19 0.0122 NA
2012-07-05 03:19:34 0.0121 NA
2012-07-05 03:19:56 0.0121 0.027
2012-07-05 03:20:31 0.0121 NA
2012-07-05 04:19:56 0.0121 0.028
2012-07-05 04:20:31 0.0121 NA
2012-07-05 04:20:50 0.0121 NA
2012-07-05 04:22:29 0.0121 0.027
2012-07-05 04:24:37 0.0121 NA
2012-07-05 20:48:45 0.0121 NA
2012-07-05 23:02:34 NA 0.029
2012-07-05 23:30:45 NA 0.029")
library(zoo)  # stops with an error if zoo is not installed, unlike require()
na.locf(dat)
# V1 V2 V3 V4
#1 2012-07-05 00:01:19 0.0122 <NA>
#2 2012-07-05 03:19:34 0.0121 <NA>
#3 2012-07-05 03:19:56 0.0121 0.027
#4 2012-07-05 03:20:31 0.0121 0.027
#5 2012-07-05 04:19:56 0.0121 0.028
#6 2012-07-05 04:20:31 0.0121 0.028
#7 2012-07-05 04:20:50 0.0121 0.028
#8 2012-07-05 04:22:29 0.0121 0.027
#9 2012-07-05 04:24:37 0.0121 0.027
#10 2012-07-05 20:48:45 0.0121 0.027
#11 2012-07-05 23:02:34 0.0121 0.029
#12 2012-07-05 23:30:45 0.0121 0.029
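The `<NA>` entries in the output above suggest that the value columns may have been coerced to character when na.locf was applied to the whole data frame. A minimal sketch of one way to avoid that, assuming the columns are named date, X and Y as in the question (the four-row `dat` below is a shortened, made-up subset of the question's data), is to apply na.locf to each value column separately with `na.rm = FALSE` so that leading NAs are kept:

```r
library(zoo)

# Shortened illustration of the question's data; column names are assumed.
dat <- data.frame(
  date = as.POSIXct(c("2012-07-05 00:01:19", "2012-07-05 03:19:56",
                      "2012-07-05 03:20:31", "2012-07-05 23:02:34"),
                    tz = "UTC"),
  X = c(0.0122, 0.0121, 0.0121, NA),
  Y = c(NA, 0.027, NA, 0.029)
)

filled <- dat
# Fill each value column on its own; na.rm = FALSE keeps leading NAs
# (rows with no preceding value) instead of dropping them.
filled[c("X", "Y")] <- lapply(dat[c("X", "Y")], na.locf, na.rm = FALSE)

str(filled)  # X and Y should still be numeric
```

Filling column by column leaves the date column untouched and keeps X and Y as numeric vectors.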
Answer 2:
You could try this:
> library(zoo)
> na.locf(Data)
x y
2012-07-17 0.0122 <NA>
2012-07-18 0.0121 <NA>
2012-07-19 0.0121 0.027
2012-07-20 0.0121 0.027
2012-07-21 0.0121 0.028
2012-07-22 0.0121 0.028
2012-07-23 0.0121 0.028
2012-07-24 0.0121 0.027
2012-07-25 0.0121 0.027
2012-07-26 0.0121 0.027
2012-07-27 0.0121 0.029
2012-07-28 0.0121 0.029
where Data is your data.frame.
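If adding a package is not an option, the same "last observation carried forward" idea can be sketched in base R. The helper name `fill_forward` and the small `dat` data frame are hypothetical, used only for illustration; this is one possible approach, not what either answer proposes:

```r
# Hypothetical helper: carry the last non-NA value forward in a vector.
fill_forward <- function(x) {
  seen <- !is.na(x)
  idx <- cumsum(seen)      # how many non-NA values seen so far (0 = none yet)
  c(NA, x[seen])[idx + 1]  # position 1 is the NA used before any value appears
}

# Made-up three-row example with the question's column names.
dat <- data.frame(X = c(0.0122, 0.0121, NA),
                  Y = c(NA, 0.027, NA))

# dat[] keeps the data.frame structure while replacing every column.
dat[] <- lapply(dat, fill_forward)
print(dat)
```

Leading NAs stay NA because there is no preceding value to copy, which matches the first two rows of the desired output in the question.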
Source: https://stackoverflow.com/questions/11512147/fill-missing-values-in-the-data-frame-with-the-data-from-the-same-data-frame