Question
My data frame contains date-time values in the format YYYY-MM-DD HH:MM:SS
across 125,000+ rows, broken down by the minute (each row represents a single minute).
1 2018-01-01 00:04:00
2 2018-01-01 00:05:00
3 2018-01-01 00:06:00
4 2018-01-01 00:07:00
5 2018-01-01 00:08:00
6 2018-01-01 00:09:00
...
124998 2018-03-29 05:07:00
124999 2018-03-29 05:08:00
125000 2018-03-29 05:09:00
I want to subset the data by extracting all of the minute values within any given hour and saving the results into individual data frames.
I have used subset() combined with grepl() to no avail. I have also tried setting the start = and stop = parameters, again to no avail.
For every HH value, I want to extract all rows with that HH value and then create a new data frame for each respective HH value.
For example, I would like to have a data frame that corresponds to every minute's values (the full hour's worth of data), resulting in data frames such as:
2018-01-01 00:00:00 (contains data from 2018-01-01 00:00:00 to 2018-01-01 00:59:00 (inclusive))
2018-01-01 01:00:00 (contains data from 2018-01-01 01:00:00 to 2018-01-01 01:59:00 (inclusive))
and so on.
Is there a quick way to achieve this or is it a more laborious task?
Note: I am aware that my desired result will produce a lot of data frames, and that is fine for my particular project as I will only be working on a single one-hour block at any one time.
Answer 1:
This will produce a list of data frames, one per hour, assuming your data frame is called data, your first column is V1, and V1 is stored as a date-time (POSIXct) rather than as a character string:
split(data, format(data$V1, "%Y-%m-%d %H"))
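As a small illustration of the call above, the sketch below builds a few hours of made-up data and pulls out one hour by name. The column name V1 follows the answer; the sample values, the UTC time zone and the names by_hour and hour_01 are assumptions made for the example.

```r
# Hypothetical sample: 180 one-minute timestamps starting at midnight (UTC assumed)
data <- data.frame(
  V1 = seq(as.POSIXct("2018-01-01 00:00:00", tz = "UTC"),
           by = "min", length.out = 180)
)

# One list element per calendar hour, named by date and hour (e.g. "2018-01-01 00")
by_hour <- split(data, format(data$V1, "%Y-%m-%d %H"))

# List the available hours, then take the data frame for the 01:00 hour
names(by_hour)
hour_01 <- by_hour[["2018-01-01 01"]]
nrow(hour_01)
```

Keeping the pieces in a named list, rather than as separate variables, means a single one-hour block can be looked up by its date and hour whenever it is needed.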
Answer 2:
I have come up with a solution which extracts every minute (MM) row for a given hour from the main data frame:
df <- buckets[grepl("00:\\d+:00$", buckets$time), ]
To separate it for each hour, I will simply change the first 00 depending on which hour I want to focus on, and I can then perform a similar function to extract each individual date value.
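Changing the hour by hand can be avoided by building the pattern with sprintf(). The following is only a sketch: buckets and time follow the answer, while the sample rows and the helper name rows_for_hour are hypothetical, and time is assumed to be a character column. Like the original pattern, it matches the chosen hour on every date.

```r
# Hypothetical sample data; time is stored as character
buckets <- data.frame(
  time = c("2018-01-01 00:04:00", "2018-01-01 00:05:00",
           "2018-01-01 01:00:00", "2018-01-02 00:30:00"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows whose hour field equals `hour` (0-23)
rows_for_hour <- function(df, hour) {
  # e.g. hour = 0 gives the pattern " 00:\d{2}:\d{2}$"
  pattern <- sprintf(" %02d:\\d{2}:\\d{2}$", hour)
  df[grepl(pattern, df$time), , drop = FALSE]
}

hour_00 <- rows_for_hour(buckets, 0)
hour_01 <- rows_for_hour(buckets, 1)
```

To restrict the match to a single date as well, the date could be added to the front of the pattern in the same way.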
Answer 3:
If you want to access each individual date component, lubridate has built-in functions for that.
library(dplyr)      # provides %>% and mutate()
library(lubridate)
data %>% mutate(year = year(x), month = month(x), day = day(x), hour = hour(x))
So you can get the same splits (but in a more cumbersome manner) by doing:
data %>% mutate(year = year(x), month = month(x), day = day(x), hour = hour(x)) %>%
group_by(year, month, day, hour) %>%
split(list(.$year, .$month, .$day, .$hour))
The dummy data:
# 1000 evenly spaced timestamps across four days (the end time must be a valid clock time)
x <- seq(as.POSIXct("2018-01-01 00:00:00"), as.POSIXct("2018-01-04 23:59:59"), length.out = 1000)
data <- data.frame(x)
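One thing to watch when splitting on a list of components is that split() builds every combination of the values, including combinations that never occur in the data. The sketch below shows this on a tiny made-up series. It deliberately swaps in base R's as.POSIXlt() fields in place of lubridate's day() and hour() so that it runs without extra packages; the names lt, parts and pieces and the UTC time zone are assumptions.

```r
# Hypothetical timestamps crossing midnight, 30 minutes apart (UTC assumed)
x <- seq(as.POSIXct("2018-01-01 22:00:00", tz = "UTC"),
         by = "30 min", length.out = 8)
data <- data.frame(x)

# Base R stand-ins for lubridate's day() and hour()
lt <- as.POSIXlt(data$x)
parts <- list(day = lt$mday, hour = lt$hour)

# Without drop = TRUE, unused day/hour combinations come back as empty data frames
all_combinations <- split(data, parts)

# drop = TRUE keeps only the day/hour combinations that actually occur
pieces <- split(data, parts, drop = TRUE)
names(pieces)
```

Here the series covers four distinct hours over two days, so pieces has four elements while all_combinations has eight, four of them empty.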
Source: https://stackoverflow.com/questions/49669327/extract-subset-minute-values-from-each-hour