How to filter a dataset by the time stamp

蓝咒 提交于 2020-01-16 05:30:16

问题


I'm working with some bird GPS tracking data, and I would like to exclude points based on the time stamp.

Some background information- the GPS loggers track each bird for just over 24 hours, starting in the evening, and continuing through the night and the following day. What I would like to do is exclude points taken after 9:30pm on the day AFTER deployment (so removing points from the very end of the track). As an R novice, I'm struggling because the deployment dates differ for each bird, so I can't simply use subset() for a specific date and time.

An example of my dataframe (df):

BirdID    x             y           Datetime
15K12     492719.9      5634805     2015-06-23 18:25:00
15K12     492491.5      5635018     2015-06-23 18:27:00
15K70     455979.1      5653581     2015-06-24 19:54:00  
15K70     456040.9      5653668     2015-06-24 19:59:00

So, pretending these points represent the start of the GPS track for each animal, I would like to remove points after 9:30 pm on June 24 for bird 15K12, and after 9:30 on June 25 for bird 15K70.

Any ideas?


回答1:


First, check if df$Datetime is a date variable:

class(df$Datetime)

If it's not, you can convert it with this:

df$Datetime <- ymd_hms(df&Datetime)

You use mutate to create a new variable called newdate that takes the earliest date of the bird's data and sets the date for cutoff which is the next day at 21:30:00 of the earliest date of a bird's observations.

Then you filter the Datetime column by the newdate column and you get the observations that are found earlier that the specified date.

library(dplyr); library(lubridate)
df %>% 
  group_by(BirdID) %>%
  mutate(newdate = as.POSIXct(date(min(Datetime)) + days(1) + hours(21) + minutes(30))) %>% 
  filter(Datetime < newdate)

Did a reproducible example:

library(dplyr); library(lubridate)

set.seed(1)

# Create a data frame (1000 observations)
BirdID <- paste(rep(floor(runif(250, 1, 20)),4),
  rep("k", 1000), rep(floor(runif(250, 1, 40)),4), sep = "")
x <- rnorm(1000, mean = 47000, sd = 2000)
y <- rnorm(1000, mean = 5650000, sd = 300000)
Datetime <- as.POSIXct(rnorm(1000, mean = as.numeric(as.POSIXct("2015-06-23 18:25:00")), sd = 99999), tz = "GMT", origin = "1970-01-01")
df <- data.frame(BirdID, x, y, Datetime, stringsAsFactors = FALSE)

# Filter the data frame by the specified date
df_filtered <- df %>% 
  group_by(BirdID) %>%
  mutate(newdate = as.POSIXct(date(min(Datetime)) + days(1) + hours(21) + minutes(30))) %>% 
  filter(Datetime < newdate)

This should fix any problem.



来源:https://stackoverflow.com/questions/37909597/how-to-filter-a-dataset-by-the-time-stamp

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!