Question:
I am looking to calculate occupancy in an emergency department (ED) with the tidyverse. Occupancy here means: admitted, but did not leave the hospital within the same hour of admission. For example, if I arrive at the ED at 12:00:00 and do not leave within the hour I was admitted, then I am occupying the hospital. So I need to create a new column, Occupancy. (For context: I want to plot occupancy by hour of the day. I know how to plot it, but not how to calculate occupancy, so there is no need to address the plotting.) What I need is to learn how to calculate occupancy from the table below. Please help.
The data has three columns: ID, Adm (admission time) and Disc (discharge time).
    ID = c(101, 102, 103, 104, 105, 106, 107)
    Adm = as.POSIXct(c("2012-01-12 00:52:00", "2012-01-12 00:55:00",
                       "2012-02-12 01:35:00", "2012-02-12 03:24:00",
                       "2012-02-12 04:24:00", "2012-02-12 05:24:00",
                       "2012-02-12 05:28:00"))
    Disc = as.POSIXct(c("2012-01-12 02:00:00", "2012-01-12 02:59:00",
                        "2012-01-12 03:01:00", "2012-01-12 05:01:00",
                        "2012-01-12 06:01:00", "2012-01-12 08:01:00",
                        "2012-01-12 08:01:00"))
    df = data.frame(ID, Adm, Disc)
I have extracted the hour from the admission time so that I can use the new column to calculate occupancy (understood here as admitted but not discharged within the hour of admission). As a reminder, I want to do this with the tidyverse.
    # Adm is already POSIXct, so the hour can be extracted directly
    df_hour <- df %>%
      mutate(Hour_Adm = lubridate::hour(Adm))
Any help is very much appreciated. Thank you.
Answer 1:
The logic is to add 1 hour (i.e. 60*60 seconds) to the Adm time (which is of POSIXct type) and compare it with the Disc time. first() and last() are added for cases where there are multiple rows for an ID; only the earliest Adm and latest Disc time are then considered per ID (this assumes the rows are ordered by time within each ID).
    library(tidyverse)

    df %>%
      group_by(ID) %>%
      mutate(occupancy = ifelse(last(Disc) > first(Adm) + 60*60, 1, 0))
which gives
         ID Adm                 Disc                occupancy
      <dbl> <dttm>              <dttm>                  <dbl>
    1   101 2012-01-12 00:52:00 2012-01-12 02:00:00      1.00
    2   102 2012-01-12 00:55:00 2012-01-12 02:59:00      1.00
    3   103 2012-02-12 01:35:00 2012-01-12 03:01:00      0
    4   104 2012-02-12 03:24:00 2012-01-12 05:01:00      0
    5   105 2012-02-12 04:24:00 2012-01-12 06:01:00      0
    6   106 2012-02-12 05:24:00 2012-01-12 08:01:00      0
    7   107 2012-02-12 05:28:00 2012-01-12 08:01:00      0
Sample data:
    df <- structure(list(
      ID = c(101, 102, 103, 104, 105, 106, 107),
      Adm = structure(c(1326309720, 1326309900, 1328990700, 1328997240,
                        1329000840, 1329004440, 1329004680),
                      class = c("POSIXct", "POSIXt"), tzone = ""),
      Disc = structure(c(1326313800, 1326317340, 1326317460, 1326324660,
                         1326328260, 1326335460, 1326335460),
                       class = c("POSIXct", "POSIXt"), tzone = "")),
      .Names = c("ID", "Adm", "Disc"),
      row.names = c(NA, -7L), class = "data.frame")
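To illustrate what first() and last() contribute when an ID has more than one row, here is a minimal sketch. The data frame `visits` and the IDs 201 and 202 are made up for illustration and are not part of the question's data; the explicit arrange() is an added assumption, since first() and last() simply follow row order.

```r
library(dplyr)

# Hypothetical data: ID 201 has two consecutive records, ID 202 has one.
visits <- data.frame(
  ID   = c(201, 201, 202),
  Adm  = as.POSIXct(c("2012-01-12 00:10:00", "2012-01-12 00:40:00",
                      "2012-01-12 03:00:00"), tz = "UTC"),
  Disc = as.POSIXct(c("2012-01-12 00:40:00", "2012-01-12 01:30:00",
                      "2012-01-12 03:20:00"), tz = "UTC")
)

result <- visits %>%
  arrange(ID, Adm) %>%   # first()/last() depend on row order
  group_by(ID) %>%
  # Compare the latest discharge with the earliest admission plus one hour
  mutate(occupancy = ifelse(last(Disc) > first(Adm) + 60 * 60, 1, 0)) %>%
  ungroup()

result$occupancy
```

Here both rows of ID 201 are flagged, because the span from the earliest admission (00:10) to the latest discharge (01:30) exceeds one hour, even though neither row does on its own. ID 202 stays only 20 minutes and is not flagged.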
Answer 2:
We can try
    library(dplyr)
    library(lubridate)

    df %>%
      group_by(ID) %>%
      mutate(`Stay In (Hours)` = hour(Disc) - hour(Adm),
             Occupancy = ifelse(hour(Disc) - hour(Adm) > 1, 1, 0)) %>%
      ungroup()

    # But notice that `hour` considers only the hour part of the time,
    # as shown below, which may lead to misleading results:
    hour(as.POSIXct(c("2012-01-12 01:40:00"))) - hour(as.POSIXct(c("2012-01-12 00:50:00")))
    [1] 1
The correct approach, I hope, uses difftime:
    df %>%
      group_by(ID) %>%
      mutate(`Stay In (Hours)` = round(difftime(Disc, Adm, units='hours'),2),
             Occupancy = ifelse(difftime(Disc, Adm, units='hours') > 1, 1, 0)) %>%
      ungroup()

    # A tibble: 7 x 5
         ID Adm                 Disc                `Stay In (Hours)` Occupancy
      <dbl> <dttm>              <dttm>              <time>                <dbl>
    1   101 2012-01-12 00:52:00 2012-01-12 02:00:00 1.13                   1.00
    2   102 2012-01-12 00:55:00 2012-01-12 02:59:00 2.07                   1.00
    3   103 2012-01-12 01:35:00 2012-02-12 03:01:00 745.43                 1.00
    4   104 2012-01-12 03:24:00 2012-02-12 05:01:00 745.62                 1.00
    5   105 2012-01-12 04:24:00 2012-02-12 06:01:00 745.62                 1.00
    6   106 2012-01-12 05:24:00 2012-02-12 08:01:00 746.62                 1.00
    7   107 2012-01-12 05:28:00 2012-02-12 08:01:00 746.55                 1.00
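The question's phrase "within the same hour" can be read two ways: a stay longer than 60 minutes, or a stay that crosses into a later clock hour. The sketch below computes both so the difference is visible. The data frame `stays` and the column names `over_60_min` and `crossed_clock_hour` are hypothetical, and which reading is wanted is for the asker to decide. Comparing floor_date() values rather than hour() values keeps the date, so stays that span more than one day are handled.

```r
library(dplyr)
library(lubridate)

# Hypothetical stays chosen to separate the two readings.
stays <- data.frame(
  ID   = c(1, 2, 3),
  Adm  = as.POSIXct(c("2012-01-12 00:50:00", "2012-01-12 02:05:00",
                      "2012-01-12 23:30:00"), tz = "UTC"),
  Disc = as.POSIXct(c("2012-01-12 01:40:00", "2012-01-12 02:55:00",
                      "2012-01-13 23:45:00"), tz = "UTC")
)

stays <- stays %>%
  mutate(
    # Reading 1: the stay lasted more than 60 minutes
    over_60_min = as.numeric(difftime(Disc, Adm, units = "hours")) > 1,
    # Reading 2: discharge fell in a later clock hour than admission
    crossed_clock_hour = floor_date(Disc, "hour") > floor_date(Adm, "hour")
  )

stays
```

ID 1 stays 50 minutes but crosses from hour 00 into hour 01, so the two readings disagree. ID 3 is admitted and discharged at hour 23 on different days; a plain hour(Disc) - hour(Adm) would give 0 there, while both columns above flag it.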