I am using R to plot some data.
Date <- c("07/12/2012 05:00:00", "07/12/2012 06:00:00", "07/12/2012 07:00:00",
"07/12/2012 08:00:00","07/
I don't think R or ggplot2 can know that a data point is missing somewhere unless you mark it explicitly with an NA. For example:
df1 <- rbind(df1, list(strptime("07/12/2012 09:00:00", "%d/%m/%Y %H:%M"), NA))
ggplot(df1, aes(x=Date, y=Counts)) + geom_line(aes(group = 1))
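Since the data in the question is cut off, here is a self-contained sketch of the same idea. The data frame name df_demo and all of the counts are made up for illustration; only the technique (an NA row at the missing timestamp) comes from the answer above. The plotting step is skipped if ggplot2 is not installed.

```r
# Hypothetical hourly counts with the 09:00 reading missing (values are made up).
df_demo <- data.frame(
  Date = as.POSIXct(c("2012-12-07 05:00:00", "2012-12-07 06:00:00",
                      "2012-12-07 07:00:00", "2012-12-07 08:00:00",
                      "2012-12-07 10:00:00", "2012-12-07 11:00:00"),
                    tz = "UTC"),
  Counts = c(0, 3, 10, 6, 5, 4)
)

# Append a placeholder row for the missing hour, with NA as its count.
gap_row <- data.frame(Date = as.POSIXct("2012-12-07 09:00:00", tz = "UTC"),
                      Counts = NA_real_)
df_demo <- rbind(df_demo, gap_row)

# Sorting is only for readability; geom_line() orders points by x anyway.
df_demo <- df_demo[order(df_demo$Date), ]

# The line should now stop at 08:00 and resume at 10:00.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df_demo, aes(x = Date, y = Counts)) + geom_line()
  print(p)
}
```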
You'll have to set group so that the points you'd like connected share a common value. Here, you can set the first 4 values to 1 and the last 2 to 2, and keep them as a factor. That is:
df1$grp <- factor(rep(1:2, c(4,2)))
g <- ggplot(df1, aes(x=Date, y=Counts)) + geom_line(aes(group = grp)) +
geom_point()
Edit: Once you have your data.frame loaded, you can use this code to generate the grp column automatically:
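# time step before each row (1 = one hour for hourly data)
idx <- c(1, diff(df1$Date))
# first row of each continuous run, plus an end marker
i2 <- c(1, which(idx != 1), nrow(df1)+1)
df1$grp <- rep(1:length(diff(i2)), diff(i2))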
Note: It is important to add geom_point() as well. If a continuous range consists of a single entry (for example, when the LAST entry in the data.frame sits alone after a gap), geom_line() won't draw it, because a line needs 2 points to connect. In that case, geom_point() will still plot it.
As an example, I'll generate some data with more gaps:
# generate test data: hourly timestamps from 05:00 to 23:00
library(ggplot2)
set.seed(1234)
df <- data.frame(Date=seq(as.POSIXct("05:00", format="%H:%M"),
                          as.POSIXct("23:00", format="%H:%M"), by="hours"))
df$Counts <- sample(19)
# drop a few rows to create gaps
df <- df[-c(4,7,17,18),]
# generate the groups automatically and plot
idx <- c(1, diff(df$Date))              # time step (in hours) before each row
i2 <- c(1,which(idx != 1), nrow(df)+1)  # start of each run, plus an end marker
df$grp <- rep(1:length(diff(i2)), diff(i2))
g <- ggplot(df, aes(x=Date, y=Counts)) + geom_line(aes(group = grp)) +
    geom_point()
g
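As an alternative sketch of the same grouping step, the run labels can also be built with cumsum(). The helper name run_groups and its step argument are hypothetical, not part of the answer above. It converts the differences to seconds explicitly, so it does not depend on the units that diff() picks automatically. For the test data above it should give the same labels as the idx/i2 code.

```r
# Hypothetical helper: label runs of consecutive timestamps.
# `step` is the expected spacing in seconds (assumed to be 3600 for hourly data).
run_groups <- function(times, step = 3600) {
  gaps <- as.numeric(diff(times), units = "secs")
  # A new run starts at the first row and wherever the spacing differs from `step`.
  cumsum(c(TRUE, gaps != step))
}

# Same shape as the test data: hourly from 05:00 to 23:00 with four rows removed.
# The calendar date is arbitrary.
times <- seq(as.POSIXct("2012-12-07 05:00:00", tz = "UTC"),
             as.POSIXct("2012-12-07 23:00:00", tz = "UTC"), by = "hour")
times <- times[-c(4, 7, 17, 18)]

grp <- run_groups(times)
print(grp)
```

For data sampled every 10 seconds, the same helper would be called with step = 10.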
Edit: For your NEW data (assuming it is df),
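# convert to POSIXct: POSIXlt (what strptime returns) is awkward as a data.frame column
df$t <- as.POSIXct(strptime(paste(df$Date, df$Time), format="%d/%m/%Y %H:%M:%S"))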
idx <- c(10, diff(df$t))
i2 <- c(1,which(idx != 10), nrow(df)+1)
df$grp <- rep(1:length(diff(i2)), diff(i2))
Now plot with aes(x=t, ...).
Juba's answer, to include explicit NAs where you want breaks, is the best approach. Here is an alternative way to introduce those NAs in the right places (without having to work them out manually):
every.hour <- data.frame(Date=seq(min(Date), max(Date), by="1 hour"))
df2 <- merge(df1, every.hour, all=TRUE)
g %+% df2
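To see what the merge does, here is a minimal self-contained sketch. The names obs, full_grid and padded, and all of the counts, are made up for illustration. Merging against a complete hourly grid with all=TRUE should leave NA in Counts for every hour that has no observation.

```r
# Hypothetical hourly data with 08:00 and 09:00 missing (values are made up).
obs <- data.frame(
  Date = as.POSIXct(c("2012-12-07 05:00:00", "2012-12-07 06:00:00",
                      "2012-12-07 07:00:00", "2012-12-07 10:00:00",
                      "2012-12-07 11:00:00"), tz = "UTC"),
  Counts = c(2, 5, 3, 7, 4)
)

# Complete hourly grid spanning the observed range.
full_grid <- data.frame(Date = seq(min(obs$Date), max(obs$Date), by = "1 hour"))

# Full outer join on the shared Date column; unmatched grid rows get NA counts.
padded <- merge(obs, full_grid, all = TRUE)
print(padded)

# The line should break across the padded hours.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(padded, aes(x = Date, y = Counts)) + geom_line() + geom_point()
  print(p)
}
```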
You can do something similar with your later df example, after converting the dates and times into a proper date-time format:
df$DateTime <- as.POSIXct(strptime(paste(df$Date, df$Time),
                                   format="%m/%d/%Y %H:%M:%S"))
every.ten.seconds <- data.frame(DateTime=seq(min(df$DateTime),
                                             max(df$DateTime), by="10 sec"))
df.10 <- merge(df, every.ten.seconds, all=TRUE)