I am completely new to ggplot (and to some extent R). I have been blown away with the quality of graphs that can be created using ggplot, and I am trying to learn how to cre
dat <- structure(list(id = c(30L, 40L, 50L), f1 = c(0.841933670833,
1.47207692205, 0.823895293045), f2 = c(0.842101814883, 1.48713866811,
0.900091982861), f3 = c(0.842759547545, 1.48717177671, 0.900710334491
), f4 = c(1.88961562347, 1.48729643008, 0.901274168324), f5 = c(1.99808377527,
1.48743226992, 0.901413662472), f6 = c(0.841933670833, 1.48713866811,
0.901413662472)), .Names = c("id", "f1", "f2", "f3", "f4", "f5",
"f6"), class = "data.frame", row.names = c(NA, -3L))
From here I would use melt. Read ?melt.data.frame for more info. But in one sentence, this takes data from a "wide" format to a "long" format.
library(reshape2)
dat.m <- melt(dat, id.vars='id')
> dat.m
   id variable     value
1  30       f1 0.8419337
2  40       f1 1.4720769
3  50       f1 0.8238953
4  30       f2 0.8421018
5  40       f2 1.4871387
6  50       f2 0.9000920
7  30       f3 0.8427595
8  40       f3 1.4871718
9  50       f3 0.9007103
10 30       f4 1.8896156
11 40       f4 1.4872964
12 50       f4 0.9012742
13 30       f5 1.9980838
14 40       f5 1.4874323
15 50       f5 0.9014137
16 30       f6 0.8419337
17 40       f6 1.4871387
18 50       f6 0.9014137
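To make the "wide" to "long" idea concrete, here is a rough base R sketch of the same reshaping done by hand. It is only an illustration, not how melt is implemented; the names wide, measure_cols and long are made up for this example, and the values are rounded from the data above.

```r
# Toy wide data: one row per id, one column per series (rounded values)
wide <- data.frame(
  id = c(30L, 40L, 50L),
  f1 = c(0.84, 1.47, 0.82),
  f2 = c(0.84, 1.49, 0.90)
)

# Every column except the id column holds measurements
measure_cols <- setdiff(names(wide), "id")

# Long format: repeat the ids once per series and stack the series values
long <- data.frame(
  id       = rep(wide$id, times = length(measure_cols)),
  variable = factor(rep(measure_cols, each = nrow(wide)), levels = measure_cols),
  value    = unlist(wide[measure_cols], use.names = FALSE)
)

print(long)
```

Each original row becomes one row per series, so 3 ids and 2 series give 6 rows, in the same id / variable / value layout that melt returns.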
Then plot however you'd like:
library(ggplot2)
ggplot(dat.m, aes(x=id, y=value, colour=variable)) +
  geom_line() +
  # draw points for the f2 series only
  geom_point(data=dat.m[dat.m$variable=='f2',], size=2)
Here aes defines the aesthetics such as the x value, y value, color/colour, etc. Then you add "layers". In the previous example I added a line for what I defined in the ggplot() portion with geom_line(), and added points with geom_point(), where I only put them on the f2 variable.
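The layering idea can also be seen by building the plot one step at a time. This is just a sketch on toy data (rounded from the values above); the object names long, base, with_lines and with_points are made up for illustration.

```r
library(ggplot2)

# Toy long-format data (values rounded from the example above)
long <- data.frame(
  id       = rep(c(30, 40, 50), times = 2),
  variable = rep(c("f1", "f2"), each = 3),
  value    = c(0.84, 1.47, 0.82, 0.84, 1.49, 0.90)
)

# ggplot() sets the data and the default aesthetics, but draws no geometry yet
base <- ggplot(long, aes(x = id, y = value, colour = variable))

# Each `+` adds one layer on top of what is already there
with_lines  <- base + geom_line()
with_points <- with_lines +
  geom_point(data = subset(long, variable == "f2"), size = 2)

print(with_points)
```

Because each step is stored in its own object, you can print base, with_lines and with_points separately to see what each layer contributes.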
Below, I added a smoothed line with geom_smooth(). See the documentation, ?geom_smooth, for a bit more info on what this is doing.
ggplot(dat.m, aes(x=id, y=value, colour=variable)) +
geom_smooth() +
geom_point(data=dat.m[dat.m$variable=='f2',], shape=3)
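One caveat, as far as I can tell: with only three ids per series, the default smoother may warn or fail to fit. A possible workaround is to ask for a straight-line fit instead. The sketch below uses toy data rounded from the values above, and the choice of method = "lm" is my assumption, not something the data requires.

```r
library(ggplot2)

# Toy long-format data (values rounded from the example above)
long <- data.frame(
  id       = rep(c(30, 40, 50), times = 2),
  variable = rep(c("f1", "f2"), each = 3),
  value    = c(0.84, 1.47, 0.82, 0.84, 1.49, 0.90)
)

# Fit one straight line per series; se = FALSE hides the confidence band
p_lm <- ggplot(long, aes(x = id, y = value, colour = variable)) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  geom_point(data = subset(long, variable == "f2"), shape = 3)

print(p_lm)
```

With more observations per series, the default geom_smooth() call from the answer should behave more sensibly.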
Or use shapes for all of the series. Here I put shape in the aesthetics of ggplot(). Aesthetics placed there apply to all successive layers, so you don't have to specify them each time. However, I can override the values supplied in ggplot() in any later layer:
ggplot(dat.m, aes(x=id, y=value, colour=variable, shape=variable)) +
  geom_smooth() +
  geom_point() +
  # colour is set outside aes() so the triangles are drawn in literal red
  geom_point(data=dat, aes(x=id, y=f2), colour='red', size=10, shape=2)
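One detail that is easy to trip over when overriding values in a layer: anything inside aes() is mapped through a scale, while a value given outside aes() is used literally. The sketch below contrasts the two on toy data (rounded from the values above); the names long, wide_f2, p_mapped and p_set are made up for illustration.

```r
library(ggplot2)

# Toy data: long format for the main layers, wide format for the extra layer
long <- data.frame(
  id       = rep(c(30, 40, 50), times = 2),
  variable = rep(c("f1", "f2"), each = 3),
  value    = c(0.84, 1.47, 0.82, 0.84, 1.49, 0.90)
)
wide_f2 <- data.frame(id = c(30, 40, 50), f2 = c(0.84, 1.49, 0.90))

base <- ggplot(long, aes(x = id, y = value, colour = variable)) +
  geom_line()

# Mapped: "red" is treated as a data value, so it gets a colour from the
# scale and its own legend entry, and is not necessarily drawn in red
p_mapped <- base +
  geom_point(data = wide_f2, aes(x = id, y = f2, colour = "red"),
             size = 5, shape = 2)

# Set: colour sits outside aes(), so the points are drawn in literal red
p_set <- base +
  geom_point(data = wide_f2, aes(x = id, y = f2),
             colour = "red", size = 5, shape = 2)

print(p_set)
```

Printing p_mapped and p_set side by side should make the difference obvious in the legend.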
However, a bit of ggplot understanding just takes time. Work through some of the examples given in the documentation and on the ggplot2 website. If your experience is anything like mine, after fighting with it for a few days or weeks it will eventually click. Regarding the data, if you assign your data to dat, the code above will not need to change: dat <- read.csv(...). I don't use data as a variable name because it is already the name of a built-in function.
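A quick way to check the point about data, plus a sketch of reading a CSV into dat. The CSV text here is a made-up stand-in for a real file (values rounded from the data above), so substitute your own file path in read.csv().

```r
# `data` already exists as a function (from the utils package)
print(is.function(data))

# Hypothetical CSV content standing in for a real file on disk
csv_text <- "id,f1,f2
30,0.84,0.84
40,1.47,1.49
50,0.82,0.90"

# read.csv() also accepts the file contents directly through `text`
dat <- read.csv(text = csv_text)
str(dat)
```

R will usually still find the function if you name a data frame data, but a distinct name like dat avoids confusion when reading the code or error messages.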