ggplot to create multi line plot from csv file

Happy的楠姐 2021-01-20 16:21

I am completely new to ggplot (and to some extent R). I have been blown away with the quality of graphs that can be created using ggplot, and I am trying to learn how to cre

1 answer
  •  一生所求
    2021-01-20 17:02

    dat <- structure(list(id = c(30L, 40L, 50L), f1 = c(0.841933670833, 
    1.47207692205, 0.823895293045), f2 = c(0.842101814883, 1.48713866811, 
    0.900091982861), f3 = c(0.842759547545, 1.48717177671, 0.900710334491
    ), f4 = c(1.88961562347, 1.48729643008, 0.901274168324), f5 = c(1.99808377527, 
    1.48743226992, 0.901413662472), f6 = c(0.841933670833, 1.48713866811, 
    0.901413662472)), .Names = c("id", "f1", "f2", "f3", "f4", "f5", 
    "f6"), class = "data.frame", row.names = c(NA, -3L))
    

    From here I would use melt(). See ?melt.data.frame for details; in one sentence, it reshapes data from a "wide" format (one column per series) to a "long" format (one row per observation).

    library(reshape2)
    dat.m <- melt(dat, id.vars='id')
    
    > dat.m
       id variable     value
    1  30       f1 0.8419337
    2  40       f1 1.4720769
    3  50       f1 0.8238953
    4  30       f2 0.8421018
    5  40       f2 1.4871387
    6  50       f2 0.9000920
    7  30       f3 0.8427595
    8  40       f3 1.4871718
    9  50       f3 0.9007103
    10 30       f4 1.8896156
    11 40       f4 1.4872964
    12 50       f4 0.9012742
    13 30       f5 1.9980838
    14 40       f5 1.4874323
    15 50       f5 0.9014137
    16 30       f6 0.8419337
    17 40       f6 1.4871387
    18 50       f6 0.9014137
    > 
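
    To make the wide-to-long idea concrete, here is a minimal base-R sketch of the same reshaping without reshape2. It is not what melt() does internally, just an illustration; the data frame `wide` and its values (rounded from dat above, two columns only) are assumptions for the example.

    ```r
    # Small wide data frame (values rounded from dat above, two series only).
    wide <- data.frame(id = c(30L, 40L, 50L),
                       f1 = c(0.842, 1.472, 0.824),
                       f2 = c(0.842, 1.487, 0.900))

    # stack() concatenates the measurement columns into `values` and
    # records the source column name in `ind`.
    stacked <- stack(wide[, c("f1", "f2")])

    # Repeat the id column once per measurement column so rows line up.
    long <- data.frame(id       = rep(wide$id, times = 2),
                       variable = stacked$ind,
                       value    = stacked$values)

    long
    ```

    Each row of `long` now holds one id, one series name, and one value, which is the shape ggplot expects when colour is mapped to the series.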
    

    Then plot however you'd like:

    ggplot(dat.m, aes(x=id, y=value, colour=variable)) + 
      geom_line() +
      geom_point(data=dat.m[dat.m$variable=='f2',], size=2)
    

    Here aes() defines the aesthetics, such as the x value, y value, and colour. You then add "layers". In the example above, geom_line() draws a line for each series using the mappings defined in ggplot(), and geom_point() adds points only for the f2 variable.
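
    The layering idea can be sketched step by step. This is only an illustration: the data frame `long` is a hypothetical stand-in with the same columns as dat.m, and the intermediate names (p, p_lines, p_both) are made up for the example.

    ```r
    library(ggplot2)

    # Hypothetical long-format data with the same columns as dat.m.
    long <- data.frame(
      id       = rep(c(30, 40, 50), times = 2),
      variable = rep(c("f1", "f2"), each = 3),
      value    = c(0.842, 1.472, 0.824, 0.842, 1.487, 0.900)
    )

    # Base plot: data and aesthetic mappings only, no layers yet.
    p <- ggplot(long, aes(x = id, y = value, colour = variable))

    # Each `+` adds one layer that inherits the mappings from ggplot().
    p_lines <- p + geom_line()

    # A layer can be given its own data; here points are drawn for f2 only.
    p_both <- p_lines + geom_point(data = subset(long, variable == "f2"))

    print(p_both)
    ```

    Storing the intermediate plots in variables is optional, but it can make it easier to see what each layer contributes.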

    Below, I add a smoothed line with geom_smooth(). See ?geom_smooth for more on what it does.

    ggplot(dat.m, aes(x=id, y=value, colour=variable)) + 
      geom_smooth() + 
      geom_point(data=dat.m[dat.m$variable=='f2',], shape=3)
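
    With only three points per series, the default smoother (loess for small data sets) has very little to work with and may emit warnings. As a hedged alternative, assuming a straight-line fit is acceptable for your data, the sketch below asks geom_smooth() for a linear model instead. The data frame `long` is a hypothetical stand-in for dat.m.

    ```r
    library(ggplot2)

    # Hypothetical long-format data with the same columns as dat.m.
    long <- data.frame(
      id       = rep(c(30, 40, 50), times = 2),
      variable = rep(c("f1", "f2"), each = 3),
      value    = c(0.842, 1.472, 0.824, 0.842, 1.487, 0.900)
    )

    # method = "lm" fits one straight line per colour group;
    # se = FALSE hides the confidence ribbon.
    p_lm <- ggplot(long, aes(x = id, y = value, colour = variable)) +
      geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
      geom_point()

    print(p_lm)
    ```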
    

    Or use shapes for all series. Here I put shape in the aesthetics of ggplot(). Aesthetics defined there apply to all subsequent layers, so they don't have to be specified each time. However, I can override the values supplied in ggplot() in any later layer:

    ggplot(dat.m, aes(x=id, y=value, colour=variable, shape=variable)) + 
      geom_smooth() + 
      geom_point() +
      geom_point(data=dat, aes(x=id, y=f2), colour='red', size=10, shape=2)
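
    One detail worth knowing when overriding values in a layer is the difference between mapping an aesthetic inside aes() and setting it as a fixed value outside aes(). The sketch below illustrates it; the data frame `pts` and the object names are assumptions for the example.

    ```r
    library(ggplot2)

    # Hypothetical data: id and one series, values rounded from dat above.
    pts <- data.frame(id = c(30, 40, 50), f2 = c(0.842, 1.487, 0.900))

    # Mapped: "red" is treated as a data value, so it is assigned a colour
    # by the colour scale and shows up as a legend entry.
    mapped <- geom_point(aes(colour = "red"), size = 10, shape = 2)

    # Set: every point in the layer is drawn in red, with no legend entry.
    fixed <- geom_point(colour = "red", size = 10, shape = 2)

    p_mapped <- ggplot(pts, aes(x = id, y = f2)) + mapped
    p_fixed  <- ggplot(pts, aes(x = id, y = f2)) + fixed

    print(p_fixed)
    ```

    Comparing print(p_mapped) with print(p_fixed) shows the difference directly.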
    

    A working understanding of ggplot just takes time. Work through some of the examples in the documentation and on the ggplot2 website. If your experience is anything like mine, after fighting with it for a few days or weeks it will eventually click. Regarding the data: if you assign your data to dat, e.g. dat <- read.csv(...), the code above will not change. I don't use data as a variable name because it is a built-in function.
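
    As a rough sketch of the CSV step, the example below writes a small wide table to a temporary file and reads it back as `dat`. The file, its path, and its contents are hypothetical (values rounded from the data above); with a real file, only the path passed to read.csv() would differ.

    ```r
    # Hypothetical file: write a small wide table to a temporary CSV so
    # the example is self-contained.
    csv_path <- tempfile(fileext = ".csv")
    write.csv(data.frame(id = c(30L, 40L, 50L),
                         f1 = c(0.842, 1.472, 0.824),
                         f2 = c(0.842, 1.487, 0.900)),
              csv_path, row.names = FALSE)

    # Read it back under the name `dat`, as in the answer.
    dat <- read.csv(csv_path)

    # Check column names and types before reshaping and plotting.
    str(dat)
    ```

    Checking str(dat) first is a cheap way to catch columns that were read as text instead of numbers.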
