line graph with 2 categorical variables and 1 continuous in R

牧云@^-^@ 提交于 2019-12-24 04:18:40

问题


I'm quite new to R and statistics in general. I am trying to plot in a line graph 2 categorical variables (part of speech "pos", condition "trcond") and a numerical one (score "totacc") in ggplot2.

> df1<-df[, c("trcond", "subtitle", "pos", "totacc")]
> head(df1)
   trcond     subtitle pos totacc
7       L New Scene_16 lex  0.250
29      N New Scene_16 lex  0.500
8       L New Scene_25 lex  0.875
30      N New Scene_25 lex  0.666
9       L New Scene_29 lex  1.000
31      N New Scene_29 lex  0.833

I have used this ggplot2 command:

>ggplot(data=summdfo, aes(x=pos, y=totacc, group=trcond, colour=trcond))
+ geom_line() + geom_point()

But it is not working, the graph has coloured (blue and red) dots all over the place and more than just two lines linking them. I would like to post the graph I get as I lack words to explain but this is my first post and I don't seem to be able to upload pictures.

I would like to get a standard simple 2-line graph such as the blue and red ones in this page (where y=total bill, by x=time (lunch,dinner) grouped by gender): http://www.cookbook-r.com/Graphs/Bar_and_line_graphs_%28ggplot2%29/

Is this possible with my data set at all? If so, what am I doing wrong with the code?


回答1:


Here I tried to create a data frame based on limited sample from your data.

df1 <- data.frame(trcond=rep(c('L', 'N'), 3), 
                  subtitle=rep('New Scene_29', 6),  # Not in use, just a dummy
                  pos=c('lex', 'lex', 'lex', 'noLex', 'noLex', 'noLex'), 
                  totacc=c(0.250, 0.5, 0.875, 0.666, 1.000, 0.833))

Because trcond by pos is not balanced in this data frame, the plot is going to be jumbled up like this:

ggplot(data=df1, aes(x=pos, y=totacc, group=trcond, color=trcond))+ 
  geom_line() + 
  geom_point()

However, if you apply a summary function which will compute means for each condition, a correct plot will appear:
ggplot(data=df1, aes(x=pos, y=totacc, group=trcond, color=trcond))+ 
  geom_line(stat='summary', fun.y='mean') + 
  geom_point(stat='summary', fun.y='mean')

Again, this is trying to figure out what's in your data. The best is that you provide here a sample of your data using dput(head(df1, 50)) to give you a better answer.

来源:https://stackoverflow.com/questions/29082083/line-graph-with-2-categorical-variables-and-1-continuous-in-r

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!