How to create a histogram based on true or false in R?

坚强是说给别人听的谎言 submitted on 2019-12-12 03:29:26

Question


What I'm trying to do is create two histograms in R: one for employees at SeaWorld who negotiated a salary increase and one for those who did not. Could someone please show me where I went wrong? Any help is appreciated.

Here's an example of the text file I'm using.

emp   received   negotiated   gender   year
#325  12.5         TRUE         F      2013
#318  5.2          FALSE        F      2013
#217  9.8          FALSE        M      2013
#223  6.8          TRUE         M      2013
#218  2.1          TRUE         F      2006
#601  13.9         FALSE        M      2006
#225  7.8          TRUE         M      2006
#281  8.5          FALSE        F      2006

Here's the code I have so far:

d<-read.csv("employees.txt", header=TRUE, sep="\t")
str(d)

f1 <- mean(d$received)
f2 <- median(d$received)
f3 <- sd(d$received)



d$gender <- factor(d$gender, labels=c(1, 2))
pairs(d)

plot(d$received ~ d$gender)
plot(d$received ~ d$year, xlab="year", ylab="received")
m <- lm(d$received~d$year)
print(m)
print(f1)
print(f2)
print(f3)
abline(m)
abline(mean(d$received), 0, lty=2)


hist(d$received[d$gender ==1],breaks = 50)
dev.new()
hist(d$received[d$gender ==2],breaks = 50)
dev.new()
#hist(d$year, breaks = 50)
#dev.new()
plot(d$gender, d$received)

Answer 1:


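The # symbols in your data are causing problems for me: read.table() treats # as a comment character by default, so every data row that starts with # is skipped.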

With the # symbol...

d1 <- read.table(text = "
emp   received   negotiated   gender   year
#325  12.5         TRUE         F      2013
#318  5.2          FALSE        F      2013
#217  9.8          FALSE        M      2013
#223  6.8          TRUE         M      2013
#218  2.1          TRUE         F      2006
#601  13.9         FALSE        M      2006
#225  7.8          TRUE         M      2006
#281  8.5          FALSE        F      2006", 
    header = TRUE)

We get an empty data frame...

str(d1)
'data.frame':   0 obs. of  5 variables:
 $ emp       : logi 
 $ received  : logi 
 $ negotiated: logi 
 $ gender    : logi 
 $ year      : logi 

But without the # we get...

d2 <- read.table(text = "
emp   received   negotiated   gender   year
325  12.5         TRUE         F      2013
318  5.2          FALSE        F      2013
217  9.8          FALSE        M      2013
223  6.8          TRUE         M      2013
218  2.1          TRUE         F      2006
601  13.9         FALSE        M      2006
225  7.8          TRUE         M      2006
281  8.5          FALSE        F      2006", 
    header = TRUE)

...the data as expected:

str(d2)
'data.frame':   8 obs. of  5 variables:
 $ emp       : int  325 318 217 223 218 601 225 281
 $ received  : num  12.5 5.2 9.8 6.8 2.1 13.9 7.8 8.5
 $ negotiated: logi  TRUE FALSE FALSE TRUE TRUE FALSE ...
 $ gender    : Factor w/ 2 levels "F","M": 1 1 2 2 1 2 2 1
 $ year      : int  2013 2013 2013 2013 2006 2006 2006 2006
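
If removing the # characters from the file by hand is not practical, one alternative is to turn off comment handling while reading. The sketch below assumes the same sample data as above; the names txt and d3 are made up for illustration. It relies on the comment.char argument of read.table(), which defaults to "#".

```r
# Same sample rows as above, with the leading # kept on each employee id
txt <- "
emp   received   negotiated   gender   year
#325  12.5         TRUE         F      2013
#318  5.2          FALSE        F      2013
#217  9.8          FALSE        M      2013
#223  6.8          TRUE         M      2013
#218  2.1          TRUE         F      2006
#601  13.9         FALSE        M      2006
#225  7.8          TRUE         M      2006
#281  8.5          FALSE        F      2006"

# comment.char = "" tells read.table() not to treat # as a comment marker
d3 <- read.table(text = txt, header = TRUE, comment.char = "",
                 stringsAsFactors = FALSE)

# emp is read as text such as "#325"; strip the # to get an integer id
d3$emp <- as.integer(sub("^#", "", d3$emp))
str(d3)
```

For a file on disk, the same comment.char = "" argument can be passed along with the file name instead of text =.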

As for your question about how to create histograms of the raise each employee received, based on whether they asked for the raise or not:

hist(d$received[d$negotiated == TRUE])
hist(d$received[d$negotiated == FALSE])
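
The two calls above draw one plot after the other, so the second replaces the first on the same device. A possible way to compare them side by side is sketched below. It rebuilds a small data frame d from the sample values in the question so that it runs on its own; the shared bin edges in brks and the panel titles are my own choices, not something from the original post.

```r
# Sample values from the question, kept to the two columns needed here
d <- data.frame(
  received   = c(12.5, 5.2, 9.8, 6.8, 2.1, 13.9, 7.8, 8.5),
  negotiated = c(TRUE, FALSE, FALSE, TRUE, TRUE, FALSE, TRUE, FALSE)
)

# Use the same bin edges in both panels so the bars are comparable
brks <- pretty(range(d$received), n = 8)

# Two panels in one row; par() returns the old settings so they can be restored
op <- par(mfrow = c(1, 2))
h_yes <- hist(d$received[d$negotiated], breaks = brks,
              main = "Negotiated", xlab = "received")
h_no  <- hist(d$received[!d$negotiated], breaks = brks,
              main = "Did not negotiate", xlab = "received")
par(op)
```

Because negotiated is already a logical column, it can index directly: d$negotiated selects the TRUE rows and !d$negotiated the FALSE rows, which is equivalent to the == TRUE and == FALSE comparisons.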


Source: https://stackoverflow.com/questions/19886572/how-to-create-a-histogram-based-on-true-or-false-in-r
