Make Frequency Histogram for Factor Variables

野趣味 2020-12-01 03:15

I am very new to R, so I apologize for such a basic question. I spent an hour googling this issue but couldn't find a solution.

Say I have some categorical data in

6 Answers
  • 2020-12-01 03:59

    It seems like you want barplot(prop.table(table(animals))):


    However, this is not a histogram.
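
    To make the one-liner above concrete, here is a minimal sketch of what each step computes. The animals vector is borrowed from another answer in this thread and is only assumed to match the question's data.

    ```r
    # Assumed sample data (taken from another answer in this thread)
    animals <- c("cat", "dog", "dog", "dog", "dog", "dog",
                 "dog", "dog", "cat", "cat", "bird")

    counts <- table(animals)      # frequency per category: bird 1, cat 3, dog 7
    props  <- prop.table(counts)  # each count divided by the total, so they sum to 1

    # One bar per category, with bar height equal to the proportion
    barplot(props, ylab = "Proportion")
    ```

    Dropping prop.table() and calling barplot(counts) gives raw counts on the y-axis instead.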

  • 2020-12-01 04:06

    The reason you are getting the unexpected result is that hist(...) calculates the distribution from a numeric vector. In your code, table(animalFactor) behaves like a numeric vector with three elements: 1, 3, 7. So hist(...) plots the number of 1's (1), the number of 3's (1), and the number of 7's (1). @Roland's solution is the simplest.
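
    A small sketch of that behaviour, assuming the question's data matches the animals vector used elsewhere in this thread. The table is converted with as.numeric() here only to make explicit that hist() sees three plain numbers.

    ```r
    # Assumed sample data (taken from another answer in this thread)
    animals <- c("cat", "dog", "dog", "dog", "dog", "dog",
                 "dog", "dog", "cat", "cat", "bird")
    animalFactor <- as.factor(animals)

    counts <- table(animalFactor)   # bird 1, cat 3, dog 7

    # hist() bins the three numbers 1, 3 and 7; it knows nothing about the labels
    h <- hist(as.numeric(counts), plot = FALSE)

    sum(h$counts)   # 3: one observation per category, not 11 animals
    ```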

    Here's a way to do this using ggplot:

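    library(ggplot2)
    # Note: from ggplot2 2.0.0 on, geom_histogram() requires a continuous x
    # and raises an error for categorical data; geom_bar() is the replacement.
    ggp <- ggplot(data.frame(animals),aes(x=animals))
    # counts
    ggp + geom_histogram(fill="lightgreen")
    # proportion
    ggp + geom_histogram(fill="lightblue",aes(y=..count../sum(..count..)))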
    

    You would get precisely the same result using animalFactor instead of animals in the code above.

  • 2020-12-01 04:06

    Country is a categorical variable and I want to see how many occurrences of each country exist in the data set. In other words, how many records/attendees are from each Country.

    barplot(summary(df$Country))
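
    The answer does not show df, so the following is a sketch with a made-up data frame. It relies on Country being a factor: summary() returns per-level counts for a factor, but not for a character column, where table(df$Country) would be the safer choice.

    ```r
    # Hypothetical data; the original df is not shown in this answer
    df <- data.frame(Country = factor(c("US", "UK", "US", "FR", "US", "UK")))

    counts <- summary(df$Country)   # named counts per level: FR 1, UK 2, US 3
    barplot(counts, ylab = "Records")
    ```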
    
  • 2020-12-01 04:14

    You could also use lattice::histogram()
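
    A minimal sketch of that suggestion, assuming the same animals data used elsewhere in this thread. The type argument is optional; "count" is chosen here so the y-axis shows frequencies rather than percentages.

    ```r
    library(lattice)

    # Assumed sample data (taken from another answer in this thread)
    animals <- c("cat", "dog", "dog", "dog", "dog", "dog",
                 "dog", "dog", "cat", "cat", "bird")
    animalFactor <- factor(animals)

    # histogram() accepts a factor in the formula and draws one bar per level
    p <- histogram(~ animalFactor, type = "count")
    print(p)   # lattice plots are objects; print() draws them in scripts
    ```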

  • 2020-12-01 04:15

    If you'd like to do this in ggplot, an API change was made to geom_histogram() that leads to an error: https://github.com/hadley/ggplot2/issues/1465

    To get around this, use geom_bar():

    animals <- c("cat", "dog", "dog", "dog", "dog", "dog", "dog", "dog", "cat", "cat", "bird")
    
    library(ggplot2)
    # counts
    ggplot(data.frame(animals), aes(x=animals)) +
      geom_bar()
    

  • 2020-12-01 04:15

    A factor can be passed directly to plot(), which draws a bar chart of the counts per level.

    An answer to a similar question has been given here: https://stat.ethz.ch/pipermail/r-help/2010-December/261873.html

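     # Draw 50 names at random, with replacement
     x <- sample(c("Richard", "Minnie", "Albert", "Helen", "Joe", "Kingston"),
                 50, replace = TRUE)
     x <- as.factor(x)
     plot(x)  # for a factor, plot() draws a bar chart of level counts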
    