How to reorder a factor in a data frame with fct_reorder?

花落未央 2021-01-23 06:07

Consider the following example:

> library(forcats)
> library(dplyr)
> 
> 
> dataframe <- data_frame(var = c(1,1,1,2,3,4),
+                              


        
2 Answers
  • 2021-01-23 06:29

    Suppose your dataframe is:

    # tibble() replaces the deprecated data_frame()
    dataframe <- tibble(var = c(1,1,1,2,3,4), var2 = c(10,2,0,15,6,5))
    dataframe <- dataframe %>% mutate(myfactor = factor(var))
    dataframe$myfactor
    
    [1] 1 1 1 2 3 4
    Levels: 1 2 3 4
    

    To reorder the factor's levels by the result of a summary function .fun applied, within each level, to another vector .x, use fct_reorder like this (current forcats names the arguments .f, .x and .fun):

    dataframe$myfactor <- fct_reorder(.f = dataframe$myfactor, .x = dataframe$var2, .fun = mean)
    dataframe$myfactor
    [1] 1 1 1 2 3 4
    Levels: 1 4 3 2
    

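    The mean of dataframe$var2 is calculated for each level of the factor (4, 15, 6 and 5 for levels 1, 2, 3 and 4), and the levels are then sorted by that value in ascending order, which is the default. Pass .desc = TRUE for descending order; if .fun is omitted, fct_reorder uses median.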
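    As a rough check of that ordering, the per-level means can be computed by hand with base R's tapply() and compared with the levels fct_reorder() returns. This is only a sketch: the names f, x, level_means, by_hand and reordered are placeholders introduced here, not part of the original answer.

```r
library(forcats)

# Same values as the data frame above; `f` and `x` are illustrative names.
f <- factor(c(1, 1, 1, 2, 3, 4))
x <- c(10, 2, 0, 15, 6, 5)

# Mean of x within each level of f
level_means <- tapply(x, f, mean)
print(level_means)

# Level names sorted by ascending mean, computed by hand
by_hand <- names(sort(level_means))

# The same ordering via fct_reorder()
reordered <- fct_reorder(f, x, .fun = mean)
print(identical(levels(reordered), by_hand))

# .desc = TRUE sorts the levels by descending mean instead
print(levels(fct_reorder(f, x, .fun = mean, .desc = TRUE)))
```

    Only the level order changes; the values stored in the factor stay as they were.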

  • 2021-01-23 06:39

    To understand fct_reorder, I created a similar but modified data frame.

    > dataframe <- tibble(var = as.factor(c(1,2,3,2,3,1,4,1,2,3,4)),var2 = c(1,5,4,2,6,2,9,8,7,6,3))
    
    > str(dataframe)
    Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   11 obs. of  2 variables:
     $ var : Factor w/ 4 levels "1","2","3","4": 1 2 3 2 3 1 4 1 2 3 ...
     $ var2: num  1 5 4 2 6 2 9 8 7 6 ...
    

    Here we can see that there are two columns, and that the first one (var) is a factor with levels 1, 2, 3 and 4.

    Now suppose you want to reorder the factor levels by the sum of their respective var2 values. The fct_reorder function does this.

    To see the difference it makes, first sum var2 for each level of var without using fct_reorder:

    > dataframe %>% group_by(var) %>% summarise(var2=sum(var2))
    # A tibble: 4 x 2
      var    var2
      <fct> <dbl>
    1 1        11
    2 2        14
    3 3        16
    4 4        12
    

    Here the rows follow the factor's original level order (1, 2, 3, 4), not the sum of var2.

    Now use fct_reorder to see the difference.

    > dataframe %>% mutate(var=fct_reorder(var,var2,sum)) %>%
    + group_by(var) %>% summarise(var2=sum(var2))
    # A tibble: 4 x 2
      var    var2
      <fct> <dbl>
    1 1        11
    2 4        12
    3 2        14
    4 3        16
    

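    The rows are now in ascending order of the sum, because summarise() lists the groups in the factor's level order and fct_reorder has changed that order.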
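    One way to see exactly what changed is to compare levels() before and after the call. The sketch below uses plain vectors with the same values as the data frame in this answer; the name reordered is introduced here for illustration.

```r
library(forcats)

# Same values as the data frame above, kept as plain vectors
var  <- factor(c(1, 2, 3, 2, 3, 1, 4, 1, 2, 3, 4))
var2 <- c(1, 5, 4, 2, 6, 2, 9, 8, 7, 6, 3)

# Reorder the levels of var by the sum of var2 within each level
reordered <- fct_reorder(var, var2, .fun = sum)

print(levels(var))        # original level order
print(levels(reordered))  # level order after reordering by sum

# The observations themselves are unchanged; only the level order differs
print(all(as.character(var) == as.character(reordered)))
```

    With the sums shown earlier (11, 14, 16, 12), the reordered levels come out as 1, 4, 2, 3.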

    Likewise, fct_reorder can be used to draw plots (boxplots, bar charts, etc.) with the categories in a meaningful order.
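    A minimal plotting sketch, assuming the ggplot2 package is installed (the original answer does not name a plotting library). The names df and p and the axis labels are made up for this example; the data are the same as above.

```r
library(forcats)
library(ggplot2)

# Same values as the data frame above; `df` is an illustrative name.
df <- data.frame(
  var  = factor(c(1, 2, 3, 2, 3, 1, 4, 1, 2, 3, 4)),
  var2 = c(1, 5, 4, 2, 6, 2, 9, 8, 7, 6, 3)
)

# Discrete axes follow the factor's level order, so reordering var
# by the median of var2 arranges the boxes by their medians.
p <- ggplot(df, aes(x = fct_reorder(var, var2, .fun = median), y = var2)) +
  geom_boxplot() +
  labs(x = "var (ordered by median of var2)", y = "var2")

print(p)
```

    Reordering inside aes() leaves the data frame itself untouched, which is convenient when the ordering is only needed for one plot.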
