Question
I'm creating a shiny app in which the user will upload a .csv file that contains several variables. Using dplyr, I will select the first four variables, shown below, and convert them from long to wide format.
DATA
df <- read.table(text = c("
Customer Rate Factor Power
W1 6 TK1 5
W2 3 TK1 0
W3 1 TK1 0
W4 2 TK1 0
W5 4 TK1 0
W6 8 TK1 0
W7 5 TK1 0
W8 7 TK1 3
W1 6 TK2 0
W2 3 TK2 1
W3 1 TK2 0
W4 2 TK2 5
W5 4 TK2 0
W6 8 TK2 0
W7 5 TK2 0
W8 7 TK2 3
W1 6 TK3 0
W2 3 TK3 5
W3 1 TK3 1
W4 2 TK3 0
W5 4 TK3 0
W6 8 TK3 0
W7 5 TK3 0
W8 7 TK3 0
W1 6 TK4 0
W2 3 TK4 3
W3 1 TK4 0
W4 2 TK4 0
W5 4 TK4 0
W6 8 TK4 0
W7 5 TK4 0
W8 7 TK4 0
W1 6 TK5 1
W2 3 TK5 0
W3 1 TK5 5
W4 2 TK5 0
W5 4 TK5 1
W6 8 TK5 0
W7 5 TK5 0
W8 7 TK5 0
W1 6 TK6 0
W2 3 TK6 0
W3 1 TK6 0
W4 2 TK6 0
W5 4 TK6 0
W6 8 TK6 0
W7 5 TK6 5
W8 7 TK6 0
W1 6 TK7 0
W2 3 TK7 0
W3 1 TK7 0
W4 2 TK7 0
W5 4 TK7 0
W6 8 TK7 3
W7 5 TK7 3
W8 7 TK7 0
W1 6 TK8 0
W2 3 TK8 0
W3 1 TK8 1
W4 2 TK8 0
W5 4 TK8 0
W6 8 TK8 3
W7 5 TK8 0
W8 7 TK8 0
W1 6 TK9 0
W2 3 TK9 0
W3 1 TK9 0
W4 2 TK9 0
W5 4 TK9 5
W6 8 TK9 0
W7 5 TK9 0
W8 7 TK9 0
W1 6 TK10 0
W2 3 TK10 0
W3 1 TK10 0
W4 2 TK10 0
W5 4 TK10 0
W6 8 TK10 5
W7 5 TK10 0
W8 7 TK10 0
W1 6 TK11 0
W2 3 TK11 0
W3 1 TK11 0
W4 2 TK11 0
W5 4 TK11 0
W6 8 TK11 0
W7 5 TK11 0
W8 7 TK11 3
W1 6 TK12 0
W2 3 TK12 0
W3 1 TK12 0
W4 2 TK12 0
W5 4 TK12 0
W6 8 TK12 0
W7 5 TK12 0
W8 7 TK12 5"), header = TRUE)
I used the code below to convert from long to wide format:
LONG TO WIDE
library(dplyr)
library(tidyr)
df_wide <- df %>%
  tidyr::spread(Factor, Power)
RESULT
> df_wide
Customer Rate TK1 TK10 TK11 TK12 TK2 TK3 TK4 TK5 TK6 TK7 TK8 TK9
1 W1 6 5 0 0 0 0 0 0 1 0 0 0 0
2 W2 3 0 0 0 0 1 5 3 0 0 0 0 0
3 W3 1 0 0 0 0 0 1 0 5 0 0 1 0
4 W4 2 0 0 0 0 5 0 0 0 0 0 0 0
5 W5 4 0 0 0 0 0 0 0 1 0 0 0 5
6 W6 8 0 5 0 0 0 0 0 0 0 3 3 0
7 W7 5 0 0 0 0 0 0 0 0 5 3 0 0
8 W8 7 3 0 3 5 3 0 0 0 0 0 0 0
The wide format orders the levels of the Factor variable alphabetically, so TK1 is followed by TK10 rather than TK2:
> levels(df$Factor)
[1] "TK1" "TK10" "TK11" "TK12" "TK2" "TK3" "TK4" "TK5" "TK6" "TK7" "TK8" "TK9"
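For context, this ordering comes from how factor() picks its default levels: it sorts the unique values as character strings, so "TK10" lands before "TK2". A minimal sketch with a hypothetical three-element vector x (not part of the data above):

```r
# Hypothetical labels, used only to illustrate the default level order.
x <- c("TK1", "TK2", "TK10")

# factor() uses sort(unique(x)) as its default levels, i.e. string order.
default_levels <- levels(factor(x))
default_levels
# [1] "TK1"  "TK10" "TK2"
```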
I want the levels of Factor to run in numeric order: TK1, TK2, and so on up to TK12.
I can fix this by setting the levels manually, as below:
df$Factor <- factor(df$Factor, levels = c("TK1", "TK2" , "TK3" , "TK4", "TK5" , "TK6" , "TK7" , "TK8" , "TK9", "TK10", "TK11", "TK12"))
However, the number of levels of the Factor variable depends on the user's input. There might be 14, 15 or 20.
QUESTION
Is there any way to arrange the levels of the Factor column from lowest to highest regardless of the user's input?
Answer 1:
We can convert it to a factor with the levels specified explicitly:
df %>%
  mutate(Factor = factor(Factor, levels = paste0("TK", 1:12))) %>%
  spread(Factor, Power)
Or, to make it more dynamic, we extract the non-numeric and numeric parts into separate columns ('Factor1', 'Factor2'), convert 'Factor' to a factor with levels built by pasting the first value of 'Factor1' onto the sequence from the min to the max of 'Factor2', remove 'Factor1' and 'Factor2', and spread. Because only the first value of 'Factor1' is used, this relies on every label sharing the same prefix.
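library(dplyr)
library(tidyr)
res <- df %>%
  # split e.g. "TK10" into Factor1 = "TK" and Factor2 = 10 (convert = TRUE makes it an integer)
  extract(Factor, into = c("Factor1", "Factor2"), "(\\D+)(\\d+)",
          remove = FALSE, convert = TRUE) %>%
  # build the levels in numeric order: prefix followed by min..max of the numeric part
  mutate(Factor = factor(Factor, levels = paste0(Factor1[1],
                                                 min(Factor2):max(Factor2)))) %>%
  select(-Factor1, -Factor2) %>%
  spread(Factor, Power)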
head(res, 2)
# Customer Rate TK1 TK2 TK3 TK4 TK5 TK6 TK7 TK8 TK9 TK10 TK11 TK12
#1 W1 6 5 0 0 0 1 0 0 0 0 0 0 0
#2 W2 3 0 1 5 3 0 0 0 0 0 0 0 0
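As a possible alternative, the same ordering can be sketched in base R by sorting the distinct labels on their numeric suffix. This is not part of the original answer. It assumes each label contains exactly one run of digits, and the toy data frame and the names lbls, suffix and sorted_lbls are hypothetical. Unlike the min:max approach, it only creates levels that actually occur in the data.

```r
# Hypothetical toy data; only the Factor column matters for the ordering.
toy <- data.frame(
  Customer = c("W1", "W1", "W1", "W1"),
  Factor   = c("TK10", "TK2", "TK1", "TK12"),
  Power    = c(0, 3, 5, 1),
  stringsAsFactors = FALSE
)

# Collect the distinct labels and sort them by their numeric suffix.
lbls        <- unique(as.character(toy$Factor))
suffix      <- as.numeric(gsub("\\D", "", lbls))  # drop every non-digit character
sorted_lbls <- lbls[order(suffix)]

# Re-level the column so that reshaping follows numeric order.
toy$Factor <- factor(toy$Factor, levels = sorted_lbls)
levels(toy$Factor)
# [1] "TK1"  "TK2"  "TK10" "TK12"
```

Once the column is re-leveled this way, spread(Factor, Power) should lay the new columns out in level order, as in the result above.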
Source: https://stackoverflow.com/questions/38194583/dplyr-and-tidyr-convert-long-to-wide-format-and-arrange-columns