问题
I have data of this wide format that I want to convert to long format
Cond Construct Line Plant Tube_shoot weight_shoot Tube_root weight_root
1 Standard NA NA 2 199.95 - -
2 Cd0 IIF 43.1 1 3 51.87 4 10.39
3 Cd0 IIF 43.1 2 5 81.80 6 15.05
4 Cd0 IIF 43.1 3 7 101.56 8 16.70
What I basically want is to keep Tube_shoot and weight_shoot together, i.e. treat these two columns as one for melt. But because I can only use
id.vars=c("Cond","Construct","Line","Plant")
the result is not quite what I want.
I have two (ugly) solutions so far:
I melt twice, first by measure.vars=c("Tube_shoot", "Tube_root" ), then by weights, and later remove half of the rows where the result is simply wrong. This is not feasible for me because I have different lengths of data, and I'd always have to check if I'm taking the right rows out.
I paste "tube" with "weight" into a new column, take out the others, melt them, and later take them apart again.
Copy them one by one in excel. But with hundreds of lines I'd rather learn how to do it in R.
I'm sure that there is a better way.
What I want in the end:
Cond Construct Line Plant Tube weight
1 Standard NA NA 2 199.95
2 Cd0 IIF 43.1 1 3 51.87
3 Cd0 IIF 43.1 2 5 81.80
4 Cd0 IIF 43.1 3 7 101.56
2 Cd0 IIF 43.1 1 4 10.39
3 Cd0 IIF 43.1 2 6 15.05
4 Cd0 IIF 43.1 3 8 16.70
回答1:
You could try
res <- reshape(df1, idvar=c('Cond', 'Construct', 'Line', 'Plant'),
varying=5:8, direction='long', sep="_")
res1 <- res[res$weight!='-', -5]
row.names(res1) <- NULL
res1
# Cond Construct Line Plant Tube weight_shoot
#1 Standard NA NA 2 199.95
#2 Cd0 IIF 43.1 1 3 51.87
#3 Cd0 IIF 43.1 2 5 81.8
#4 Cd0 IIF 43.1 3 7 101.56
#5 Cd0 IIF 43.1 1 4 10.39
#6 Cd0 IIF 43.1 2 6 15.05
#7 Cd0 IIF 43.1 3 8 16.70
data
df1 <- structure(list(Cond = c("Standard", "Cd0", "Cd0", "Cd0"),
Construct = c("", "IIF", "IIF", "IIF"), Line = c(NA, 43.1, 43.1, 43.1),
Plant = c(NA, 1L, 2L, 3L), Tube_shoot = c(2L, 3L, 5L, 7L), weight_shoot =
c(199.95,51.87, 81.8, 101.56), Tube_root = c("-", "4", "6", "8"),
weight_root = c("-", "10.39", "15.05", "16.70")), .Names = c("Cond",
"Construct", "Line", "Plant", "Tube_shoot", "weight_shoot", "Tube_root",
"weight_root"), class = "data.frame", row.names = c("1", "2", "3", "4"))
回答2:
Another option, using dplyr and tidyr:
library(dplyr)
libarary(tidyr)
gather(df1, x, Tube, c(Tube_shoot, Tube_root)) %>%
mutate(weight = ifelse(grepl("*root$", x), weight_root, weight_shoot)) %>%
select(-c(weight_shoot, weight_root, x))
# Cond Construct Line Plant Tube weight
#1 Standard NA NA 2 199.95
#2 Cd0 IIF 43.1 1 3 51.87
#3 Cd0 IIF 43.1 2 5 81.8
#4 Cd0 IIF 43.1 3 7 101.56
#5 Standard NA NA - -
#6 Cd0 IIF 43.1 1 4 10.39
#7 Cd0 IIF 43.1 2 6 15.05
#8 Cd0 IIF 43.1 3 8 16.70
回答3:
You may want to consider merged.stack
from my "splitstackshape" package, with which you can do something like:
library(splitstackshape)
merged.stack(as.data.table(df1, keep.rownames = TRUE),
var.stubs = c("Tube", "weight"), sep = "_")
# rn Cond Construct Line Plant .time_1 Tube weight
# 1: 1 Standard NA NA root - -
# 2: 1 Standard NA NA shoot 2 199.95
# 3: 2 Cd0 IIF 43.1 1 root 4 10.39
# 4: 2 Cd0 IIF 43.1 1 shoot 3 51.87
# 5: 3 Cd0 IIF 43.1 2 root 6 15.05
# 6: 3 Cd0 IIF 43.1 2 shoot 5 81.8
# 7: 4 Cd0 IIF 43.1 3 root 8 16.70
# 8: 4 Cd0 IIF 43.1 3 shoot 7 101.56
Of course, you can also add a [Tube != "-" | weight != "-"]
to the end to remove the rows where "Tube" or "weight" have "-"... but note that doing so doesn't magically convert those columns to numeric :-)
来源:https://stackoverflow.com/questions/27124501/how-can-i-melt-columns-while-keeping-two-together