I am using the patterns()
argument in data.table::melt()
to melt data that has columns that have several easily-defined patterns. It is working, but I'm not seeing how I can create a character index variable instead of the default numeric breakdown.
For example, in A the dog and cat columns are numbered... take a look at at the "variable" column:
A = data.table(idcol = c(1:5),
dog_1 = c(1:5), cat_1 = c(101:105),
dog_2 = c(6:10), cat_2 = c(106:110),
dog_3 = c(11:15), cat_3 = c(111:115))
head(melt(A, measure = patterns("^dog", "^cat"), value.name = c("dog", "cat")))
idcol variable dog cat
1: 1 1 1 101
2: 2 1 2 102
3: 3 1 3 103
4: 4 1 4 104
5: 5 1 5 105
6: 1 2 6 106
However, in B the dog and cat columns are numbered with text, but the "variable" column is still numeric.
B = data.table(idcol = c(1:5),
dog_one = c(1:5), cat_one = c(101:105),
dog_two = c(6:10), cat_two = c(106:110),
dog_three = c(11:15), cat_three = c(111:115))
head(melt(B, measure = patterns("^dog", "^cat"), value.name = c("dog", "cat")))
idcol variable dog cat
1: 1 1 1 101
2: 2 1 2 102
3: 3 1 3 103
4: 4 1 4 104
5: 5 1 5 105
6: 1 2 6 106
How can I fill the "variable" column with the one/two/three instead of 1/2/3?
There might be easier ways, but this seems to work:
# grab suffixes of 'variable' names
suff <- unique(sub('^.*_', '', names(B[ , -1])))
# suff <- unique(tstrsplit(names(B[, -1]), "_")[[2]])
# melt
B2 <- melt(B, measure = patterns("^dog", "^cat"), value.name = c("dog", "cat"))
# replace factor levels in 'variable' with the suffixes
setattr(B2$variable, "levels", suff)
B2
# idcol variable dog cat
# 1: 1 one 1 101
# 2: 2 one 2 102
# 3: 3 one 3 103
# 4: 4 one 4 104
# 5: 5 one 5 105
# 6: 1 two 6 106
# 7: 2 two 7 107
# 8: 3 two 8 108
# 9: 4 two 9 109
# 10: 5 two 10 110
# 11: 1 three 11 111
# 12: 2 three 12 112
# 13: 3 three 13 113
# 14: 4 three 14 114
# 15: 5 three 15 115
Note that there is an open issue on this topic with some other alternatives: FR: expansion of melt functionality for handling names of output.
This is one of the (rare) instances where I believe good'ol base::reshape
is cleaner. Its sep
argument comes in handy here — both the names of the 'value' column and the levels of the 'variable' columns are generated in one go:
reshape(data = B,
varying = names(B[ , -1]),
sep = "_",
direction = "long")
来源:https://stackoverflow.com/questions/41883573/convert-numeric-representation-of-variable-column-to-original-string-following