Melt an array and make numeric values character

…衆ロ難τιáo~ submitted on 2019-12-11 05:47:56

Question


I have an array that I want to melt based on its dimnames. The problem is that the dimension names are large numeric values, so converting them to character produces the wrong IDs. See the example:

test <- array(1:18, dim = c(3,3,2), dimnames = list(c(00901291282245454545454,329293929929292,2929992929922929),
                                                   c("a", "b", "c"),
                                                   c("d", "e")))

library(reshape2)
library(data.table)
test2 <- data.table(melt(test))
test2[, Var1 := as.character(Var1)]

> test2
Var1 Var2 Var3 value
1: 9.01291282245455e+20    a    d     1
2:      329293929929292    a    d     2
3:     2929992929922929    a    d     3
4: 9.01291282245455e+20    b    d     4
5:      329293929929292    b    d     5
6:     2929992929922929    b    d     6
7: 9.01291282245455e+20    c    d     7
8:      329293929929292    c    d     8
9:     2929992929922929    c    d     9
10: 9.01291282245455e+20    a    e    10
11:      329293929929292    a    e    11
12:     2929992929922929    a    e    12
13: 9.01291282245455e+20    b    e    13
14:      329293929929292    b    e    14
15:     2929992929922929    b    e    15
16: 9.01291282245455e+20    c    e    16
17:      329293929929292    c    e    17
18:     2929992929922929    c    e    18

How can I make the first column, which holds the large IDs, character? What I currently do is paste a letter onto the dimnames, melt, convert the column to character, and then take a substring, which is really inefficient. The solution needs to be efficient because the dataset has millions of rows. There are two problems: first, leading 0s in an ID are dropped, and second, the ID is converted to an e+20-style character string.


Answer 1:


You need to define your dimnames as character and then slightly modify melt.array, which is what gets called when you use melt on your array:

test <- array(1:18, dim = c(3,3,2), dimnames = list(c("00901291282245454545454", "329293929929292", "2929992929922929"),
                                                    c("a", "b", "c"),
                                                    c("d", "e")))
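Why the quotes matter can be checked in base R. The sketch below is only an illustration (the names ids_num, ids_chr, a_num and a_chr are made up for it): array() stores dimnames as character, so unquoted IDs have already been converted, and damaged, before melt is ever called.

```r
# Hypothetical names, for illustration only
ids_num <- c(00901291282245454545454, 329293929929292, 2929992929922929)
ids_chr <- c("00901291282245454545454", "329293929929292", "2929992929922929")

a_num <- array(1:18, dim = c(3, 3, 2),
               dimnames = list(ids_num, c("a", "b", "c"), c("d", "e")))
a_chr <- array(1:18, dim = c(3, 3, 2),
               dimnames = list(ids_chr, c("a", "b", "c"), c("d", "e")))

# dimnames end up as character in both arrays ...
is.character(dimnames(a_num)[[1]])
is.character(dimnames(a_chr)[[1]])

# ... but only the quoted version still holds the original ID text
identical(dimnames(a_chr)[[1]], ids_chr)
dimnames(a_num)[[1]][1] == ids_chr[1]
```

So quoting the IDs is a precondition: no change to melt can bring back digits that were lost when the array was built.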

Customise melt.array by adding a parameter that lets you decide whether you want the conversion or not:

melt.array2 <- function (data, varnames = names(dimnames(data)), conv=TRUE, ...) 
{
    values <- as.vector(data)
    dn <- dimnames(data)
    if (is.null(dn)) 
        dn <- vector("list", length(dim(data)))
    dn_missing <- sapply(dn, is.null)
    dn[dn_missing] <- lapply(dim(data), function(x) 1:x)[dn_missing]
    if(conv){ # conv is the new parameter to know if conversion needs to be done
        char <- sapply(dn, is.character)
        dn[char] <- lapply(dn[char], type.convert)
    }
    indices <- do.call(expand.grid, dn)
    names(indices) <- varnames
    data.frame(indices, value = values)
}

Try the new function on your example (with conv=FALSE):

head(melt.array2(test, conv=FALSE))
#                         X1 X2 X3 value
# 1  00901291282245454545454  a  d     1
# 2          329293929929292  a  d     2
# 3         2929992929922929  a  d     3
# 4  00901291282245454545454  b  d     4
# 5          329293929929292  b  d     5
# 6         2929992929922929  b  d     6
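The step that conv=FALSE skips can be looked at in isolation. A minimal base R sketch (ids and converted are illustrative names; as.is = TRUE is passed here only so the result is not turned into a factor):

```r
# The same character IDs as in the dimnames above
ids <- c("00901291282245454545454", "329293929929292", "2929992929922929")

# type.convert() guesses a type from the strings; these all look numeric
converted <- type.convert(ids, as.is = TRUE)

is.numeric(converted)                 # the strings became doubles
as.character(converted)[1] == ids[1]  # the original ID text cannot be recovered
```

This is the conversion that rewrites the IDs, which is why melt.array2 only runs it when conv is TRUE.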

EDIT

In the development version of reshape2 (devtools::install_github("hadley/reshape")), melt.array is defined differently, and you can use the as.is parameter to avoid the conversion:

melt(test, as.is=TRUE)

will give you the same result as above (with Var1 etc instead of X1 etc).
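If installing a development version is not an option, a base R route may also work. This is an alternative technique, not the one from the answer: as.data.frame() on a table with stringsAsFactors = FALSE keeps the dimnames as character. The name long is illustrative, and speed on millions of rows has not been checked here.

```r
test <- array(1:18, dim = c(3, 3, 2),
              dimnames = list(c("00901291282245454545454", "329293929929292", "2929992929922929"),
                              c("a", "b", "c"),
                              c("d", "e")))

# as.table() only changes the class; as.data.frame() then expands the dimnames
# into one column per dimension (Var1, Var2, Var3) plus the cell values.
long <- as.data.frame(as.table(test),
                      stringsAsFactors = FALSE,  # keep the IDs as character
                      responseName = "value")    # name the values column "value" instead of "Freq"

head(long)
str(long$Var1)
```

The result is a plain data.frame, so it can be passed to data.table() as in the question.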



Source: https://stackoverflow.com/questions/40587796/melt-a-array-and-make-numeric-values-character
