reshape2 melt warning message

Submitted by 你说的曾经没有我的故事 on 2019-11-28 16:58:33

Question


I'm using melt and encounter the following warning message:
attributes are not identical across measure variables; they will be dropped

After looking around, I found people saying this happens because the variables are of different classes; however, that is not the case with my dataset.

Here is the dataset:

test <- structure(list(park = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L), .Label = c("miss", "piro", "sacn", "slbe"), class = "factor"), 
    a1.one = structure(c(3L, 1L, 3L, 3L, 3L, 3L, 1L, 3L, 3L, 
    3L), .Label = c("agriculture", "beaver", "development", "flooding", 
    "forest_pathogen", "harvest_00_20", "harvest_30_60", "harvest_70_90", 
    "none"), class = "factor"), a2.one = structure(c(6L, 6L, 
    6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L), .Label = c("development", 
    "forest_pathogen", "harvest_00_20", "harvest_30_60", "harvest_70_90", 
    "none"), class = "factor"), a3.one = structure(c(3L, 3L, 
    3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("forest_pathogen", 
    "harvest_00_20", "none"), class = "factor"), a1.two = structure(c(3L, 
    3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("agriculture", 
    "beaver", "development", "flooding", "forest_pathogen", "harvest_00_20", 
    "harvest_30_60", "harvest_70_90", "none"), class = "factor"), 
    a2.two = structure(c(6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 
    6L), .Label = c("development", "forest_pathogen", "harvest_00_20", 
    "harvest_30_60", "harvest_70_90", "none"), class = "factor"), 
    a3.two = structure(c(3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
    3L), .Label = c("forest_pathogen", "harvest_00_20", "none"
    ), class = "factor")), .Names = c("park", "a1.one", "a2.one", 
"a3.one", "a1.two", "a2.two", "a3.two"), row.names = c(NA, 10L
), class = "data.frame")

And here is the structure:

str(test)
'data.frame':   10 obs. of  7 variables:
 $ park  : Factor w/ 4 levels "miss","piro",..: 1 1 1 1 1 1 1 1 1 1
 $ a1.one: Factor w/ 9 levels "agriculture",..: 3 1 3 3 3 3 1 3 3 3
 $ a2.one: Factor w/ 6 levels "development",..: 6 6 6 6 6 6 6 6 6 6
 $ a3.one: Factor w/ 3 levels "forest_pathogen",..: 3 3 3 3 3 3 3 3 3 3
 $ a1.two: Factor w/ 9 levels "agriculture",..: 3 3 3 3 3 3 3 3 3 3
 $ a2.two: Factor w/ 6 levels "development",..: 6 6 6 6 6 6 6 6 6 6
 $ a3.two: Factor w/ 3 levels "forest_pathogen",..: 3 3 3 3 3 3 3 3 3 3

Is it because the number of levels differs between the variables? If so, can I just ignore the warning message in this case?

To generate the warning message:

library(reshape2)
test.m <- melt(test, id.vars = "park")
Warning message:
attributes are not identical across measure variables; they will be dropped

Thanks.


Answer 1:


An explanation:

When you melt, you are combining multiple columns into one. In this case, you are combining factor columns, each of which has a levels attribute. These levels are not the same across columns because your factors are actually different. melt just coerces each factor to character and drops their attributes when creating the value column in the result.
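The mismatch described above can be checked directly in base R. This is only a sketch with hypothetical vectors (f1, f2 and their re-leveled copies are not from the question), and it approximates the kind of attribute comparison that leads to the warning rather than reproducing reshape2's internals:

```r
# Two hypothetical factors with different level sets
f1 <- factor(c("a", "b", "c"))
f2 <- factor(c("z", "y", "x"))

# Their attributes (levels and class) differ, which is the situation
# the warning is complaining about
attrs_differ <- !identical(attributes(f1), attributes(f2))

# After giving both factors the same level set, the attributes match
f1_same <- factor(c("a", "b", "c"), levels = letters)
f2_same <- factor(c("z", "y", "x"), levels = letters)
attrs_match <- identical(attributes(f1_same), attributes(f2_same))

print(attrs_differ)  # TRUE
print(attrs_match)   # TRUE
```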

In this case the warning doesn't matter, but you need to be very careful when combining columns that are not of the same "type", where "type" does not mean just vector type, but generically the nature of things it refers to. For example, I would not want to melt a column containing speeds in MPH with one containing weights in LBs.

One way to confirm that it is okay to combine your factor columns is to ask yourself whether any possible value in one column would be a reasonable value to have in every other column. If that is the case, then likely the correct thing to do would be to ensure that every factor column has all the possible levels that it could accept (in the same order). If you do this, you will not get a warning when you melt the table.

An illustration:

library(reshape2)
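# stringsAsFactors=TRUE is needed on R >= 4.0.0, where the default became FALSE
DF <- data.frame(id=1:3, x=letters[1:3], y=rev(letters)[1:3], stringsAsFactors=TRUE)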
str(DF)

The levels for x and y are not the same:

'data.frame':  3 obs. of  3 variables:
$ id: int  1 2 3
$ x : Factor w/ 3 levels "a","b","c": 1 2 3
$ y : Factor w/ 3 levels "x","y","z": 3 2 1

Here we melt and look at the column that x and y were melted into (value):

melt(DF, id.vars="id")$value

We get a character vector and a warning:

[1] "a" "b" "c" "z" "y" "x"
Warning message:
attributes are not identical across measure variables; they will be dropped 

If however we reset the factors to have the same levels and only then melt:

DF[2:3] <- lapply(DF[2:3], factor, levels=letters)
melt(DF, id.vars="id", factorsAsStrings=FALSE)$value

We get the correct factor and no warnings:

[1] a b c z y x
Levels: a b c d e f g h i j k l m n o p q r s t u v w x y z

The default behavior of melt is to convert factors to character even when their levels are identical, which is why we turn off factorsAsStrings above. Without that setting you would have gotten a character vector, but no warning. I would argue the default should be to keep the result as a factor, but that is not how melt behaves.
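When the full set of levels is not known in advance, as with the data in the question, one option is to take the union of the levels across the measure columns and re-level every column with it. The following is a minimal sketch under that assumption; the small data frame df and the names measure_cols and all_levels are hypothetical stand-ins, not the original dataset:

```r
library(reshape2)

# Hypothetical miniature of the question's data: factor columns
# whose level sets differ from one another
df <- data.frame(
  park   = factor(c("miss", "miss", "piro")),
  a1.one = factor(c("development", "agriculture", "none")),
  a2.one = factor(c("none", "none", "harvest_00_20"))
)

measure_cols <- setdiff(names(df), "park")

# Union of every level that appears in any measure column
all_levels <- sort(unique(unlist(lapply(df[measure_cols], levels))))

# Re-level each measure column so all share the same levels, in the same order
df[measure_cols] <- lapply(df[measure_cols], factor, levels = all_levels)

# Keep the melted value column as a factor
long <- melt(df, id.vars = "park", factorsAsStrings = FALSE)
str(long)
```

This only makes sense if the check described earlier holds, that is, if any level from one column would be a reasonable value in every other column.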




Answer 2:


BrodieG's answer is excellent; however, there are some cases where it is impractical to reset the levels of every column (for example, GHCN climate data with 128 fixed-width columns that I wanted to melt into a much smaller number of columns).

In that case, the simplest solution is to treat the data as characters rather than factors: for example, you can re-import the data using read.fwf(filename,stringsAsFactors=FALSE) (the same idea would work for read.csv). For a smaller number of columns you could convert factors to strings using d$mystring<-as.character(d$myfactor).
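For data that is already loaded, the same idea can be applied to every factor column at once instead of one column at a time. A minimal sketch, assuming a hypothetical data frame d with an id column and two factor columns (none of these names come from the original data):

```r
library(reshape2)

# Hypothetical data: two factor columns with different level sets
d <- data.frame(
  id = 1:3,
  x  = factor(c("a", "b", "c")),
  y  = factor(c("z", "y", "x"))
)

# Find the factor columns and convert each one to character
is_fac <- vapply(d, is.factor, logical(1))
d[is_fac] <- lapply(d[is_fac], as.character)

# Character columns carry no levels attribute, so there is nothing to drop
long <- melt(d, id.vars = "id")
str(long)
```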



Source: https://stackoverflow.com/questions/25688897/reshape2-melt-warning-message
