How do I understand the warnings from rbind?

血红的双手。 submitted on 2020-01-14 14:38:54

Question


If I have two data.frames with the same column names, I can use rbind to combine them into a single data frame. However, if a column is a factor in one and an int in the other, I get a warning like this:

Warning message:
In `[<-.factor`(`*tmp*`, ri, value = c(1L, 1L, 0L, 0L, 0L, 1L, 1L, :
  invalid factor level, NA generated

The following is a simplification of the problem:

t1 <- structure(list(test = structure(c(1L, 1L, 2L, 1L, 1L, 1L, 1L, 
1L, 1L, 2L), .Label = c("False", "True"), class = "factor")), .Names = "test", row.names = c(NA, 
-10L), class = "data.frame")
t2 <- structure(list(test = c(1L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L
)), .Names = "test", row.names = c(NA, -10L), class = "data.frame")
rbind(t1, t2)

With a single column this is easy to understand, but when the column is one of a dozen or more factors, it can be difficult. Is there anything in the warning message that tells me which column to look at? Failing that, what is a good technique for finding the column that is in error?


Answer 1:


You could put together a small comparison script using class and mapply to find the columns where rbind will break down because of non-matching data types, e.g.:

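one <- data.frame(a=1,b=factor(1))
two <- data.frame(b=2,a=2)

# Only columns present in both data frames can clash
common <- intersect(names(one),names(two))
# identical() gives one TRUE/FALSE per column, even for multi-class columns such as POSIXct
mapply(function(x,y) identical(class(x), class(y)), one[common], two[common])

#    a     b 
# TRUE FALSE 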
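
Applied to data shaped like the question's t1 and t2, the same check might look like the sketch below. It is only an illustration: the data frames are shortened re-creations of the ones in the question, same_class is a name introduced here, and vapply with identical() is used (as an assumption about what is most robust) so that each column always produces exactly one TRUE or FALSE.

```r
# Shortened stand-ins for the question's data: a factor column vs. an integer column
t1 <- data.frame(test = factor(c("False", "False", "True")))
t2 <- data.frame(test = c(1L, 1L, 0L))

# Columns present in both data frames
common <- intersect(names(t1), names(t2))

# TRUE where the classes agree, FALSE where they differ
same_class <- vapply(common,
                     function(nm) identical(class(t1[[nm]]), class(t2[[nm]])),
                     logical(1))

# Names of the columns worth inspecting before calling rbind
common[!same_class]
```

With a dozen or more columns, the last line narrows the search to just the names whose classes disagree.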



Answer 2:


Based on thelatemail's answer, here is a function to compare two data.frames for rbinding:

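mergeCompare <- function(one, two) {
  # Columns that appear in only one of the two data frames
  cat("Distinct items: ", setdiff(names(one),names(two)), setdiff(names(two),names(one)), "\n")
  cat("Non-matching items:\n")
  common <- intersect(names(one),names(two))
  # TRUE marks a shared column whose class differs between the two data frames;
  # identical() gives one TRUE/FALSE even for multi-class columns
  print(mapply(function(x,y) !identical(class(x), class(y)), one[common], two[common]))
}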
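
If it helps to see what the classes actually are, rather than only whether they differ, a variant along these lines could return the mismatches as a small table. This is a sketch under assumptions, not part of the original answer: classMismatches and its column names are hypothetical, and multi-class columns are collapsed into a single string with "/" purely for display.

```r
# Hypothetical helper: report the class of each shared column on both sides,
# keeping only the columns where the two disagree
classMismatches <- function(one, two) {
  common <- intersect(names(one), names(two))
  # Collapse multi-class columns (e.g. POSIXct) into one string per column
  cls <- function(df) {
    unname(vapply(df[common],
                  function(x) paste(class(x), collapse = "/"),
                  character(1)))
  }
  out <- data.frame(column = common,
                    one = cls(one),
                    two = cls(two),
                    stringsAsFactors = FALSE)
  out[out$one != out$two, , drop = FALSE]
}

one <- data.frame(a = 1, b = factor(1))
two <- data.frame(b = 2, a = 2)
res <- classMismatches(one, two)
print(res)
```

For the example data only column b is listed, as factor on one side and numeric on the other. Returning a data frame instead of printing makes the result easy to filter or reuse in a script.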


Source: https://stackoverflow.com/questions/28825314/how-do-i-understand-the-warnings-from-rbind
