rbind data.frames without names

Submitted by 霸气de小男生 on 2019-12-10 12:56:32

Question


I am trying to figure out why the rbind function is not working as intended when joining data.frames without names. Here is my testing:

test <- data.frame(
            id=rep(c("a","b"),each=3),
            time=rep(1:3,2),
            black=1:6,
            white=1:6,
            stringsAsFactors=FALSE
            )

# take some subsets with different names
pt1 <- test[,c(1,2,3)]
pt2 <- test[,c(1,2,4)]

# method 1 - rename to same names - works
names(pt2) <- names(pt1)
rbind(pt1,pt2)

# method 2 - works - even with duplicate names
names(pt1) <- letters[c(1,1,1)]
names(pt2) <- letters[c(1,1,1)]
rbind(pt1,pt2)

# method 3 - works  - with a vector of NA's as names
names(pt1) <- rep(NA,ncol(pt1))
names(pt2) <- rep(NA,ncol(pt2))
rbind(pt1,pt2)

# method 4 - but... does not work without names at all?
pt1 <- unname(pt1)
pt2 <- unname(pt2)
rbind(pt1,pt2)

This seems a bit odd to me. Am I missing a good reason why this shouldn't work out of the box?

edit for additional info

Using @JoshO'Brien's suggestion to debug, I can identify the error as occurring during this if statement part of the rbind.data.frame function

if (is.null(pi) || is.na(jj <- pi[[j]]))

(online version of code here: http://svn.r-project.org/R/trunk/src/library/base/R/dataframe.R starting at: "### Here are the methods for rbind and cbind.")

From stepping through the program, the value of pi does not appear to have been set at this point, hence the program tries to index the built-in constant pi like pi[[3]] and errors out.

From what I can figure, the internal pi object doesn't appear to be set due to this earlier line where clabs has been initialized as NULL:

if (is.null(clabs)) clabs <- names(xi) else { #pi gets set here
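The lookup behaviour described above can be reproduced outside of rbind. The following is only a minimal sketch, not the actual rbind.data.frame code: lookup_third is a hypothetical function that assigns a local pi on one branch only, so on the other branch R falls back to the built-in constant pi, which has length 1 and cannot be indexed with [[3]].

```r
# Hypothetical function (not part of base R) that mimics the pattern:
# `pi` is only assigned locally when assign_local is TRUE.
lookup_third <- function(assign_local) {
  if (assign_local) pi <- c("x", "y", "z")  # local pi shadows the constant
  pi[[3]]                                   # otherwise this indexes base::pi
}

# With the local assignment, the third element is found.
with_local <- lookup_third(TRUE)

# Without it, indexing the length-1 constant fails; capture the failure
# instead of stopping the script.
without_local <- tryCatch(lookup_third(FALSE), error = function(e) "error")

print(with_local)
print(without_local)
```

This matches what the debugger shows: nothing has assigned pi inside the function by the time it is indexed, so the global constant is used instead.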

I am in a tangle trying to figure this out, but will update as it comes together.


Answer 1:


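unname() and explicitly assigning NA as column names are not the same operation. unname() removes the names attribute entirely, so names() returns NULL, whereas assigning NA leaves a names vector whose entries are all NA. rbind() relies on the names of the data frames it combines: with all-NA names it still has a names vector to work with and succeeds, but with NULL names it has none to use and fails.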

Here is some code to help see what I mean:

> c1 <- c(1,2,3)
> c2 <- c('A','B','C')
> df1 <- data.frame(c1,c2)
> df1
  c1 c2
1  1  A
2  2  B
3  3  C
> df2 <- data.frame(c1,c2) # df1 & df2 are identical
>
> #Let's perform unname on one data frame &
> #replacement with NA on the other
>
> unname(df1)
  NA NA
1  1  A
2  2  B
3  3  C
> tem1 <- names(unname(df1))
> tem1
NULL
>
> #Please note above that the column headers though showing as NA are null
>
> names(df2) <- rep(NA,ncol(df2))
> df2
  NA NA
1  1  A
2  2  B
3  3  C
> tem2 <- names(df2)
> tem2
[1] NA NA
> 
> #Though unname(df1) & df2 look identical, they aren't
> #Also note difference in tem1 & tem2
>
> identical(unname(df1),df2)
[1] FALSE
> 

I hope this helps. The column headers print as NA in both cases, but the two operations are different.

Hence, two data frames whose column names have been replaced with NA can be combined with rbind(), but two data frames whose names have been removed with unname() cannot.
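If the goal is simply to stack data frames that have no names, one possible workaround is to give them temporary names, bind them, and strip the names again. This is a sketch under assumptions, not a fix to rbind itself: rbind_unnamed is a hypothetical helper, the placeholder names V1, V2, ... are arbitrary, and all inputs are assumed to have the same number of columns in the same order. The test data comes from the question.

```r
test <- data.frame(
  id = rep(c("a", "b"), each = 3),
  time = rep(1:3, 2),
  black = 1:6,
  white = 1:6,
  stringsAsFactors = FALSE
)

# Two subsets with their names removed, as in method 4 of the question
pt1 <- unname(test[, c(1, 2, 3)])
pt2 <- unname(test[, c(1, 2, 4)])

# Hypothetical helper: assign identical placeholder names to every input,
# bind the rows, then remove the names from the result.
rbind_unnamed <- function(...) {
  dfs <- list(...)
  placeholder <- paste0("V", seq_len(ncol(dfs[[1]])))
  dfs <- lapply(dfs, function(d) {
    names(d) <- placeholder
    d
  })
  unname(do.call(rbind, dfs))
}

res <- rbind_unnamed(pt1, pt2)
print(dim(res))
print(names(res))
```

Because the columns are matched purely by position here, this only makes sense when the inputs are known to line up column for column.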



Source: https://stackoverflow.com/questions/13599197/rbind-data-frames-without-names
