Why do identical dataframes become different when changing rownames to the same

Question

I've come across a strange behavior when playing with some data frames: when I create two identical data frames a and b, then assign the row names of a to b, they no longer come out as identical:

rm(list=ls())

a <- data.frame(a=c(1,2,3),b=c(2,3,4))
b <- a
identical(a,b)
#TRUE

identical(rownames(a),rownames(b))
#TRUE

rownames(b) <- rownames(a)

identical(a,b)
#FALSE

Can anyone reproduce/explain why?

Answer 1:

This is admittedly a bit confusing. Starting with ?data.frame we see that:

If row.names was supplied as NULL or no suitable component was found the row names are the integer sequence starting at one (and such row names are considered to be ‘automatic’, and not preserved by as.matrix).

So initially a and b each have an attribute called row.names that is an integer vector:

> str(attributes(a))
List of 3
 $ names    : chr [1:2] "a" "b"
 $ row.names: int [1:3] 1 2 3
 $ class    : chr "data.frame"

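But rownames() returns a character vector (as does dimnames(), which is called under the hood and actually returns a list of character vectors). So after reassigning the row names, b's row.names attribute is a character vector while a's is still integer, which is why identical() returns FALSE. You end up with: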

> str(attributes(b))
List of 3
 $ names    : chr [1:2] "a" "b"
 $ row.names: chr [1:3] "1" "2" "3"
 $ class    : chr "data.frame"
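The type difference described above can be checked directly, and one way to undo it is to assign NULL to the row names, which per ?data.frame gives back the automatic integer sequence. This is a minimal sketch rather than the only possible fix; the variable names type_a, type_b, before and after are introduced here purely for illustration.

```r
# Recreate the situation from the question.
a <- data.frame(a = c(1, 2, 3), b = c(2, 3, 4))
b <- a
rownames(b) <- rownames(a)   # rownames(a) is character, so b now stores characters

# The two row.names attributes differ in type.
type_a <- typeof(attr(a, "row.names"))   # "integer"
type_b <- typeof(attr(b, "row.names"))   # "character"
before <- identical(a, b)                # FALSE, as in the question

# Assigning NULL resets b to automatic (integer) row names.
rownames(b) <- NULL
after <- identical(a, b)                 # TRUE
```

If the goal is only to compare the data and not the row-name storage, resetting the row names on both data frames before calling identical() avoids the mismatch.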

Source: https://stackoverflow.com/questions/42515753/why-do-identical-dataframes-become-different-when-changing-rownames-to-the-same

Tags

dataframe

rowname
