Question
I've come across a strange behavior when playing with some data frames: when I create two identical data frames a and b, then assign the row names of a to b, they no longer come out as identical:
rm(list=ls())
a <- data.frame(a=c(1,2,3),b=c(2,3,4))
b <- a
identical(a,b)
#TRUE
identical(rownames(a),rownames(b))
#TRUE
rownames(b) <- rownames(a)
identical(a,b)
#FALSE
Can anyone reproduce/explain why?
Answer 1:
This is admittedly a bit confusing. Starting with ?data.frame we see that:
If row.names was supplied as NULL or no suitable component was found the row names are the integer sequence starting at one (and such row names are considered to be ‘automatic’, and not preserved by as.matrix).
So initially a and b each have an attribute called row.names that is an integer vector:
> str(attributes(a))
List of 3
$ names : chr [1:2] "a" "b"
$ row.names: int [1:3] 1 2 3
$ class : chr "data.frame"
But rownames() returns a character vector (as does dimnames(), which is called under the hood and actually returns a list of character vectors). So after reassigning the row names, the row.names attribute of b is stored as character, and you end up with:
> str(attributes(b))
List of 3
$ names : chr [1:2] "a" "b"
$ row.names: chr [1:3] "1" "2" "3"
$ class : chr "data.frame"
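Since identical() also compares attributes, the integer versus character row.names is what makes the comparison fail. The following is a minimal sketch of how one might confirm this and undo it, reusing the a and b from the question; resetting with NULL is one possible fix and assumes you want the automatic row names back rather than custom labels.

```r
# Recreate the situation from the question.
a <- data.frame(a = c(1, 2, 3), b = c(2, 3, 4))
b <- a
rownames(b) <- rownames(a)   # stores the character vector "1", "2", "3"

# The accessor hides the difference; the stored attribute does not.
identical(rownames(a), rownames(b))   # TRUE
typeof(attr(a, "row.names"))          # "integer"
typeof(attr(b, "row.names"))          # "character"

# Assigning NULL resets b to automatic (integer) row names.
rownames(b) <- NULL
typeof(attr(b, "row.names"))          # "integer"
identical(a, b)                       # TRUE
```

If the row names carry real labels that must be kept, resetting them is not appropriate; in that case compare the columns and the row names separately instead of relying on identical() for the whole data frame.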
Source: https://stackoverflow.com/questions/42515753/why-do-identical-dataframes-become-different-when-changing-rownames-to-the-same