Question
I remember reading somewhere that as.tibble() is an alias for as_data_frame(), but I don't know what exactly an alias is in programming terminology. Is it similar to a wrapper?
So I guess my question probably comes down to the difference in possible usages between tbl_df() and as_data_frame(): what are the differences between them, if any?
More specifically, given a (non-tibble) data frame df, I often turn it into a tibble by using:
df <- tbl_df(df)
Wouldn't
df <- as_data_frame(df)
do the same thing? If so, are there other cases where the two functions tbl_df() and as_data_frame() cannot be used interchangeably to get the same result?
The R documentation says that tbl_df() forwards its argument to as_data_frame(). Does that mean that tbl_df() is a wrapper or alias for as_data_frame()? The R documentation doesn't seem to say anything about as.tibble(), and I forgot where I read that it was an alias for as_data_frame(). Also, apparently as_tibble() is another alias for as_data_frame().
If these four functions really are all the same function, what is the sense in giving one function four different names? Isn't that more confusing than helpful?
Answer 1:
To answer your question of "whether it is confusing", I think so :) .
as.tibble and as_tibble are the same; both simply dispatch on the S3 generic as_tibble:
> as.tibble
function (x, ...)
{
    UseMethod("as_tibble")
}
<environment: namespace:tibble>
as_data_frame and tbl_df are not exactly the same; tbl_df calls as_data_frame:
> tbl_df
function (data)
{
    as_data_frame(data)
}
<environment: namespace:dplyr>
Note that tbl_df is in dplyr, while as_data_frame is in the tibble package:
> as_data_frame
function (x, ...)
{
    UseMethod("as_data_frame")
}
<environment: namespace:tibble>
but of course it calls the same function, so they are "the same", or aliases as you say.
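To make the alias versus wrapper distinction concrete, here is a minimal base R sketch. All names in it (describe, describe_alias, describe_dot, describe_wrapper) are hypothetical and are not part of tibble or dplyr; they only mimic the shapes of the functions printed above.

```r
# Hypothetical S3 generic with two methods (illustrative names only).
describe <- function(x, ...) UseMethod("describe")
describe.default <- function(x, ...) "something else"
describe.data.frame <- function(x, ...) "a data frame"

# Alias in the strict sense: a second name bound to the same function object.
describe_alias <- describe

# Same shape as as.tibble above: a differently named function whose body
# dispatches on the "describe" generic, so it reaches the describe.* methods.
describe_dot <- function(x, ...) UseMethod("describe")

# Wrapper, same shape as tbl_df above: a new function that forwards its argument.
describe_wrapper <- function(data) describe(data)

identical(describe_alias, describe)    # same object
identical(describe_wrapper, describe)  # different function, same behaviour here

describe(mtcars)
describe_dot(mtcars)
describe_wrapper(mtcars)
describe_wrapper(1:3)
```

All three routes end up in the same method for a data frame, which is why the functions in this answer behave alike even though only some of them are aliases in the strict sense.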
Now, we can look at the differences between the generics as_tibble and as_data_frame. First, we look at the methods of each:
> methods(as_tibble)
[1] as_tibble.data.frame* as_tibble.default* as_tibble.list* as_tibble.matrix* as_tibble.NULL*
[6] as_tibble.poly* as_tibble.table* as_tibble.tbl_df* as_tibble.ts*
see '?methods' for accessing help and source code
> methods(as_data_frame)
[1] as_data_frame.data.frame* as_data_frame.default* as_data_frame.grouped_df* as_data_frame.list*
[5] as_data_frame.matrix* as_data_frame.NULL* as_data_frame.table* as_data_frame.tbl_cube*
[9] as_data_frame.tbl_df*
see '?methods' for accessing help and source code
If you check out the code for as_tibble, you can see the definitions for many of the as_data_frame methods there as well. as_tibble defines two additional methods which aren't defined for as_data_frame: as_tibble.ts and as_tibble.poly. I'm not really sure why they couldn't also be defined for as_data_frame.
as_data_frame has two additional methods, which are both defined in dplyr: as_data_frame.tbl_cube and as_data_frame.grouped_df.
as_data_frame.tbl_cube uses the weaker checking of as.data.frame (yes, bear with me) and then calls as_data_frame on the result:
> getAnywhere(as_data_frame.tbl_cube)
function (x, ...)
{
    as_data_frame(as.data.frame(x, ..., stringsAsFactors = FALSE))
}
<environment: namespace:dplyr>
while as_data_frame.grouped_df ungroups the data frame passed to it.
Overall, it seems that as_data_frame should be seen as providing additional functionality over as_tibble, unless you are dealing with ts or poly objects.
Answer 2:
According to the introduction to tibble, it seems like tibbles supersede tbl_df.
I’m pleased to announce tibble, a new package for manipulating and printing data frames in R. Tibbles are a modern reimagining of the data.frame, keeping what time has proven to be effective, and throwing out what is not. The name comes from dplyr: originally you created these objects with tbl_df(), which was most easily pronounced as “tibble diff”. [...] This package extracts out the tbl_df class associated functions from dplyr.
To add to the confusion, tbl_df now calls as_tibble, which is the preferred alias for as_data_frame and as.tibble (see Hadley Wickham's comment on the issue, and the as_tibble docs):
> tbl_df
function (data)
{
    as_tibble(data, .name_repair = "check_unique")
}
According to the help description of tbl_df(), it is deprecated and tibble::as_tibble() should be used instead. The as_data_frame and as.tibble help pages both redirect to as_tibble.
When calling class on a tibble, the class name still shows up as tbl_df:
> as_tibble(mtcars) %>% class
[1] "tbl_df" "tbl" "data.frame"
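Following the recommendation above, a minimal sketch of converting a plain data frame with tibble::as_tibble() might look like this. It assumes the tibble package is installed; the data frame df is made up for illustration.

```r
library(tibble)

# A plain (non-tibble) data frame, invented for this example.
df <- data.frame(x = 1:3, y = c("a", "b", "c"))

# Preferred conversion; no need for tbl_df(), as_data_frame() or as.tibble().
tb <- as_tibble(df)

class(df)               # plain data frame
class(tb)               # the tibble class is still named tbl_df
is_tibble(tb)           # TRUE for tibbles
inherits(tb, "tbl_df")  # equivalent check in base R
```

Because a tibble still inherits from data.frame, code that expects a data frame generally continues to accept it.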
Source: https://stackoverflow.com/questions/43942328/what-is-the-difference-between-as-tibble-as-data-frame-and-tbl-df