Question
When creating a tibble,
tbl <- tibble(A=1:5, B=6:10)
the result of
class(tbl)
is
[1] "tbl_df" "tbl" "data.frame"
I'm used to seeing this as I use dplyr quite a bit. But when is an object just a "tbl" (and not a "tbl_df") or vice versa? I'd just like to know a bit more about the difference, if any.
Any documentation would be much appreciated!
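Answer 1:
You can think of a "tibble" as an interface. If an object can respond to all the tibble actions, then you can think of it as a tibble. R's S3 classes are not strictly enforced: an object's class is just a character vector of labels, listed from most specific to most general.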
So tbl is the generic tibble, and tbl_df is a specific type of tibble that basically stores its data in a data.frame.
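As a quick check of the class vector from the question, the sketch below tests each label with inherits(). It assumes the tibble package is installed; the variable name is the one from the question.

```r
library(tibble)

tbl <- tibble(A = 1:5, B = 6:10)

# Each label in the class vector can be tested on its own.
is_generic_tbl <- inherits(tbl, "tbl")         # the generic "tbl" interface
is_df_backed   <- inherits(tbl, "tbl_df")      # the data.frame-backed tibble
is_data_frame  <- inherits(tbl, "data.frame")  # still a plain data frame underneath

# is_tibble() is tibble's own helper for the "tbl_df" check.
is_tibble(tbl)
```

A plain data.frame would pass only the last inherits() test, which is one way to tell the two apart in your own code.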
There are other packages, like dtplyr, that allow you to act like a tibble but store your data in a data.table. For example:
library(dtplyr)
# tbl_dt() comes from dtplyr releases before 1.0.0; later releases provide lazy_dt() instead
ds <- tbl_dt(mtcars)
class(ds)
# [1] "tbl_dt" "tbl" "data.table" "data.frame"
There's also the dbplyr package, which allows you to use a SQL database back end. For example:
library(dplyr)
# RSQLite's argument is dbname; ":memory:" requests an in-memory database
con <- DBI::dbConnect(RSQLite::SQLite(), dbname = ":memory:")
copy_to(con, mtcars, "mtcars", temporary = FALSE)
cars_db <- tbl(con, "mtcars")
class(cars_db)
# [1] "tbl_dbi" "tbl_sql" "tbl_lazy" "tbl"
So again we see that this thing generally can act as a tibble, but it has other classes that are there so that it can try to do all its work in the database engine, rather than manipulating the data in R itself.
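To make the "work happens in the database" point concrete, here is a minimal sketch that assumes dplyr, dbplyr, DBI and RSQLite are installed. The names four_cyl and local_result are made up for illustration. show_query() prints the SQL that dbplyr would send, and collect() pulls the result back into R.

```r
library(dplyr)

# In-memory SQLite database holding a copy of mtcars
con <- DBI::dbConnect(RSQLite::SQLite(), dbname = ":memory:")
cars_db <- copy_to(con, mtcars, "mtcars")

# The usual dplyr verbs only build up a query; nothing is computed in R yet.
four_cyl <- cars_db %>%
  filter(cyl == 4) %>%
  select(mpg, cyl)

# Print the SQL that the database engine would run.
show_query(four_cyl)

# collect() runs the query and returns an ordinary in-memory tibble.
local_result <- collect(four_cyl)
class(local_result)

DBI::dbDisconnect(con)
```

Until collect() is called, four_cyl is still a lazy "tbl" backed by the database, which is why it carries the extra classes rather than "tbl_df".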
So there's not really a "difference" between tbl and tbl_df. The latter just says how the tibble is actually implemented, so the behavior can differ (for example, be more optimized).
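The mechanism behind this is S3 method dispatch: R walks the class vector from left to right and uses the first matching method. The sketch below uses base R only. The generic where_stored() and the two bare objects are hypothetical stand-ins, not part of tibble or dplyr; only the class labels are copied from the outputs above.

```r
# Hypothetical generic, defined here purely for illustration
where_stored <- function(x, ...) UseMethod("where_stored")

# Fallback for anything that claims the generic "tbl" interface
where_stored.tbl <- function(x, ...) "generic tbl: backend unknown"

# More specific method for the data.frame-backed flavour
where_stored.tbl_df <- function(x, ...) "tbl_df: data held in a data.frame"

# Bare objects that carry only the class labels (no real data attached)
local_obj  <- structure(list(), class = c("tbl_df", "tbl", "data.frame"))
remote_obj <- structure(list(), class = c("tbl_dbi", "tbl_sql", "tbl_lazy", "tbl"))

where_stored(local_obj)   # "tbl_df" comes first, so the tbl_df method is used
where_stored(remote_obj)  # no tbl_dbi/tbl_sql/tbl_lazy method, so it falls back to tbl
```

This is roughly how a package can give one backend a specialised implementation while everything else shares the generic "tbl" behaviour.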
For more information, you can check out the tibble vignette or the extending tibble vignette.
Source: https://stackoverflow.com/questions/51749664/in-the-tidyverse-what-is-the-difference-between-an-object-of-class-tbl-and-t