In the tidyverse, what is the difference between an object of class “tbl” and “tbl_df”?

再見小時候 2021-01-24 11:30

When creating a tibble,

tbl <- tibble(A=1:5, B=6:10)

the result of

class(tbl)

is

[1] "tb

1 Answer
  •  孤独总比滥情好
    2021-01-24 11:38

    You can think of a "tibble" as an interface. If an object can respond to all the tibble actions, then you can treat it as a tibble. R doesn't have strong typing: an S3 class is just a label that method dispatch looks at.

    So tbl is the generic tibble, and tbl_df is a specific type of tibble that basically stores its data in a data.frame.
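
    As a quick illustration of that relationship, here is a minimal sketch that inspects the class vector of a freshly created tibble. It assumes the tibble package is installed; the variable name tb is just a placeholder chosen for this example.

    ```r
    library(tibble)

    tb <- tibble(A = 1:5, B = 6:10)

    # The class vector is ordered from most specific to least specific
    class(tb)
    # [1] "tbl_df"     "tbl"        "data.frame"

    # inherits() checks for a class anywhere in that vector
    inherits(tb, "tbl_df")      # TRUE
    inherits(tb, "tbl")         # TRUE
    inherits(tb, "data.frame")  # TRUE

    # A plain data.frame carries neither tibble class
    inherits(data.frame(A = 1:5), "tbl")  # FALSE
    ```

    So an in-memory tibble is a tbl_df, a tbl, and a data.frame all at once; the order of the class vector decides which method is tried first.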

    There are other packages, like dtplyr, that let an object act like a tibble while storing its data in a data.table. For example:

    # Note: tbl_dt() comes from older dtplyr releases;
    # dtplyr 1.0.0 and later use lazy_dt() instead.
    library(dtplyr)
    ds <- tbl_dt(mtcars)
    class(ds)
    # [1] "tbl_dt"     "tbl"        "data.table" "data.frame"
    

    There's also the dbplyr package, which lets you use a SQL database as the back end. For example:

    library(dplyr)
    # RSQLite takes the database location as `dbname`, not `path`
    con <- DBI::dbConnect(RSQLite::SQLite(), dbname = ":memory:")
    copy_to(con, mtcars, "mtcars", temporary = FALSE)
    cars_db <- tbl(con, "mtcars")
    class(cars_db)
    # [1] "tbl_dbi"  "tbl_sql"  "tbl_lazy" "tbl"  
    

    So again we see that this object can generally act as a tibble, but it carries other classes so that it can try to do all its work in the database engine, rather than manipulating the data in R itself.
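
    To make the "work happens in the database" point concrete, here is a small sketch, assuming dplyr, dbplyr, DBI and RSQLite are installed. The table name mtcars_demo and the column name avg_mpg are made up for this example. The pipeline is only translated to SQL at first; nothing is pulled into R until collect() is called.

    ```r
    library(dplyr)

    con <- DBI::dbConnect(RSQLite::SQLite(), dbname = ":memory:")
    cars_db <- copy_to(con, mtcars, "mtcars_demo")

    # Building the pipeline does not run it; q is still a lazy table
    q <- cars_db %>%
      filter(cyl == 4) %>%
      summarise(avg_mpg = mean(mpg, na.rm = TRUE))

    # Print the SQL that dbplyr generated for the pipeline
    show_query(q)

    # collect() executes the query and returns an ordinary in-memory tibble
    res <- collect(q)
    class(res)

    DBI::dbDisconnect(con)
    ```

    The exact SQL text printed by show_query() depends on the dbplyr version, so no output is shown here.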

    So there's not really a "difference" between tbl and tbl_df: tbl is the general class, and tbl_df says how the tibble is actually implemented, so its behavior can differ (for example, be more optimized).
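
    The mechanism behind this is ordinary S3 dispatch. The sketch below uses only base R; the generic backend() and its methods are hypothetical, written for this illustration, and are not part of tibble or dplyr. The objects are empty lists with hand-written class vectors that mimic the real ones.

    ```r
    # Hypothetical generic, for illustration only
    backend <- function(x, ...) UseMethod("backend")

    # Fallback for anything that carries the general "tbl" class
    backend.tbl <- function(x, ...) "some tbl (storage unknown)"

    # More specific method for data-frame-backed tibbles
    backend.tbl_df <- function(x, ...) "tbl_df (in-memory data frame)"

    # Bare objects whose class vectors imitate a local and a remote table
    local_tbl  <- structure(list(), class = c("tbl_df", "tbl", "data.frame"))
    remote_tbl <- structure(list(), class = c("tbl_sql", "tbl_lazy", "tbl"))

    backend(local_tbl)   # "tbl_df (in-memory data frame)"
    backend(remote_tbl)  # "some tbl (storage unknown)"
    ```

    Dispatch walks the class vector from left to right, so a specific class such as tbl_df can override behavior while everything else falls back to the shared tbl method.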

    For more information, you can check out the tibble vignette or the extending tibble vignette.
