I am still very new to R and recently came across something I don't understand. Do data.frame and data.table have the same type? Can an object ha
data.table and data.frame are different classes, but they are related through inheritance: data.table inherits from data.frame and extends its capabilities. You can see that after converting cars to the data.table class:
R> typeof(cars)
[1] "list" # same as for a data.frame
R> mode(cars)
[1] "list" # likewise
More information here or just google for "inheritance".
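A quick way to check the inheritance relationship for yourself is sketched below. It assumes the data.table package is installed; CARS is just an illustrative name for the converted copy.

```r
library(data.table)

# Convert the built-in cars data.frame to a data.table (CARS is an illustrative name).
CARS <- as.data.table(cars)

class(cars)                    # "data.frame"
class(CARS)                    # "data.table" "data.frame"

# inherits() checks whether a class appears anywhere in the class vector.
inherits(CARS, "data.frame")   # TRUE
is.data.frame(CARS)            # TRUE

# The underlying storage type is the same for both objects.
typeof(cars) == typeof(CARS)   # TRUE; both are "list"
```

The class vector is ordered, so methods written for "data.table" are tried first and "data.frame" methods act as the fallback.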
It is not clear what you mean by your line "I obviously can't apply functions that apply to data.frames and not data.table".
Many functions work as you would expect, whether applied to a data.frame or to a data.table. In particular, if you read the help page at ?data.table, you will find this specific line in the first paragraph of the description:
Since a data.table is a data.frame, it is compatible with R functions and packages that only accept data.frame.
You can test this out yourself:
library(data.table)
CARS <- data.table(cars)
The following should all give you the same results. They aren't the "data.table" way of doing things, but here are a few examples off the top of my head to show you that many (most?) functions can be used with data.table the same way that you would use them with data.frame (but at that point, you miss out on all the great stuff that data.table has to offer).
with(cars, tapply(dist, speed, FUN = mean))
with(CARS, tapply(dist, speed, FUN = mean))
aggregate(dist ~ speed, cars, as.vector)
aggregate(dist ~ speed, CARS, as.vector)
colSums(cars)
colSums(CARS)
as.matrix(cars)
as.matrix(CARS)
t(cars)
t(CARS)
table(cut(cars$speed, breaks=3), cut(cars$dist, breaks=5))
table(cut(CARS$speed, breaks=3), cut(CARS$dist, breaks=5))
cars[cars$speed == 4, ]
CARS[CARS$speed == 4, ]
However, there are some cases in which this won't work. Compare:
cars[cars$speed == 4, 1]
CARS[CARS$speed == 4, 1]
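To see where the two calls part ways, it can help to inspect what each one returns rather than just printing it. This is a sketch that assumes data.table is installed; what the data.table call returns has changed between package versions, so the comments deliberately avoid stating its exact value. The names from_df and from_dt are only illustrative.

```r
library(data.table)
CARS <- data.table(cars)

from_df <- cars[cars$speed == 4, 1]
from_dt <- CARS[CARS$speed == 4, 1]

# data.frame: a single selected column is dropped to a plain vector.
is.data.frame(from_df)        # FALSE
from_df                       # 4 4

# data.table: j is handled by data.table's own rules, so the result differs.
str(from_dt)
identical(from_df, from_dt)   # FALSE

# Asking for the column by name gives a plain vector from both objects.
identical(cars[cars$speed == 4, "speed"], CARS[speed == 4, speed])   # TRUE
```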
For a better understanding of that, I recommend reading the FAQs. In particular, a couple of relevant points have been summarized at this question: what you can do with data.frame that you can't in data.table.
If your question is, more generally, "Can an object have more than one class?", then you've seen from your own exploration that, yes, it can. For more about that, you can read this page from Hadley's devtools wiki.
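As a small base-R illustration of an object carrying more than one class, here is a sketch using a made-up S3 generic. The names obj, describe, "child" and "parent" are all hypothetical and exist only for this example.

```r
# A hypothetical S3 object with two classes; "child" is tried before "parent".
obj <- structure(list(value = 42), class = c("child", "parent"))

# Hypothetical generic and methods, defined only for this illustration.
describe <- function(x, ...) UseMethod("describe")
describe.parent <- function(x, ...) "parent method"
describe.child <- function(x, ...) paste("child method, then", NextMethod())

class(obj)                # "child" "parent"
inherits(obj, "parent")   # TRUE
describe(obj)             # "child method, then parent method"
```

The order of the class vector matters: dispatch walks it from left to right, and NextMethod() hands control to the next class in line.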
Classes also affect things like how objects are printed and how they interact with other functions.
Consider the rle function. If you look at the class of its result, you get "rle", and if you look at its structure with str, you see that it is a list.
> x <- rev(rep(6:10, 1:5))
> y <- rle(x)
> x
[1] 10 10 10 10 10 9 9 9 9 8 8 8 7 7 6
> y
Run Length Encoding
lengths: int [1:5] 5 4 3 2 1
values : int [1:5] 10 9 8 7 6
> class(y)
[1] "rle"
> str(y)
List of 2
$ lengths: int [1:5] 5 4 3 2 1
$ values : int [1:5] 10 9 8 7 6
- attr(*, "class")= chr "rle"
As the length of each list item is the same, you might expect that you can conveniently use data.frame() to convert it to a data.frame. Let's try:
> data.frame(y)
Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors) :
cannot coerce class ""rle"" to a data.frame
> unclass(y)
$lengths
[1] 5 4 3 2 1
$values
[1] 10 9 8 7 6
> data.frame(unclass(y))
lengths values
1 5 10
2 4 9
3 3 8
4 2 7
5 1 6
Or, let's add another class to the object and try:
> class(y) <- c(class(y), "list")
> y ## Printing is not affected
Run Length Encoding
lengths: int [1:5] 5 4 3 2 1
values : int [1:5] 10 9 8 7 6
> data.frame(y) ## But interaction with other functions is
lengths values
1 5 10
2 4 9
3 3 8
4 2 7
5 1 6
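My understanding of why the extra class helps is ordinary S3 dispatch: data.frame() passes each argument to as.data.frame(), which looks for a method matching each entry of the class vector in turn. There is no "rle" method, but there is a "list" method. A small base-R sketch to check this (the variable names failed and df are only illustrative):

```r
y <- rle(rev(rep(6:10, 1:5)))

# No as.data.frame method is registered for "rle" in base R ...
is.null(getS3method("as.data.frame", "rle", optional = TRUE))   # TRUE
# ... but one exists for "list".
is.function(getS3method("as.data.frame", "list"))               # TRUE

# With only the "rle" class, the default method is used and fails.
failed <- inherits(try(data.frame(y), silent = TRUE), "try-error")
failed                                                           # TRUE

# Appending "list" lets dispatch fall through to as.data.frame.list.
class(y) <- c(class(y), "list")
df <- data.frame(y)
dim(df)                                                          # 5 2
```

Because "rle" still comes first in the class vector, print() keeps using the rle method, which is why printing is unaffected.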