I'm trying to develop a deeper understanding of using the dot (".") with `dplyr` and using the `.data` pronoun with `dplyr`.
Up front, I think the intent of `.data` is a little confusing until one also considers its sibling pronoun, `.env`.
The dot `.` is something that `magrittr::%>%` sets up and uses; since `dplyr` re-exports the pipe, it's there. Whenever you reference it, it is a real object, so `names(.)`, `nrow(.)`, etc. all work as expected. It reflects the data up to this point in the pipeline.
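As a small illustration of the dot being a real object, here is a minimal sketch that assumes only that `magrittr` is installed. The variable names `n_rows` and `first_cols` are made up for the example.

```r
library(magrittr)

# Inside braces, `.` is bound to whatever is on the left-hand side of the pipe.
n_rows     <- mtcars %>% { nrow(.) }
first_cols <- mtcars %>% { names(.)[1:3] }

n_rows      # 32
first_cols  # "mpg" "cyl" "disp"
```

Because `.` is the data frame itself, any ordinary function that accepts a data frame can be applied to it.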
`.data`, on the other hand, is defined within `rlang` for the purpose of disambiguating symbol resolution. Along with `.env`, it allows you to be perfectly clear about where you want a particular symbol resolved (when ambiguity is expected). From `?.data`, I think this is a clarifying contrast:
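```r
library(dplyr)
disp <- 10
mtcars %>% mutate(disp = .data$disp * .env$disp)  # column times the outer variable (10)
mtcars %>% mutate(disp = disp * disp)             # column times itself: the data wins
```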
However, as stated in the help pages, `.data` (and `.env`) is just a "pronoun" (we have verbs, so now we have pronouns too), so it is just a pointer that tells the tidy internals where the symbol should be resolved. It's just a hint of sorts.
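To make the "hint" idea concrete, here is a hedged sketch (assuming `dplyr` is installed) in which a variable deliberately shares its name with a column. The variable `cyl` and the column name `no_such_column` are invented for the example.

```r
library(dplyr)

cyl <- 6  # hypothetical variable that shadows the column name

# Column on the left, calling-environment variable on the right.
big <- mtcars %>% filter(.data$cyl > .env$cyl)
nrow(big)  # 14 (the 8-cylinder cars)

# .data never falls back to the environment: a missing column is an error.
res <- tryCatch(
  mtcars %>% mutate(x = .data$no_such_column),
  error = function(e) "error"
)
res  # "error"
```

Without the pronouns, `filter(cyl > cyl)` would compare the column with itself and keep no rows.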
So your statement that both `.` and `.data` just mean "our result up to this point in the pipeline" is not correct: `.` represents the data up to this point, whereas `.data` is just a declarative hint to the internals.
Consider another way of thinking about `.data`: let's say we have two functions that completely disambiguate the environment a symbol is referenced against:

- `get_internally`: this symbol must always reference a column name; it will not reach out to the enclosing environment if the column does not exist; and
- `get_externally`: this symbol must always reference a variable/object in the enclosing environment; it will never match a column.

In that case, translating the above examples, one might use
```r
disp <- 10
mtcars %>%
  mutate(disp = get_internally(disp) * get_externally(disp))
```
In that case, it seems more obvious that `get_internally` is not a frame, so you can't call `names(get_internally)` and expect it to do something meaningful (other than `NULL`). It'd be like `names(mutate)`.
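For readers who want something runnable, the same two lookup rules can be approximated in base R. This is only a rough analogue: `col_only()` and `env_only()` are hypothetical helpers invented here, they take the name as a string, and they say nothing about how `rlang` actually implements its pronouns.

```r
# Hypothetical helpers, not part of dplyr or rlang.
col_only <- function(data, nm) {
  # Only ever look at columns; never fall back to the environment.
  if (!nm %in% names(data)) stop("no column named '", nm, "'")
  data[[nm]]
}
env_only <- function(nm, env = parent.frame()) {
  # Only ever look at the caller's environment (and its parents).
  get(nm, envir = env, inherits = TRUE)
}

disp <- 10
scaled <- col_only(mtcars, "disp") * env_only("disp")
head(scaled, 3)  # 1600 1600 1080
```

Neither helper is a data frame; each one is only a rule for where a name gets looked up.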
So don't think of `.data` as an object; think of it as a mechanism to disambiguate the environment of the symbol. I think the `$` it uses is both terse and easy to use, and absolutely misleading: it is not a `list`-like or `environment`-like object, even if it is being treated as such.
BTW: one can write an S3 method for `$` that makes any classed object look like a frame/environment:
```r
`$.quux` <- function(x, nm) paste0("hello, ", nm, "!")
obj <- structure(0, class = "quux")
obj$r2evans
# [1] "hello, r2evans!"
names(obj)
# NULL
```
(The presence of a `$` accessor does not always mean the object is a frame/env.)