I want to better understand how environments, closures, and frames are related. I understand function closures contain an environment, environments contain a frame and an enclo
UPDATE R-lang defines an environment
as having a frame. I tend to think about frames as stack frames, not as a mapping from names to values, but then there is of course the data.frame
which maps column names to vectors (and then some...). I think most of the confusion comes from the fact that the original S language (and still S-Plus) did not have environment objects, so all "frames" were essentially what environment objects are now, except that they could only exist as part of the call stack.
For instance, in S-Plus the doc for sys.nframe
says "sys.nframe returns the numerical index of the current frame in the list of all frames." ...that sounds an awful lot like stack frames to me... You can read more about stack frames here: http://en.wikipedia.org/wiki/Call_stack#Structure
I expanded some of the explanations below and use the term "stack frame" consistently (I hope).
END UPDATE
I'd explain them like this:
An environment is an object that maps variable names to values. Each mapping is called a binding. The value can be either a real value or a promise. An environment has a parent environment (except for the empty environment). When you look up a symbol in an environment and it isn't found, the parent environments are also searched.
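As a minimal sketch of bindings and parent lookup, the snippet below builds a two-level chain by hand. The names `outer_env`, `inner_env`, `a`, and `b` are made up for illustration.

```r
# Hypothetical two-level chain: inner_env -> outer_env -> empty environment.
outer_env <- new.env(parent = emptyenv())
inner_env <- new.env(parent = outer_env)

assign("a", 1, envir = outer_env)  # binding a -> 1 lives in the parent
assign("b", 2, envir = inner_env)  # binding b -> 2 lives in the child

# `b` is found directly; `a` is found by walking up to the parent.
found_b <- get("b", envir = inner_env)
found_a <- get("a", envir = inner_env)

# With inherits = FALSE only the environment's own bindings are consulted.
a_is_local <- exists("a", envir = inner_env, inherits = FALSE)

# The chain ends at the empty environment, which has no parent.
chain_end <- parent.env(outer_env)
```

Here `get` follows the parent chain by default, while `inherits = FALSE` restricts the search to a single environment.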
A promise is an unevaluated expression and an environment in which to evaluate the expression. When the promise is evaluated, it is replaced with the generated value, so the expression is evaluated at most once.
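One way to observe a promise is to pass an argument with a side effect and record when it runs. This is only a sketch: `events`, `note`, and `lazy_demo` are hypothetical helpers, not part of R.

```r
# Hypothetical event log kept in its own environment.
events <- new.env()
assign("recorded", character(0), envir = events)
note <- function(msg) {
  assign("recorded", c(get("recorded", envir = events), msg), envir = events)
}

lazy_demo <- function(x) {
  note("body started")
  x  # first use forces the promise, so the argument expression runs here
  x  # second use reuses the value; the expression does not run again
  invisible(NULL)
}

# The argument is an expression with a visible side effect.
lazy_demo({ note("argument evaluated"); 42 })

recorded <- get("recorded", envir = events)
```

The log should show that the body starts before the argument is evaluated, and that the argument is evaluated only once.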
A closure is a function and the environment that the function was defined in. A function like lm
would have the stats namespace environment, and a user-defined function created at top level would have the global environment, but a function f
defined within another function g
would have the local environment for g
as its environment.
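The closure environments described above can be inspected with `environment()`. In this sketch, `top_level_fun` and `make_inner` are hypothetical names.

```r
# A package function: its closure environment is the package namespace.
lm_env <- environment(stats::lm)
lm_env_name <- environmentName(lm_env)

# A function created at top level (hypothetical name).
top_level_fun <- function() NULL
# TRUE when this code is run at the top level of a session:
at_global <- identical(environment(top_level_fun), globalenv())

# A function created inside another function gets that call's local environment.
make_inner <- function() {
  local_env <- environment()  # the local environment of this call
  inner <- function() NULL
  list(inner = inner, local_env = local_env)
}
res <- make_inner()
inner_has_local_env <- identical(environment(res$inner), res$local_env)
```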
A stack frame (or activation record) represents one entry on the call stack. Each stack frame holds the local environment that the function is executed in and the function call's expression (so that sys.call
works).
When a function call is executed, a local environment is created with its parent set to the closure's environment, the arguments are matched against the function's formal arguments and those bindings are added to the local environment (as promises). The unmatched formal arguments are assigned the default values (promises) of the function (if any) and marked as missing. A stack frame is then created with this local environment and the call expression. The stack frame is pushed on the call stack and then the body of the function is evaluated in this local environment.
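The steps above can be observed from inside a function. The following is a rough sketch with a hypothetical function `call_demo`; it only inspects what R has already set up and does not reimplement the call mechanism.

```r
call_demo <- function(a, b = a * 2) {
  bound_names <- ls()           # formals are already bound when the body starts
  b_was_missing <- missing(b)   # TRUE when the default value is used
  local_env <- environment()    # the local environment of this call
  this_call <- sys.call()       # the call expression stored in the stack frame
  list(
    bound_names = bound_names,
    b_was_missing = b_was_missing,
    b_value = b,                # forces the default promise, which can see `a`
    parent_is_closure_env =
      identical(parent.env(local_env), environment(call_demo)),
    call_text = deparse(this_call)
  )
}

out <- call_demo(5)
```

Note that the default for `b` is evaluated in the local environment, which is why it can refer to `a`.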
...so all symbols in the body will be looked up in the local environment (formal arguments and local variables), and if not found there, in the parent environment (which is the closure environment), then in the parent's parent environment, and so on until found.
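A small sketch of that lookup chain, using three nested functions with hypothetical names. Each symbol is found one level further up the chain of enclosing environments.

```r
outer_fun <- function() {
  in_outer <- "outer"
  middle_fun <- function() {
    in_middle <- "middle"
    inner_fun <- function() {
      in_inner <- "inner"
      # in_inner:  local environment of inner_fun
      # in_middle: parent (the closure environment of inner_fun)
      # in_outer:  parent's parent
      c(in_inner, in_middle, in_outer)
    }
    inner_fun()
  }
  middle_fun()
}

lookup_result <- outer_fun()
```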
Note that the parent stack frame's environment is NOT searched in this case. The parent.frame and sys.frame functions get the environments on the call stack, that is, the caller's environment, the caller's caller's environment, etc.
# Here match.fun needs to look in the caller's caller's environment to find what "x" is...
f <- function(FUN) match.fun(FUN)(1:10)
g <- function() { x <- sin; y <- "x"; f(y) }
g() # same as sin(1:10)
# Here we see that the stack frames must also contain the actual call expression
f <- function(...) sys.call()
g <- function(...) f(..., x=42)
g(a=2) # f(..., x = 42)
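To contrast the two kinds of "parent", here is one more sketch. The names `reader`, `caller_fun`, and `hidden_value_demo` are hypothetical, and the sketch assumes no variable called `hidden_value_demo` exists in the global environment or on the search path.

```r
reader <- function() {
  local_env <- environment()   # local environment of this call
  caller_env <- parent.frame() # environment of the calling stack frame
  list(
    # Ordinary lookup follows the closure environment chain, so the
    # caller's local variable is not visible this way.
    found_lexically = exists("hidden_value_demo", envir = local_env),
    # Asking for the caller's environment explicitly does find it.
    found_in_caller = get("hidden_value_demo", envir = caller_env)
  )
}

caller_fun <- function() {
  hidden_value_demo <- "local to caller"
  reader()
}

result <- caller_fun()
```

`reader` was defined outside `caller_fun`, so its enclosing environment is not the caller's local environment. Only the explicit `parent.frame()` route reaches the caller's variable.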