Does R copy unevaluated slots in S4 classes on assignment?

Submitted by 浪尽此生 on 2019-12-07 05:49:45

Question


Suppose I have an S4 class with two slots. I then create a method that sets one of the slots to something and returns the result. Will the other slot also be copied on assignment?

For example,

setClass('foo', representation(first.slot = 'numeric', second.slot = 'numeric'))
setGeneric('setFirstSlot', function(object, value) {standardGeneric('setFirstSlot')})
setMethod('setFirstSlot', signature('foo', 'numeric'), function(object, value) {
  object@first.slot = value
  return(object)
})

f <- new('foo')
f@second.slot <- 2
f <- setFirstSlot(f, 1)

On the last line, will the values of both the first and second slot be copied, or will there be some sort of optimization? I have a class with a field holding a gigabyte of data and a few fields with small numeric vectors. I'd like to have a setter function for the numeric fields that doesn't waste time needlessly copying the data every time it's used.

Thanks :)


Answer 1:


If you are copying large amounts of data in a field, one solution is to use a reference class. Let's compare reference classes with S4 classes.

## Store timing output
m = matrix(0, ncol=4, nrow=6)

Create a reference class definition:

foo_ref = setRefClass("test", fields = list(x = "numeric", y = "numeric"))

Then time data assignment:

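## Reference class: timings go into rows 1 to 3 of m
g = function(x) {x$x[1] = 1; return(x)}
for(i in 6:8){
  f = foo_ref$new(x = 1, y = 1)
  y = runif(10^i)
  t1 = system.time({f$y <- y})[3]    # assign the large vector (elapsed time)
  t2 = system.time({f$y[1] = 1})[3]  # modify one element of the large field
  t3 = system.time({f$x = 1})[3]     # assign the small field
  t4 = system.time({g(f)})[3]        # modify the object inside a function
  m[i-5, ] = c(t1, t2, t3, t4)
}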

We can repeat for a similar S4 structure:

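## S4 class: timings go into rows 4 to 6 of m
g = function(x) {x@y[1] = 1; return(x)}
setClass('foo_s4', representation(x = 'numeric', y = 'numeric'))
for(i in 6:8){
  f = new('foo_s4'); f@x = 1; f@y = 1
  y = runif(10^i)
  t1 = system.time({f@y <- y})[3]     # assign the large vector (elapsed time)
  t2 = system.time({f@y[1] <- 1})[3]  # modify one element of the large slot
  t3 = system.time({f@x = 1})[3]      # assign the small slot
  t4 = system.time({g(f)})[3]         # modify the object inside a function
  m[i-2, ] = c(t1, t2, t3, t4)
}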

Results

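For large data sets, assignment with a reference class is much more efficient than with an S4 class when the object is modified inside a function, because a reference class object is passed by reference rather than by value.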
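
The difference in semantics behind this result can be seen in a small sketch. The class names RefDemo and S4Demo and the helper functions bump_ref and bump_s4 are hypothetical names made up for this illustration; they are not part of the benchmark above.

```r
library(methods)

# Hypothetical demo classes with a single numeric field/slot
RefDemo <- setRefClass("RefDemo", fields = list(x = "numeric"))
setClass("S4Demo", representation(x = "numeric"))

# Both helpers change the first element inside a function
bump_ref <- function(obj) { obj$x[1] <- 99; invisible(obj) }
bump_s4  <- function(obj) { obj@x[1] <- 99; obj }

r <- RefDemo$new(x = c(1, 2, 3))
s <- new("S4Demo", x = c(1, 2, 3))

bump_ref(r)       # modifies r itself: reference semantics
s2 <- bump_s4(s)  # returns a modified copy; s is left untouched

ref_first <- r$x[1]
s4_first  <- s@x[1]
s4_copy_first <- s2@x[1]
```

With the reference class the caller's object changes without being reassigned, whereas the S4 object only changes if the returned value is assigned back.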

Notes

  • Results for R version 3.1
  • For R < 3.1, t3 timings for S4 objects were higher.



Answer 2:


When the class is used by its developer (who knows the design of the class), using the assignment operator @<- directly may be better than a setter method such as setFirstSlot from the question, because the former avoids returning the whole object.

However, setter methods are desirable to prevent users from trying assignments that do not match the definition of the slot in the class. I know that if we use @<- to assign a character to the slot x (which was defined as numeric) an error is returned.

setClass('foo', representation(x = 'numeric', y = 'numeric'))
f <- new('foo')
f@x <- 1 # this is ok
f@y <- 2 # this is ok
f@x <- "a"
#Error in checkAtAssignment("foo", "x", "character") : 
#  assignment of an object of class “character” is not valid for @‘x’ in an object of class “foo”; is(value, "numeric") is not TRUE

But imagine a situation where the slot should contain only one element. This requirement on the length of the slot is not caught by @<-:

# this assignment is allowed
f@x <- c(1, 2, 3, 4)
f@x
#[1] 1 2 3 4

In this situation we would like to define a setter method in order to inform the user about further restrictions in the definition of the slot. But then, we have to return the entire object and this may be an extra burden if the object is big.

As far as I know, there is no way to specify the length of a slot in its definition. A validity function registered with setValidity could check this or other requirements on the slots, but @<- does not seem to call validObject, so the assignment f@x <- c(1, 2, 3, 4) is still allowed even after we call setValidity:

valid.foo <- function(object)
{
  if (length(object@x) > 1)
    stop("slot ", sQuote("x"), " must be of length 1")
}
setValidity("foo", valid.foo)
# no error is detected and the assignment is allowed
f@x <- c(1, 4, 6)
f@x
#[1] 1 4 6
# we need to call "validObject" to check if everything is correct
validObject(f)
#Error in validityMethod(object) : slot ‘x’ must be of length 1
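
The usual convention for a validity function is to return TRUE or a character vector describing the problem, and a setter method can then call validObject explicitly after the assignment. Here is a minimal sketch of that combination, assuming a hypothetical class bar and a hypothetical generic setX (neither appears above):

```r
library(methods)

# Hypothetical class: the validity function returns TRUE or a message
setClass("bar", representation(x = "numeric", y = "numeric"),
         validity = function(object) {
           if (length(object@x) > 1) "slot 'x' must be of length 1" else TRUE
         })

# Hypothetical setter that validates after assigning
setGeneric("setX", function(object, value) standardGeneric("setX"))
setMethod("setX", signature("bar", "numeric"), function(object, value) {
  object@x <- value
  validObject(object)  # signals an error if the validity function fails
  object
})

b <- new("bar", x = 1, y = 2)
b <- setX(b, 5)  # accepted

# A vector of length 3 is rejected; capture the error message
res <- tryCatch(setX(b, c(1, 2, 3)), error = function(e) conditionMessage(e))
```

This setter still returns the whole object, so it covers the validation requirement but not the concern about large objects.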

A possible solution is to modify the object in-place. The method set.x.inplace below is based on this approach.

setGeneric("set.x.inplace", function(object, val){ standardGeneric("set.x.inplace") })
setMethod("set.x.inplace", "foo", function(object, val)
{
  if (length(val) == 1) {
    eval(eval(substitute(expression(object@x <<- val))))
  } else
    stop("slot ", sQuote("x"), " must be of length 1")
  #return(object) # not necessary
})

set.x.inplace(f, 6)
f
#An object of class "foo"
#Slot "x":
#[1] 6
#Slot "y":
#[1] 2

# the assignment is not allowed
set.x.inplace(f, c(1,2,3))
#Error in set.x.inplace(f, c(1, 2, 3)) : slot ‘x’ must be of length 1

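Because this method does not return the object, it can be a good alternative for large objects. Note that it relies on <<- and on substitute, so it only works when the object is passed as a variable name that can be found from the method's enclosing environment (here, the global environment).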
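
If in-place modification is the goal, the reference classes from Answer 1 offer a way to get it without non-standard evaluation. The following is only a sketch under that assumption; the class name fooRef and the method name setX are hypothetical.

```r
library(methods)

# Hypothetical reference class with a validating setter method
fooRef <- setRefClass("fooRef",
  fields = list(x = "numeric", y = "numeric"),
  methods = list(
    setX = function(val) {
      if (length(val) != 1)
        stop("field 'x' must be of length 1")
      x <<- val          # modifies the field of this object in place
      invisible(.self)
    }
  )
)

obj <- fooRef$new(x = 1, y = 2)
obj$setX(6)  # obj is changed without reassignment

# A vector of length 3 is rejected; capture the error message
err <- tryCatch(obj$setX(c(1, 2, 3)), error = function(e) conditionMessage(e))
```

Here the length check lives in the setter method, and the caller never has to assign the result back to obj.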



Source: https://stackoverflow.com/questions/22448198/does-r-copy-unevaluated-slots-in-s4-classes-on-assignment
