Saving a single object within a function in R: RData file size is very large

前端未结


I am trying to save trimmed-down GLM objects in R (i.e. with all the "non-essential" characteristics set to NULL, e.g. residuals, prior.weights, qr$qr).

As an example,


2 answers

無奈伤痛

2021-02-10 01:41
Do you find that you have the same problem when you name the arguments in your call to save?

I used:
```
subFn <- function(y, x) {
  # Fit a logistic regression, then save it with named arguments to save()
  glmObject <- glm(y ~ x, family = "binomial")
  save(list = "glmObject", file = "FileName.RData")
}

mainFn <- function(y, x) {
  subFn(y, x)
}

mainFn(y = rbinom(n = 10, size = 1, prob = 1 / 2), x = 1:10)
```
The file "FileName.RData" was created in my working directory. It is 6.6 KB in size.

I then use:
```
load("FileName.RData")
```
to load its contents, glmObject, into my global environment.
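For what it's worth, here is a minimal sketch of the same round trip that writes to a temporary file and loads into a fresh environment, so nothing in the global workspace gets overwritten. The names `tmpFile`, `loadEnv` and `loadedNames` are only illustrative, and the fixed `y` vector is made up for the example.

```r
subFn <- function(y, x) {
  glmObject <- glm(y ~ x, family = "binomial")
  # Write to a temporary file instead of the working directory
  tmpFile <- tempfile(fileext = ".RData")
  save(list = "glmObject", file = tmpFile)
  tmpFile  # return the path so the caller can inspect the file
}

tmpFile <- subFn(y = c(0, 1, 0, 1, 1, 0, 1, 0, 1, 1), x = 1:10)

# Size of the saved file in bytes
file.size(tmpFile)

# load() returns the names of the objects it restored;
# envir = keeps them out of the global environment
loadEnv <- new.env()
loadedNames <- load(tmpFile, envir = loadEnv)
loadedNames
```

Checking `file.size()` right after saving is a quick way to see whether a particular call pattern produces an unexpectedly large file.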
心在旅途

2021-02-10 01:48
Formulas have an environment attached. If that is the global environment or a package environment, save() stores only a reference to it. Any other environment cannot be reconstructed on load, so it is saved in full, together with everything it contains.

glm results typically contain formulas, so they can contain the environment attached to that formula.
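To see the attached environment directly, you can inspect it with `environment()` and `ls()`. This is just a small sketch; `makeFormula`, `localFormula` and the variable `z` are names I made up for the illustration.

```r
makeFormula <- function() {
  z <- rnorm(10)  # a local variable the formula never mentions
  y ~ x           # the formula remembers the frame it was created in
}

localFormula <- makeFormula()

# The formula's environment is the function's call frame ...
formulaEnv <- environment(localFormula)

# ... and that frame still holds z, even though the function has returned
localNames <- ls(formulaEnv)
localNames
```

The same check works on a fitted model, e.g. `ls(environment(fit$formula))` for a glm object called `fit`.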

You don't need glm to demonstrate this. Just try this:
```
formula1 <- y ~ x
save(formula1, file = "formula1.Rdata")

f <- function() {
   z <- rnorm(1000000)
   formula2 <- y ~ x
   save(formula2, file = "formula2.Rdata")
}
f()
```
When I run the code above, formula1.Rdata ends up at 114 bytes, while formula2.Rdata ends up at 7.7 MB. This is because the latter captures the environment it was created in, and that contains the big vector z.
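If you would rather not write files just to compare sizes, a rough alternative is to serialize the object in memory and count the bytes. This is only a sketch: `serializedSize` is a helper I made up, and the numbers will not match the `.Rdata` file sizes exactly because `save()` compresses by default.

```r
# Hypothetical helper: number of bytes in the uncompressed serialized form
serializedSize <- function(obj) length(serialize(obj, connection = NULL))

withZ <- function() {
  z <- rnorm(1000000)  # large local vector, stays in the formula's environment
  y ~ x
}

withoutZ <- function() {
  z <- rnorm(1000000)
  rm(z)                # removed before the formula is returned
  y ~ x
}

sizeWithZ <- serializedSize(withZ())
sizeWithoutZ <- serializedSize(withoutZ())

# The difference is roughly the 8 bytes per double taken up by z
sizeWithZ - sizeWithoutZ
```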

To avoid this, clean up the environment where you created a formula before saving the formula. Don't delete things that the formula refers to (because glm may need those), but do delete irrelevant things (like z in my example). See:
```
g <- function() {
   z <- rnorm(1000000)
   formula3 <- y ~ x
   rm(z)
   save(formula3, file = "formula3.Rdata")
}
g()
```
This gives formula3.Rdata of 144 bytes.
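The same idea carries over to a glm fitted inside a function. The sketch below is an assumption-laden illustration rather than the original poster's code: `fitInFunction`, `junk`, `cleanup` and `serializedSize` are all made-up names, and the data are fixed toy values.

```r
# Hypothetical helper: bytes in the uncompressed serialized form
serializedSize <- function(obj) length(serialize(obj, connection = NULL))

fitInFunction <- function(y, x, cleanup) {
  junk <- rnorm(1000000)   # large intermediate that the model does not use
  if (cleanup) rm(junk)    # drop it before the fit leaves the function
  glm(y ~ x, family = "binomial")
}

y <- c(0, 1, 0, 1, 1, 0, 1, 0, 1, 1)
x <- 1:10

fitDirty <- fitInFunction(y, x, cleanup = FALSE)
fitClean <- fitInFunction(y, x, cleanup = TRUE)

# junk is still reachable through the formula of the uncleaned fit
junkInDirty <- "junk" %in% ls(environment(fitDirty$formula))
junkInClean <- "junk" %in% ls(environment(fitClean$formula))

sizeDirty <- serializedSize(fitDirty)
sizeClean <- serializedSize(fitClean)
sizeDirty - sizeClean
```

Note that `y` and `x` are left in place, since the formula refers to them.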