Here is some code exploring the extra copying that can result from assigning to a single cell of an array (in this case inside a for loop).
A copy is made because R thinks that there may be another reference to the object:
x <- 1:10
.Internal(inspect(x))
## @5a27838 13 INTSXP g0c4 [NAM(1)] (len=10, tl=0) 1,2,3,4,5,...
# NAM(1) means that there is one reference to the object.
tracemem(x)
## [1] "<0x05a27838>"
.Internal(inspect(x))
## @5a27838 13 INTSXP g0c4 [NAM(1),TR] (len=10, tl=0) 1,2,3,4,5,...
# Still one reference
mean(x)
## [1] 5.5
.Internal(inspect(x))
## @5a27838 13 INTSXP g0c4 [NAM(2),TR] (len=10, tl=0) 1,2,3,4,5,...
# NAM(2) means "more than one" reference.
# A copy of the "pointer" was taken to pass to "mean", which bumped the count.
# The count starts at (essentially) 1 and is set to 2 once a copy of the pointer is made. It is never set back to 1, though.
x[1] <- 0
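## tracemem[0x05a27838 -> 0x05a278c8]:
## tracemem[0x05a278c8 -> 0x05a0d6f0]:
# The first copy is made because of NAM(2).
# The second most likely comes from coercion: x is an integer vector and 0 is a double.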
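The same check can be applied to assignment inside a for loop. The following is a minimal sketch, not the original code: the vector `v` and its length are made up for illustration. `tracemem` only works when R was built with memory profiling, so the sketch tests for that with `capabilities("profmem")`. Whether any copy is reported depends on the R version, so no output is shown.

```r
# Hypothetical example vector (not from the original code).
v <- numeric(5)

# tracemem() needs an R build with memory profiling enabled.
if (capabilities("profmem")) tracemem(v)

for (i in seq_along(v)) {
  v[i] <- i * 2   # tracemem prints a line each time v is duplicated
}

if (capabilities("profmem")) untracemem(v)
v
```

If no tracemem line appears during the loop, the assignments were done in place.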
An assignment doesn't copy the data until a modification is made. Instead, it copies the pointer and marks the object as having more than one reference:
x <- 1
y <- x
.Internal(inspect(x))
## @5a61848 14 REALSXP g0c1 [NAM(2)] (len=1, tl=0) 1
.Internal(inspect(y))
## @5a61848 14 REALSXP g0c1 [NAM(2)] (len=1, tl=0) 1
y[1] <- 1
.Internal(inspect(y))
## @5a61948 14 REALSXP g0c1 [NAM(1)] (len=1, tl=0) 1
# Note the new memory address and NAM(1): y now has its own copy.
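The practical consequence can be checked without looking at the internals. A small sketch, with the names `a` and `b` chosen only for illustration: modifying one name leaves the other unchanged.

```r
a <- c(1, 2, 3)
b <- a            # both names refer to the same vector; no data copied yet
b[1] <- 10        # modifying b forces R to give b its own copy
a                 # still 1 2 3
b                 # 10 2 3
```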