I am reading Hadley's Advanced R Programming and when it discusses the memory size for characters it says this:
R has a global string pool. This mean
The key difference comes from the pointers in vec: each of the short scalar strings (CHARSXPs) has to be pointed to from the corresponding string vector (STRSXP). You have some 1326 such string pointers inside vec, but only 51 in str (a pointer is probably 8 bytes on your platform). The pool holds only the scalar strings (it is the CHARSXP cache), not the pointers to them. Another non-obvious factor is internal fragmentation: on my system, for example, a scalar string takes the same amount of space whether it has zero or 7 characters, grows only once it reaches 8 characters, and so on. See the repeated sizes in the following:
unlist(sapply(str, object.size))
[1] 96 96 96 104 104 104 104 120 120 120 120 120 120 120 120 136 136 136 136
[20] 136 136 136 136 152 152 152 152 152 152 152 152 216 216 216 216 216 216 216
[39] 216 216 216 216 216 216 216 216 216 216 216 216 216
These are, however, implementation details of R's memory manager that could change, and one should not depend on them in any way in user programs: with another object layout or memory manager, str could use more space than vec.
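The pointer counts and the size plateaus can be checked with a short sketch. It assumes vec and str are built as in the book's exercise (the definitions are not shown in this thread, so the two lapply() calls below are a reconstruction), and that pointers are 8 bytes. The exact byte sizes printed at the end will vary with R version and platform.

```r
# Assumed reconstruction of the two lists from the book's exercise
vec <- lapply(0:50, function(i) c("ba", rep("na", i)))
str <- lapply(vec, paste0, collapse = "")

# Each element of a character vector (STRSXP) is one pointer to a CHARSXP
n_ptr_vec <- sum(lengths(vec))  # 1326
n_ptr_str <- sum(lengths(str))  # 51

# Rough extra cost of vec from pointers alone, assuming 8-byte pointers
extra_bytes <- (n_ptr_vec - n_ptr_str) * 8  # 10200

# Size of a scalar string as a function of its character count;
# runs of equal values show the allocator's size classes
sizes <- vapply(
  0:16,
  function(n) as.numeric(object.size(strrep("a", n))),
  numeric(1)
)
names(sizes) <- 0:16
print(sizes)
```

Runs of identical values in sizes are the internal fragmentation described above: several string lengths share one allocation size before the next step up.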