given:
-record(foo, {a, b, c}).
I do something like this:
Thing = #foo{a={1,2}, b={3,4}, c={5,6}},
Thing1 = Thing#foo{a={7,8}}.
The exact workings of the garbage collection and memory allocation is only known to a few. Thankfully, they are very happy to share their knowledge and the following is based on what I have learnt from the erlang-questions mailing list and by discussing with OTP developers.
When messaging between processes, the content is always copied as there is no shared heap between processes. The only exception is binaries bigger than 64 bytes, where only a reference is copied.
When executing code in one process, only parts are updated. Let's analyze tuples, as that is the example you provided.
A tuple is actually a structure that keeps references to the actual data somewhere on the heap (except for small integers and maybe one more data type which I can't remember). When you update a tuple, using for example setelement/3
, a new tuple is created with the given element replaced, however for all other elements only the reference is copied. There is one exception which I have never been able to take advantage of.
The garbage collector keeps track of each tuple and understands when it is safe to reclaim any tuple that is no longer used. It might be that the data referenced by the tuple is still in use, in which case the data itself is not collected.
As always, Erlang gives you some tools to understand exactly what is going on. The efficiency guide details how to use erts_debug:size/1
and erts_debug:flat_size/1
to understand the size of the data structure when used internally in a process and when copied. The trace tools also allows you to understand when, what and how much was garbage collected.