Is malloc deterministic? Say I have a forked process, that is, a replica of another process, and at some point both of them call malloc. Are the two calls guaranteed to behave identically?
The C99 spec (at least, in its final public draft) states in 'J.1 Unspecified behavior':
The following are unspecified: ... The order and contiguity of storage allocated by successive calls to the calloc, malloc, and realloc functions (7.20.3).
So it would seem that malloc doesn't have to be deterministic. It therefore isn't safe to assume that it is.
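As a quick illustration of that wording, here is a small sketch (my own example, not from the standard) that just prints the addresses of two successive allocations; whatever gap or ordering you observe is an implementation detail you cannot portably rely on.

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        char *a = malloc(16);
        char *b = malloc(16);
        if (a == NULL || b == NULL)
            return 1;

        /* The relative order and spacing of these two addresses is
           unspecified; it can differ between libc versions, build
           options, and even individual runs. */
        printf("a = %p\nb = %p\n", (void *)a, (void *)b);

        free(a);
        free(b);
        return 0;
    }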
Yes, it's deterministic to some degree, but that doesn't necessarily mean it will give identical results in two forks of a process.
Just for example, the Single Unix Specification says: "[...] to avoid errors, the child process may only execute async-signal-safe operations until such time as one of the exec functions is called."
For better or worse, malloc is not in the list of "async-signal-safe" functions.
This limitation is in a section that discusses multithreaded programs, but doesn't specify whether the limitation applies only to multithreaded programs, or also applies to single threaded programs.
Conclusion: you can't count on malloc producing identical results in the parent and the child. If the program is multithreaded, you can't count on malloc working at all in the child until it has called exec, and there's room for reasonable question whether it's actually guaranteed to work even in a single-threaded child before the child calls exec.
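To see what "identical results" would even look like, here is a minimal single-threaded sketch (my own example, not from the specification): both parent and child make the same malloc request right after fork and print the pointer. On many systems the two values happen to match, but as argued above, nothing guarantees it.

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            return 1;
        }

        /* Same request in both processes; whether the returned pointers
           match is an implementation detail, not a guarantee. */
        void *p = malloc(64);
        printf("%s: malloc returned %p\n",
               pid == 0 ? "child " : "parent", p);
        free(p);

        if (pid > 0)
            wait(NULL);   /* reap the child */
        return 0;
    }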
It depends on the detailed implementation of malloc. A typical malloc implementation (e.g., dlmalloc) used to be deterministic, simply because the algorithm itself is deterministic.
However, because of security attacks such as heap overflow attacks, malloc implementations (that is, heap managers) have introduced some randomness. (The entropy is relatively small, though, because heap managers must also consider speed and space.) So it is safer not to assume rigorous determinism in a heap manager.
Also, when you fork a process, there are various sources of randomness including ASLR.
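For a rough feel of that run-to-run randomness, you can print a couple of malloc results and run the program several times. This is only an illustrative sketch: the assumption that a large request is satisfied with mmap matches glibc's usual behaviour for requests above its mmap threshold, but other allocators may behave differently.

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        void *small = malloc(32);            /* typically carved from the heap */
        void *large = malloc(1024 * 1024);   /* often satisfied via mmap in glibc */
        if (small == NULL || large == NULL)
            return 1;

        /* With ASLR enabled (the default on most Linux systems), the
           mmap-backed allocation usually lands somewhere different on
           each run; the heap-backed one may vary less, depending on how
           much randomness the allocator adds to the heap base. */
        printf("small: %p\nlarge: %p\n", small, large);

        free(small);
        free(large);
        return 0;
    }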
Technically, if the forked processes both request a block of the same size, they should get the same (virtual) address back, but each of those addresses will end up pointing to a different physical memory location.
Linux uses copy-on-write for fork, so forked children share their parent's memory until something is changed in either process. At that point the kernel copies the affected pages so that the forked child gets its own private copy of that part of its memory space.
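A small sketch of that behaviour (my own example): allocate before fork, write different strings in parent and child, and observe that the same virtual address holds different contents in each process.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        char *buf = malloc(32);
        if (buf == NULL)
            return 1;
        strcpy(buf, "original");

        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            return 1;
        }

        /* The write triggers copy-on-write, so each process modifies
           its own private page behind the same virtual address. */
        strcpy(buf, pid == 0 ? "child" : "parent");

        printf("%s: addr=%p contents=\"%s\"\n",
               pid == 0 ? "child " : "parent", (void *)buf, buf);

        if (pid > 0)
            wait(NULL);
        free(buf);
        return 0;
    }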
You won't get the same physical address. If processes A and B each call malloc, each call returns the address of a free block. The order in which A and B call malloc is not predictable, and the two calls never happen "at the same moment".
There is no reason at all for it to be deterministic; in fact, there can be some benefit to it not being deterministic, for example increasing the complexity of exploiting bugs (see also this paper).
This randomness can be helpful at making exploits harder to write. To successfully exploit a buffer overflow you typically need to do two things: get code (or data) you control into memory at some location, and then redirect execution to jump to that location.
If the memory location is unpredictable, making that jump can become quite a lot harder.
The relevant quote from the standard §7.20.3.3/2:
The malloc function allocates space for an object whose size is specified by size and whose value is indeterminate
If the intention were to make it deterministic, that would be clearly stated.
Even if it looks deterministic today, I wouldn't bet on it remaining so with a newer kernel or a newer libc/GCC version.