realloc but only first few bytes is meaningful

Question

Assume I have used ptr = malloc(old_size); to allocate a memory block of old_size bytes. Only the first header_size bytes are meaningful. I'm going to increase the size to new_size.

new_size is greater than old_size and old_size is greater than header_size.

before:

/- - - - - - - old_size - - - - - - - \
+===============+---------------------+
 \-header_size-/

after:

/- - - - - - - - - - - - - - - new_size - - - - - - - - - - - - - - - - - - -\
+===============+------------------------------------------------------------+
 \-header_size-/

I don't care what is stored after ptr + header_size, because I'll read new data into that region anyway.

method 1: go straight to `new_size`

ptr = realloc(ptr, new_size);

method 2: shrink to `header_size` and grow to `new_size`

ptr = realloc(ptr, header_size);
ptr = realloc(ptr, new_size);

method 3: allocate a new memory block and copy the first `header_size` bytes

void *newptr = malloc(new_size);
memcpy(newptr, ptr, header_size);
free(ptr);
ptr = newptr;

Which is faster?

Answer 1:

It almost certainly depends on the values of old_size, new_size and header_size, and also it depends on the implementation. You'd have to pick some values and measure.
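As a starting point for that measurement, here is a minimal sketch of a timing harness for the three methods. The names grow_direct, grow_shrink_first, grow_copy_header and time_method are hypothetical and not part of the question. To keep the sketch short, each method frees whatever it still owns on failure and returns NULL, which is a simplification you may not want in real code. The timing uses clock(), so it reports CPU time and includes the cost of allocating and filling the old block in every round; only the differences between methods are meaningful.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Common signature so the three methods can be timed by one loop. */
typedef void *(*grow_fn)(void *ptr, size_t header_size, size_t new_size);

/* Method 1: a single realloc straight to new_size. */
void *grow_direct(void *ptr, size_t header_size, size_t new_size)
{
    (void)header_size;                 /* unused by this method */
    void *bigger = realloc(ptr, new_size);
    if (bigger == NULL)
        free(ptr);                     /* old block is still valid on failure */
    return bigger;
}

/* Method 2: shrink to header_size, then grow to new_size. */
void *grow_shrink_first(void *ptr, size_t header_size, size_t new_size)
{
    void *small = realloc(ptr, header_size);
    if (small == NULL) {
        free(ptr);
        return NULL;
    }
    void *bigger = realloc(small, new_size);
    if (bigger == NULL)
        free(small);
    return bigger;
}

/* Method 3: allocate a new block and copy only the header. */
void *grow_copy_header(void *ptr, size_t header_size, size_t new_size)
{
    void *fresh = malloc(new_size);
    if (fresh != NULL)
        memcpy(fresh, ptr, header_size);
    free(ptr);
    return fresh;
}

/* Runs `rounds` allocate/grow/free cycles with one method.
   Returns elapsed CPU seconds, or -1.0 if an allocation failed
   or the header bytes were not preserved. */
double time_method(grow_fn fn, size_t header_size, size_t old_size,
                   size_t new_size, int rounds)
{
    assert(header_size > 0 && header_size < old_size && old_size < new_size);

    clock_t start = clock();
    for (int i = 0; i < rounds; i++) {
        unsigned char *p = malloc(old_size);
        if (p == NULL)
            return -1.0;
        memset(p, 0xAB, old_size);     /* touch the whole old block */

        p = fn(p, header_size, new_size);
        if (p == NULL)
            return -1.0;

        /* Check both ends of the header; this also keeps the calls observable. */
        int header_ok = (p[0] == 0xAB && p[header_size - 1] == 0xAB);
        free(p);
        if (!header_ok)
            return -1.0;
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

For example, calling time_method(grow_direct, 1, 1024 * 1024, 2048 * 1024, 1000) and then the same with the other two functions gives three numbers to compare for the 1MB case discussed below. Results will differ between allocators and between runs.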

1) is probably best in the case where header_size == old_size-1 && old_size == new_size-1, since it gives you the best chance of the single realloc being basically a no-op. (2) should be only very slightly slower in that case (2 almost-no-ops being marginally slower than 1).

3) is probably best in the case where header_size == 1 && old_size == 1024*1024 && new_size == 2048*1024, because the realloc would have to move the allocation, but you avoid copying 1MB of data you don't care about. (2) should be only very slightly slower in that case.

2) is probably best when header_size is much smaller than old_size, and new_size is in a range where it's reasonably likely that the realloc will relocate, but also reasonably likely that it won't. Then you can't predict which of (1) and (3) it is that will be very slightly faster than (2).

In analyzing (2), I have assumed that realloc downwards is approximately free and returns the same pointer. This is not guaranteed. I can think of two things that can mess you up:

- realloc downwards copies to a new allocation.
- realloc downwards splits the buffer to create a new chunk of free memory, but then when you realloc back up again the allocator doesn't merge that new free chunk straight back onto your buffer in order to return without copying.

Either of those could make (2) significantly more expensive than (1). So it's an implementation detail whether or not (2) is a good way of hedging your bets between the advantages of (1) (sometimes avoids copying anything) and the advantages of (3) (sometimes avoids copying too much).
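One way to see how a particular implementation behaves is to probe it. The following is a rough sketch: probe_shrink_regrow and struct probe_result are hypothetical names introduced for illustration. It records whether the address changes on the shrink and on the regrow. Addresses are saved as uintptr_t before each call, because the old pointer value should not be used after a successful realloc. An unchanged address on the regrow suggests nothing was copied; a changed address only shows that the block moved, not how expensive the move was.

```c
#include <stdint.h>
#include <stdlib.h>

/* Outcome of one shrink-then-regrow probe. */
struct probe_result {
    int ok;            /* 0 if the arguments were invalid or an allocation failed */
    int shrink_moved;  /* 1 if realloc down to header_size changed the address */
    int regrow_moved;  /* 1 if realloc back up to new_size changed the address */
};

struct probe_result probe_shrink_regrow(size_t header_size, size_t old_size,
                                        size_t new_size)
{
    struct probe_result r = {0, 0, 0};

    /* realloc to size 0 is not portable, so require a non-empty header. */
    if (header_size == 0 || header_size >= old_size || old_size >= new_size)
        return r;

    void *p = malloc(old_size);
    if (p == NULL)
        return r;
    uintptr_t original = (uintptr_t)p;

    void *small = realloc(p, header_size);
    if (small == NULL) {
        free(p);                       /* p is still valid on failure */
        return r;
    }
    uintptr_t after_shrink = (uintptr_t)small;

    void *big = realloc(small, new_size);
    if (big == NULL) {
        free(small);
        return r;
    }
    uintptr_t after_regrow = (uintptr_t)big;

    r.ok = 1;
    r.shrink_moved = (after_shrink != original);
    r.regrow_moved = (after_regrow != after_shrink);
    free(big);
    return r;
}
```

A single probe on a fresh heap is not representative of a long-running program, so it is worth calling this with several size combinations and with other allocations live in between.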

Btw, this kind of idle speculation about performance is better at tentatively explaining observations you have already made than at tentatively predicting what you would observe in the unlikely event that you actually cared enough about performance to test it.

Furthermore, I suspect that for large allocations, the implementation might be able to do even a relocating realloc without copying anything, by re-mapping the memory to a new address. In that case all three methods would be fast. I haven't looked into whether implementations actually do that, though.

Answer 2:

Neither malloc (for the whole block) nor realloc (for the space beyond the size of the old block when increasing the size) guarantees what the memory you receive will contain. So, if you want those excess bytes set to zero (for example), you'll have to do it yourself with something like:

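// ptr contains the current block.
void *saveptr = ptr;
ptr = realloc (ptr, new_size);
if (ptr == NULL) {
    // realloc failed, so the old block is still valid at saveptr.
    // Do something intelligent, like recovering saveptr and exiting.
}
// Cast to char * because arithmetic on void * is not standard C.
memset ((char *)ptr + header_size, 0, new_size - header_size);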

However, since you've stated that you don't care about the content beyond the header, the fastest is almost certainly a single realloc since that's likely to be optimised under the covers.

Calling it twice for contraction and expansion, or calling malloc-new/memcpy/free-old, is very unlikely to be as efficient. Though, as with all optimisations, you should measure, don't guess!

Keep in mind that realloc doesn't necessarily have to copy your memory at all. If the expansion can be done in place, then an intelligent heap manager will just increase the size of the block without copying anything, such as:

+-----------+   ^        +-----------+ <- At same address,
| Old block |   | Need   | New block |      no copying
|           |   | this   |           |      involved.
+-----------+   | much   |           |
| Free      |   | now.   |           |
|           |   v        +-----------+
|           |            | Free      |
|           |            |           |
+-----------+            +-----------+
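Whether a given heap manager actually expands in place can be observed directly. As a rough sketch, the hypothetical helper count_relocations below grows a single block by repeated doubling and counts how often realloc returns a different address. The name and the doubling strategy are assumptions made for illustration, not something from the question.

```c
#include <stdint.h>
#include <stdlib.h>

/* Grows one block from `start` bytes by doubling while the doubled size
   stays within `limit`, and counts how many of those realloc calls
   returned a different address. Returns -1 on invalid arguments or
   allocation failure. */
int count_relocations(size_t start, size_t limit)
{
    if (start == 0 || limit < start)
        return -1;

    void *p = malloc(start);
    if (p == NULL)
        return -1;

    int moves = 0;
    size_t size = start;
    while (size <= limit / 2) {        /* guarantees size * 2 <= limit */
        size *= 2;

        /* Save the address as an integer before the call; the old
           pointer value should not be used once realloc succeeds. */
        uintptr_t before = (uintptr_t)p;

        void *q = realloc(p, size);
        if (q == NULL) {
            free(p);                   /* p is still valid on failure */
            return -1;
        }
        if ((uintptr_t)q != before)
            moves++;
        p = q;
    }

    free(p);
    return moves;
}
```

For instance, count_relocations(16, 1u << 20) performs 16 doublings, so the result lies between 0 and 16. The exact count depends on the allocator and on what else is on the heap at the time.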

Answer 3:

That probably depends on what the sizes are and if copying is needed.

Method 1 will copy everything contained in the old block if the block has to move, but if you don't do that too often, you won't notice.

Method 2 will only copy what you need to keep, as you discard everything else beforehand.

Method 3 will copy unconditionally, while the others only copy if the memory block cannot be resized where it is.

Personally, I would prefer method 2 if you do this quite often, or method 1 if you do it only rarely. Better still, I would profile to see which of these is actually faster.

Source: https://stackoverflow.com/questions/13247526/realloc-but-only-first-few-bytes-is-meaningful

Tags

memory-management

malloc

realloc

libc