Why does `free` in C not take the number of bytes to be freed?

情歌与酒 2020-12-12 20:11

Just to be clear: I do know that malloc and free are implemented in the C library, which usually allocates chunks of memory from the OS and does it

12 answers
  • 2020-12-12 20:54

    C may not be as "abstract" as C++, but it's still intended to be an abstraction over assembly. To that end, the lowest-level details are taken out of the equation. This prevents you from having to furtle about with alignment and padding, for the most part, which would make all your C programs non-portable.

    In short, this is the entire point of writing an abstraction.

  • 2020-12-12 20:54

    malloc and free go hand in hand: each malloc is matched by exactly one free. It therefore makes sense that the free matching an earlier malloc simply releases whatever that malloc allocated, which is what the vast majority of callers want. Imagine the memory errors if every programmer had to keep track of the amount allocated by each malloc and then remember to pass the same amount to free. The scenario you describe should really use multiple malloc/free calls inside some kind of memory management layer.

  • 2020-12-12 20:55

    Why does free in C not take the number of bytes to be freed?

    Because it doesn't need to. The information is already available in the internal management performed by malloc/free.

    Here are two considerations (that may or may not have contributed to this decision):

    • Why would you expect a function to receive a parameter it doesn't need?

      (This would complicate virtually all client code relying on dynamic memory, and add completely unnecessary redundancy to your application.) Keeping track of pointer allocation is already a difficult problem. Keeping track of memory allocations along with their associated sizes would increase the complexity of client code unnecessarily.

    • What would the altered free function do, in these cases?

      void * p = malloc(20);
      free(p, 25); // (1) wrong size provided by client code
      free(NULL, 10); // (2) generic argument mismatch
      

      Would it not free (cause a memory leak?)? Ignore the second parameter? Stop the application by calling exit? Implementing this would add extra failure points in your application, for a feature you probably don't need (and if you need it, see my last point, below - "implementing solution at application level").

    Rather, I want to know why free was made this way in the first place.

    Because this is the "proper" way to do it. An API should require the arguments it needs to perform its operation, and no more than that.

    It also occurs to me that explicitly giving the number of bytes to free might allow for some performance optimisations, e.g. an allocator that has separate pools for different allocation sizes would be able to determine which pool to free from just by looking at the input arguments, and there would be less space overhead overall.

    The proper ways to implement that are:

    • (at the system level) within the implementation of malloc - there is nothing stopping the library implementer from writing malloc to use various strategies internally, based on received size.

    • (at application level) by wrapping malloc and free within your own APIs, and using those instead (everywhere in your application that you may need).
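The application-level option can be sketched as a thin wrapper pair in which the caller supplies the size on release. The names `app_alloc`, `app_free` and `app_live_bytes` are hypothetical, and the byte counter is only one illustrative use of the size. This is a minimal sketch under those assumptions, not a recommended design.

```c
#include <stddef.h>
#include <stdlib.h>

/* Running total of bytes handed out through the wrappers (illustrative). */
static size_t live_bytes;

/* Allocate size bytes and record them in the running total. */
void *app_alloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL)
        live_bytes += size;
    return p;
}

/* Release p. The caller must pass the same size it gave to app_alloc;
   nothing here can detect a mismatch. */
void app_free(void *p, size_t size)
{
    if (p == NULL)
        return;
    live_bytes -= size;
    free(p);
}

/* Report how many bytes are currently outstanding. */
size_t app_live_bytes(void)
{
    return live_bytes;
}
```

Note that the wrapper inherits exactly the problem raised earlier: a wrong size passed to `app_free` silently corrupts the bookkeeping.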

  • 2020-12-12 20:57

    Actually, in the ancient Unix kernel memory allocator, mfree() took a size argument. malloc() and mfree() kept two arrays (one for core memory, another one for swap) that contained information on free block addresses and sizes.

    There was no userspace allocator until Unix V6 (programs would just use sbrk()). In Unix V6, iolib included an allocator with alloc(size) and a free() call which did not take a size argument. Each memory block was preceded by its size and a pointer to the next block. The pointer was only used on free blocks, when walking the free list, and was reused as block memory on in-use blocks.

    In Unix 32V and in Unix V7, this was replaced by a new malloc() and free() implementation, where free() did not take a size argument. The implementation was a circular list: each chunk was preceded by a word that contained a pointer to the next chunk and a "busy" (allocated) bit. So malloc()/free() didn't even keep track of an explicit size.
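A chunk layout of the kind described for V7 might look roughly like the following. This is a loose sketch based only on the description above, not the original source: the type `chunk`, the `BUSY` mask and the helper names are hypothetical, and it assumes chunk headers are aligned so that the low bit of their address is always zero.

```c
#include <stddef.h>
#include <stdint.h>

#define BUSY ((uintptr_t)1)   /* low bit of the link word marks "allocated" */

/* One header word per chunk: address of the next chunk plus the busy bit. */
typedef struct chunk {
    uintptr_t link;
} chunk;

/* Store the next-chunk pointer and the busy flag in a single word. */
void chunk_set(chunk *c, chunk *next, int busy)
{
    c->link = (uintptr_t)next | (busy ? BUSY : 0);
}

/* Recover the next-chunk pointer by masking off the busy bit. */
chunk *chunk_next(const chunk *c)
{
    return (chunk *)(c->link & ~BUSY);
}

int chunk_is_busy(const chunk *c)
{
    return (c->link & BUSY) != 0;
}

/* No size field exists: the payload size is implied by the distance to the
   next header. Only meaningful when the next chunk lies above this one,
   i.e. not for the last chunk that wraps around to the start of the list. */
size_t chunk_payload_size(const chunk *c)
{
    return (size_t)((const char *)chunk_next(c) - (const char *)(c + 1));
}
```

With this layout, free() needs nothing but the pointer: stepping back one header word from it finds the link, and clearing the busy bit marks the chunk as available.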

  • 2020-12-12 20:57

    Five reasons spring to mind:

    1. It's convenient. It removes a whole load of overhead from the programmer and avoids a class of extremely difficult to track errors.

    2. Passing a size would open up the possibility of releasing only part of a block. But since memory managers usually want to keep tracking information for each block, it isn't clear what that would mean.

    3. Lightness Races In Orbit is spot on about padding and alignment. The nature of memory management means that the actual size allocated is quite possibly different from the size you asked for. This means that, were free to require a size as well as a location, malloc would have to be changed to return the actual size allocated as well.

    4. It's not clear that there is any actual benefit to passing in the size anyway. A typical memory manager has 4-16 bytes of header for each chunk of memory, which includes the size. This chunk header can be common to allocated and unallocated memory, and when adjacent chunks become free they can be merged. If you make the caller store the size, you can save probably 4 bytes per chunk by not having a separate size field in allocated memory, but that saving is probably not a real gain, since the caller needs to store the size somewhere. And now that information is scattered across memory rather than being predictably located in the chunk header, which is likely to be less efficient in operation anyway.

    5. Even if it were more efficient, it's radically unlikely that your program spends a large amount of time freeing memory, so the benefit would be tiny.

    Incidentally, your idea about separate allocators for different size items is easily implemented without this information (you can use the address to determine where the allocation occurred). This is routinely done in C++.
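Determining the pool from the address alone can be sketched as follows. The two fixed pools, their slot sizes (16 and 64 bytes) and the names `pool_of` and `slot_size_of` are all assumptions made for illustration. The range check converts pointers to `uintptr_t` because relational comparison of pointers into different objects is undefined in C; the result of that conversion is implementation-defined, though it behaves as expected on common flat-address platforms.

```c
#include <stddef.h>
#include <stdint.h>

#define SLOTS 4

/* Two fixed pools, one per size class (sizes are illustrative). */
unsigned char small_pool[SLOTS][16];
unsigned char large_pool[SLOTS][64];

enum pool_id { POOL_NONE, POOL_SMALL, POOL_LARGE };

/* True if p points into the len bytes starting at base. */
static int in_range(const void *p, const void *base, size_t len)
{
    uintptr_t a = (uintptr_t)p;
    uintptr_t b = (uintptr_t)base;
    return a >= b && a - b < len;
}

/* Identify the owning pool from the address alone, with no size argument. */
enum pool_id pool_of(const void *p)
{
    if (in_range(p, small_pool, sizeof small_pool))
        return POOL_SMALL;
    if (in_range(p, large_pool, sizeof large_pool))
        return POOL_LARGE;
    return POOL_NONE;
}

/* The slot size follows from the pool, so the caller never has to supply it. */
size_t slot_size_of(const void *p)
{
    switch (pool_of(p)) {
    case POOL_SMALL: return sizeof small_pool[0];
    case POOL_LARGE: return sizeof large_pool[0];
    default:         return 0;
    }
}
```

A free routine built on this would call `pool_of` first and then return the slot to the matching free list.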

    Added later

    Another answer, rather ridiculously, has brought up std::allocator as proof that free could work this way but, in fact, it serves as a good example of why free doesn't work this way. There are two key differences between what malloc/free do and what std::allocator does. Firstly, malloc and free are user facing: they're designed for general programmers to work with, whereas std::allocator is designed to provide specialist memory allocation to the standard library. This provides a nice example of when the first of my points doesn't, or wouldn't, matter. Since it's a library, the difficulties of handling the complexities of tracking size are hidden from the user anyway.

    Secondly, std::allocator always works with items of the same size, which means it can use the originally passed number of elements to determine how much to free. Why this differs from free itself is illustrative. In std::allocator the items to be allocated are always of the same known size and always the same kind of item, so they always have the same alignment requirements. This means that the allocator could be specialised to simply allocate an array of these items at the start and dole them out as needed. You couldn't do this with free because there is no way to guarantee that the best size to return is the size asked for. Instead, it is much more efficient to sometimes return larger blocks than the caller asks for*, and thus either the user or the manager needs to track the exact size actually granted. Passing these kinds of implementation details on to the user is a needless headache that gives no benefit to the caller.

    * If anyone is still having difficulty understanding this point, consider this: a typical memory allocator adds a small amount of tracking information to the start of a memory block and then returns a pointer offset from this. Information stored here typically includes pointers to the next free block, for example. Let's suppose that the header is a mere 4 bytes long (which is actually smaller than in most real libraries) and doesn't include the size. Now imagine we have a 20-byte free block when the user asks for a 16-byte block. A naive system would return the 16-byte block but leave a 4-byte fragment that could never, ever be used, wasting time every time malloc gets called. If instead the manager simply returns the 20-byte block, it stops these messy fragments from building up and can allocate the available memory more cleanly. But if the system is to do this correctly without tracking the size itself, we then require the user to track, for every single allocation, the amount of memory actually allocated if it is to pass it back to free. The same argument applies to padding for types or allocations that don't match the desired boundaries. Thus, at best, requiring free to take a size is either (a) completely useless, since the memory allocator can't rely on the passed size to match the actually allocated size, or (b) pointlessly requires the user to do work tracking the real size that would be easily handled by any sensible memory manager.
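The header technique from the footnote can be illustrated with a small layer on top of malloc. Everything here is an assumption made for the sketch: the names `hdr_alloc`, `hdr_granted` and `hdr_free`, the choice to round requests up to a multiple of 8 bytes, and the use of a C11 `max_align_t` union to keep the payload aligned. Real allocators differ in all of these details.

```c
#include <stddef.h>
#include <stdlib.h>

/* Header stored immediately before the pointer handed to the caller.
   The max_align_t member exists only to keep the payload aligned (C11). */
typedef union header {
    size_t granted;        /* bytes actually reserved for the payload */
    max_align_t align_;
} header;

/* Round a request up to a multiple of 8 (overflow for huge n is ignored). */
static size_t round_up(size_t n)
{
    return (n + 7u) & ~(size_t)7u;
}

/* Allocate at least `requested` bytes and remember what was really granted. */
void *hdr_alloc(size_t requested)
{
    size_t granted = round_up(requested);
    header *h = malloc(sizeof *h + granted);
    if (h == NULL)
        return NULL;
    h->granted = granted;
    return h + 1;          /* caller sees only the payload */
}

/* The granted size is recovered from the header, not from the caller. */
size_t hdr_granted(const void *p)
{
    return ((const header *)p - 1)->granted;
}

/* Free takes just the pointer; stepping back one header finds the block. */
void hdr_free(void *p)
{
    if (p != NULL)
        free((header *)p - 1);
}
```

A caller that asked for 13 bytes and later passed 13 back to a sized free would be reporting a number the allocator never used, which is the mismatch described in (a) above.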

  • 2020-12-12 21:00

    Why should it? malloc() and free() are intentionally very simple memory management primitives, and higher-level memory management in C is largely up to the developer.

    Moreover, realloc() does that already: if you reduce the allocation with realloc(), it will typically not move the data, and the pointer returned will usually be the same as the original, although the standard does not guarantee this.
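Shrinking a block with realloc() might be wrapped like this. The helper name `shrink_to` is hypothetical. The sketch deliberately does not assume that the address stays the same, since the C standard allows realloc() to move the block even when shrinking, and it rejects a new size of zero because that case is not portable.

```c
#include <stddef.h>
#include <stdlib.h>

/* Shrink *buf to new_size bytes. Returns 0 on success, -1 on failure.
   On failure *buf is left untouched and is still owned by the caller. */
int shrink_to(char **buf, size_t new_size)
{
    char *p;

    if (new_size == 0)
        return -1;         /* realloc(ptr, 0) is not portable */

    p = realloc(*buf, new_size);
    if (p == NULL)
        return -1;

    *buf = p;              /* may or may not equal the old address */
    return 0;
}
```

The contents up to the new size are preserved either way, so callers only need to stop using any copies of the old pointer.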

    It is generally true of the entire standard library that it is composed of simple primitives from which you can build more complex functions to suit your application's needs. So the answer to any question of the form "why does the standard library not do X" is that it cannot do everything a programmer might think of (that's what programmers are for), so it chooses to do very little: build your own or use third-party libraries. If you want a more extensive standard library, including more flexible memory management, then C++ may be the answer.

    You tagged the question C++ as well as C, and if C++ is what you are using, then you should hardly be using malloc/free in any case - apart from new/delete, STL container classes manage memory automatically, and in a manner likely to be specifically appropriate to the nature of the various containers.
