问题
Here's a basic example:
#include <all the basic stuff>
int main(void) {
char *name = (char *) malloc(2 * sizeof(char));
if(name == NULL) {
fprintf(stderr, "Error: Unable to allocate enough memory!\n");
return EXIT_FAILURE;
}
strcpy(name, "Bob Smith");
printf("Name: %s\n", name);
free(name);
return EXIT_SUCCESS;
}
Because I'm only allocating 2 bytes of information (2 chars), there should be some sort of error when I perform strcpy, right? This doesn't happen, instead it just copies the string over, prints it out, frees the memory and exits successfully. Why does this happen, and how can I use malloc correctly?
回答1:
Your program invokes undefined behavior.
Undefined behavior is behavior that's out of the specification of the language. By definition, this means that you are not guaranteed to get any kind of well-defined behavior (such as an error). The program is explicitly invalid.
When you use strcpy
, the function simply assumes that the buffer you pass to it is large enough to hold the string you want to copy. If the assumption is wrong, it attempts to write in an area out of the buffer. And if this happens, the program falls into this case of the C specification, in J.2 Undefined behavior:
The behavior is undefined in the following circumstances:
- Addition or subtraction of a pointer into, or just beyond, an array object and an integer type produces a result that does not point into, or just beyond, the same array object
Thus, to use strcpy
correctly, you must manually ensure that the above assumptions about the string's length and the buffer's length hold. To do that, an easy way is to keep the buffer's length saved somewhere, calculate the length of the string you want to copy, and compare them.
For example:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
size_t bufferSize = 2 * sizeof(char);
char *name = malloc(bufferSize);
if(name == NULL) {
fprintf(stderr, "Error: Unable to allocate enough memory!\n");
return EXIT_FAILURE;
}
size_t length = strlen("Bob Smith");
if(length + 1 > bufferSize) {
fprintf(stderr, "Error: The target buffer is too small!\n");
return EXIT_FAILURE;
}
strcpy(name, "Bob Smith");
printf("Name: %s\n", name);
free(name);
return EXIT_SUCCESS;
}
As an unrelated side note, you'll notice that I didn't cast the result of malloc, because a void*
is implicitly convertible to char*
.
As a final note:
This aspect of C may sound impractical when you're trying to ensure the correctness of your code (either because you're learning the language or because you intend to release the software).
This is why there are some tools that offer to give you an error whenever your program does something invalid. Valgrind is one such tool (as was mentioned by Jonathan Leffler in the comments).
回答2:
You get an error report if you compile and run it with AddressSanitizer:
$ gcc -g a.c -Wall -Wextra -fsanitize=address
$ ./a.out
=================================================================
==3362==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60200000eff2 at pc 0x7f9ff2b02dc4 bp 0x7fffe9190650 sp 0x7fffe918fdf8
WRITE of size 10 at 0x60200000eff2 thread T0
#0 0x7f9ff2b02dc3 in __asan_memcpy (/lib64/libasan.so.2+0x8cdc3)
#1 0x4009df in main /home/m/a.c:11
#2 0x7f9ff26d678f in __libc_start_main (/lib64/libc.so.6+0x2078f)
#3 0x400898 in _start (/home/m/a.out+0x400898)
0x60200000eff2 is located 0 bytes to the right of 2-byte region [0x60200000eff0,0x60200000eff2)
allocated by thread T0 here:
#0 0x7f9ff2b0ea0a in malloc (/lib64/libasan.so.2+0x98a0a)
#1 0x400977 in main /home/m/aa.c:6
#2 0x7f9ff26d678f in __libc_start_main (/lib64/libc.so.6+0x2078f)
SUMMARY: AddressSanitizer: heap-buffer-overflow ??:0 __asan_memcpy
回答3:
malloc
will returnnull
if it failed to allocate memory e.g. your system is out of memory. That's unlikely for 2 bytes!- If you copy more bytes than you allocated, you get undefined behaviour. And that undefined behaviour may be that your program behaves as expected!
- As a more general note on the "correct" use of
malloc
that you ask about, I would recommendchar *name = malloc(2 * sizeof(*name));
. It's more concise, it will not hide an error if you forget to includestdlib.h
and it will be easier to change the type ofname
in the future. - Regarding safe use of
strcpy
you shouldn't replace it withstrncpy
as it's unsafe itself if the buffer is not big enough (not null-terminating) and potentially inefficient. Check if your system hasstrcpy_s
orstrlcpy
.
回答4:
There are enough answers, I'll try to get them to a more basic level and give you the following as the root cause:
C does not include any bounds checking whatsoever.
The benefit is, the C runtime is very small and efficient. The backdraw is, you often will not get any error messages for things like you did in your question ... just (maybe long after the error itself) wrong behaviour or even crashes.
回答5:
Why can you write more data to a buffer you malloc()
than the size of the buffer? Outside of the fact that you can't predict the results of undefined behavior, there actually is an explanation as to why it sometimes appears to be totally safe to write more bytes to a malloc()
'd buffer than the number of bytes you asked for.
It's because of the implications of the requirements set in Section 7.20.3 Memory management functions of the C standard:
The order and contiguity of storage allocated by successive calls to the
calloc
,malloc
,andrealloc
functions is unspecified. The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to a pointer to any type of object and then used to access such an object or an array of such objects in the space allocated (until the space is explicitly deallocated).
Note the italicized text: "The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to a pointer to any type of object".
Those alignment restrictions mean that malloc()
and related functions effectively have to hand out memory in aligned chunks, and any successful call to malloc()
will be very likely to actually return memory that's in exact multiples of the alignment restrictions malloc()
is operating under.
On an x86 machine, which IIRC has an 8-byte alignment restriction, a call such as malloc( 11 )
will likely return a pointer to a buffer that's in fact 16 bytes.
That's one reason why overwriting the end of a malloc()
'd buffer sometimes seems to be harmless.
回答6:
Well strcpy(name, "Bob Smith");
will invoke undefined behaviour. name
is not enough to store "Bob Smith"
.
Solution would be -
char a[]="Bob Smith";
char *name = malloc(strlen(a)+1); //you should not cast return of malloc
if(name == NULL) {
fprintf(stderr, "Error: Unable to allocate enough memory!\n");
return EXIT_FAILURE;
}
strncpy(name,a,strlen(a));
回答7:
try:
strncpy(name, "Bob Smith", 2 * sizeof(char));
来源:https://stackoverflow.com/questions/32682603/why-does-calling-malloc-not-make-a-difference