Question
I've been wondering what happens if I initialize a char array with a string literal that is longer than the array can hold. (I understand that when using a string literal as an initializer, I would normally leave out the size and let the compiler count the characters, or use strlen()+1 as the size.)
I have the following code:
#include <stdio.h>
int main() {
char a[3] = "abc"; // char a[2] instead gives the error "initializer-string for array of chars is too long"
printf("%s\n", a);
printf("%p\n", a);
}
I expected it to crash, but it compiles without warnings and prints output. Running it under valgrind, however, produces the following error messages.
==19195== Memcheck, a memory error detector
==19195== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==19195== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==19195== Command: ./a.out
==19195==
==19195== Conditional jump or move depends on uninitialised value(s)
==19195== at 0x4E88CC0: vfprintf (vfprintf.c:1632)
==19195== by 0x4E8F898: printf (printf.c:33)
==19195== by 0x4005CC: main (main.c:5)
==19195==
==19195== Conditional jump or move depends on uninitialised value(s)
==19195== at 0x4EB475D: _IO_file_overflow@@GLIBC_2.2.5 (fileops.c:850)
==19195== by 0x4EB56AF: _IO_default_xsputn (genops.c:455)
==19195== by 0x4EB32C6: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1352)
==19195== by 0x4E8850A: vfprintf (vfprintf.c:1632)
==19195== by 0x4E8F898: printf (printf.c:33)
==19195== by 0x4005CC: main (main.c:5)
==19195==
==19195== Conditional jump or move depends on uninitialised value(s)
==19195== at 0x4EB478A: _IO_file_overflow@@GLIBC_2.2.5 (fileops.c:858)
==19195== by 0x4EB56AF: _IO_default_xsputn (genops.c:455)
==19195== by 0x4EB32C6: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1352)
==19195== by 0x4E8850A: vfprintf (vfprintf.c:1632)
==19195== by 0x4E8F898: printf (printf.c:33)
==19195== by 0x4005CC: main (main.c:5)
==19195==
==19195== Conditional jump or move depends on uninitialised value(s)
==19195== at 0x4EB56B3: _IO_default_xsputn (genops.c:455)
==19195== by 0x4EB32C6: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1352)
==19195== by 0x4E8850A: vfprintf (vfprintf.c:1632)
==19195== by 0x4E8F898: printf (printf.c:33)
==19195== by 0x4005CC: main (main.c:5)
==19195==
==19195== Syscall param write(buf) points to uninitialised byte(s)
==19195== at 0x4F306E0: __write_nocancel (syscall-template.S:84)
==19195== by 0x4EB2BFE: _IO_file_write@@GLIBC_2.2.5 (fileops.c:1263)
==19195== by 0x4EB4408: new_do_write (fileops.c:518)
==19195== by 0x4EB4408: _IO_do_write@@GLIBC_2.2.5 (fileops.c:494)
==19195== by 0x4EB347C: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1331)
==19195== by 0x4E8792C: vfprintf (vfprintf.c:1663)
==19195== by 0x4E8F898: printf (printf.c:33)
==19195== by 0x4005CC: main (main.c:5)
==19195== Address 0x5203043 is 3 bytes inside a block of size 1,024 alloc'd
==19195== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==19195== by 0x4EA71D4: _IO_file_doallocate (filedoalloc.c:127)
==19195== by 0x4EB5593: _IO_doallocbuf (genops.c:398)
==19195== by 0x4EB48F7: _IO_file_overflow@@GLIBC_2.2.5 (fileops.c:820)
==19195== by 0x4EB328C: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1331)
==19195== by 0x4E8850A: vfprintf (vfprintf.c:1632)
==19195== by 0x4E8F898: printf (printf.c:33)
==19195== by 0x4005CC: main (main.c:5)
==19195==
abc?
0xfff0003f0
==19195==
==19195== HEAP SUMMARY:
==19195== in use at exit: 0 bytes in 0 blocks
==19195== total heap usage: 1 allocs, 1 frees, 1,024 bytes allocated
==19195==
==19195== All heap blocks were freed -- no leaks are possible
==19195==
==19195== For counts of detected and suppressed errors, rerun with: -v
==19195== Use --track-origins=yes to see where uninitialised values come from
==19195== ERROR SUMMARY: 10 errors from 5 contexts (suppressed: 0 from 0)
I think the uninitialised value/byte part makes sense, because the array has no room for the terminating character '\0', so when I print it the character after "abc" is a garbage value.
But the last error message looks unfamiliar to me.
Address 0x5203043 is 3 bytes inside a block of size 1,024 alloc'd
I'm aware that the buffer size is defined as 1024. I'm not sure whether this error is reported because of inefficient use of memory.
Also, where do the heap alloc and free come from? Is that from the string literal?
Thanks for any help!!
(The previous subject of this question might be confusingly worded. I changed it. )
A similar question, but in C++
Answer 1:
Here's my interpretation of what's going on:
You're writing to stdout, which is buffered by default. So all data goes into an internal buffer first and is then written ("flushed") to the actual underlying file descriptor.
Your a array is not a valid string, as it lacks a terminating NUL byte. The first few messages come from the printf internals, where it tries to compute the length of the argument string by finding the terminator and to copy the contents into stdout's buffer. As there is no terminator within a, the code goes out of bounds, reading uninitialized memory.
At this point the output buffer would look like:
char *buf = malloc(1024), contents:
a b c ? ? ? ?
^^^^^ ^^^^^^^
The first part (abc) was legitimately copied from a. The next part is random garbage (uninitialized bytes after a, copied into the buffer). This goes on until a NUL byte happens to occur somewhere after a, which is then treated as the end of the string (this is where copying from a stops).
Finally there's the '\n' from the format string, which is also added to the buffer:
char *buf = malloc(1024), contents:
a b c ? ? ? ? \n
^^^^^ ^^^^^^^ ^^
Then (because we encountered a '\n' and stdout is line buffered) we flush the buffer, calling write(STDOUT_FILENO, buf, N), where N is however many bytes are in use in the output buffer (this is at least 4, but the exact number depends on how many garbage bytes were copied before a '\0' was found after a).
Now, the error:
==19195== Syscall param write(buf) points to uninitialised byte(s)
This is saying that there are uninitialized bytes within the buf argument of write (the buffer, i.e. its second parameter).
Apparently valgrind treats parts of the output buffer as uninitialized because the source data was uninitialized. Copying garbage from A to B just means B is also garbage.
==19195== Address 0x5203043 is 3 bytes inside a block of size 1,024 alloc'd
So it's saying that there's a dynamically allocated buffer (of size 1024), and the uninitialised byte(s) from the previous error were found at offset 3. Which makes sense, because offsets 0, 1, 2 contain "abc", which is perfectly valid data. But after that is where the trouble begins.
It's also saying that the block came from malloc, which was called (indirectly) from printf. This is because the output buffer of stdout is created on demand, the first time you write to it. Which is the first printf call in your main.
Answer 2:
assigning string literal “abc” to an array of size 3 causes valgrind error
The initialization does not cause a valgrind error. char a[3] = "abc" is fine. C allows a character array to be initialized without the null character.
Successive bytes of the string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array. C11 §6.7.9 14
printf("%s", ...) expects a pointer to a null-terminated character array. a is not one, as it lacks a null character. The code attempts to read beyond a[], which is undefined behavior; the error comes from that. It is not "inefficient use of memory" but an out-of-bounds access into uninitialized memory.
Instead, use one of the following, which print until a null character is found or 3 characters have been printed.
printf("%.3s\n", a);
// or
printf("%.*s\n", (int) sizeof a, a);
Source: https://stackoverflow.com/questions/42560282/c-try-to-assign-string-literal-abc-to-an-array-of-size-3-valgrind-detects-er