strdup invalid read of size 4 when string literal is ending with newline \\n

夙愿已清 提交于 2019-12-01 10:54:10

It's not about the newline character, nor the printf format specifier. You've found what is arguably a bug in strlen(), and I can tell you must be using gcc.

Your program code is perfectly fine. The printf format specifier could be a little better, but it won't cause the valgrind error you are seeing. Let's look at that valgrind error:

==18929== Invalid read of size 4
==18929==    at 0x804847E: main (in /tmp/test)
==18929==  Address 0x4204050 is 40 bytes inside a block of size 41 alloc'd
==18929==    at 0x402A17C: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==18929==    by 0x8048415: main (in /tmp/test)

"Invalid read of size 4" is the first message we must understand. It means that the processor ran an instruction which would load 4 consecutive bytes from memory. The next line indicates that the address attempted to be read was "Address 0x4204050 is 40 bytes inside a block of size 41 alloc'd."

With this information, we can figure it out. First, if you replace that '\n' with a '$', or any other character, the same error will be produced. Try it.

Secondly, we can see that your string has 40 characters in it. Adding the \0 termination character brings the total bytes used to represent the string to 41.

Because we have the message "Address 0x4204050 is 40 bytes inside a block of size 41 alloc'd," we now know everything about what is going wrong.

  1. strdup() allocated the correct amount of memory, 41 bytes.
  2. strlen() attempted to read 4 bytes, starting at the 40th, which would extend to a non-existent 43rd byte.
  3. valgrind caught the problem

This is a glib() bug. Once upon a time, a project called Tiny C Compiler (TCC) was starting to take off. Coincidentally, glib was completely changed so that the normal string functions, such as strlen() no longer existed. They were replaced with optimized versions which read memory using various methods such as reading four bytes at a time. gcc was changed at the same time to generate calls to the appropriate implementations, depending on the alignment of the input pointer, the hardware compiled for, etc. The TCC project was abandoned when this change to the GNU environment made it so difficult to produce a new C compiler, by taking away the ability to use glib for the standard library.

If you report the bug, glib maintainers probably won't fix it. The reason is that under practical use, this will likely never cause an actual crash. The strlen function is reading bytes 4 at a time because it sees that the addresses are 4-byte aligned. It's always possible to read 4 bytes from a 4-byte-aligned address without segfaulting, given that reading 1 byte from that address would succeed. Therefore, the warning from valgrind doesn't reveal a potential crash, just a mismatch in assumptions about how to program. I consider valgrind technically correct, but I think there is zero chance that glib maintainers will do anything to squelch the warning.

The error message seems to indicate that it's strlen that read past the malloced buffer allocated by strdup. On a 32-bit platform, an optimal strlen implementation could read 4 bytes at a time into a 32-bit register and do some bit-twiddling to see if there's a null byte in there. If near the end of the string, there are less than 4 bytes left, but 4 bytes are still read to perform the null byte check, then I could see this error getting printed. In that case, presumably the strlen implementer would know if it's "safe" to do this on the particular platform, in which case the valgrind error is a false positive.

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!