问题
When writing a project, I ran into a strange issue.
This is the minimal code I managed to write to recreate the issue. I am intentionally storing an actual string in the place of something else, with enough space allocated.
// #include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <stddef.h> // For offsetof()
typedef struct _pack{
// The type of `c` doesn't matter as long as it's inside of a struct.
int64_t c;
} pack;
int main(){
pack *p;
char str[9] = "aaaaaaaa"; // Input
size_t len = offsetof(pack, c) + (strlen(str) + 1);
p = malloc(len);
// Version 1: crash
strcpy((char*)&(p->c), str);
// Version 2: crash
strncpy((char*)&(p->c), str, strlen(str)+1);
// Version 3: works!
memcpy((char*)&(p->c), str, strlen(str)+1);
// puts((char*)&(p->c));
free(p);
return 0;
}
The above code is confusing me:
- With
gcc/clang -O0
, bothstrcpy()
andmemcpy()
works on Linux/WSL, and theputs()
below gives whatever I entered. - With
clang -O0
on OSX, the code crashes withstrcpy()
. - With
gcc/clang -O2
or-O3
on Ubuntu/Fedora/WSL, the code crashes (!!) atstrcpy()
, whilememcpy()
works well. - With
gcc.exe
on Windows, the code works well whatever the optimization level is.
Also I found some other traits of the code:
- (It looks like) the minimum input to reproduce the crash is 9 bytes (including zero terminator), or
1+sizeof(p->c)
. With that length (or longer) a crash is guaranteed (Dear me ...). - Even if I allocate extra space (up to 1MB) in
malloc()
, it doesn't help. The above behaviors don't change at all. strncpy()
behaves exactly the same, even with the correct length supplied to its 3rd argument.- The pointer does not seem to matter. If structure member
char *c
is changed intolong long c
(orint64_t
), the behavior remains the same. (Update: changed already). The crash message doesn't look regular. A lot of extra info is given along.
I tried all these compilers and they made no difference:
- GCC 5.4.0 (Ubuntu/Fedora/OS X/WSL, all are 64-bit)
- GCC 6.3.0 (Ubuntu only)
- GCC 7.2.0 (Android, norepro???) (This is the GCC from C4droid)
- Clang 5.0.0 (Ubuntu/OS X)
- MinGW GCC 6.3.0 (Windows 7/10, both x64)
Additionally, this custom string copy function, which looks exactly like the standard one, works well with any compiler configuration mentioned above:
char* my_strcpy(char *d, const char* s){
char *r = d;
while (*s){
*(d++) = *(s++);
}
*d = '\0';
return r;
}
Questions:
- Why does
strcpy()
fail? How can it? - Why does it fail only if optimization is on?
- Why doesn't
memcpy()
fail regardless of-O
level??
*If you want to discuss about struct member access violation, pleast head over here.
Part of objdump -d
's output of a crashing executable (on WSL):
P.S. Initially I want to write a structure, the last item of which is a pointer to a dynamically allocated space (for a string). When I write the struct to file, I can't write the pointer. I must write the actual string. So I came up with this solution: force store a string in the place of a pointer.
Also please don't complain about gets()
. I don't use it in my project, but the example code above only.
回答1:
I reproduced this issue on my Ubuntu 16.10 and I found something interesting.
When compiled with gcc -O3 -o ./test ./test.c
, the program will crash if the input is longer than 8 bytes.
After some reversing I found that GCC replaced strcpy
with memcpy_chk, see this.
// decompile from IDA
int __cdecl main(int argc, const char **argv, const char **envp)
{
int *v3; // rbx
int v4; // edx
unsigned int v5; // eax
signed __int64 v6; // rbx
char *v7; // rax
void *v8; // r12
const char *v9; // rax
__int64 _0; // [rsp+0h] [rbp+0h]
unsigned __int64 vars408; // [rsp+408h] [rbp+408h]
vars408 = __readfsqword(0x28u);
v3 = (int *)&_0;
gets(&_0, argv, envp);
do
{
v4 = *v3;
++v3;
v5 = ~v4 & (v4 - 16843009) & 0x80808080;
}
while ( !v5 );
if ( !((unsigned __int16)~(_WORD)v4 & (unsigned __int16)(v4 - 257) & 0x8080) )
v5 >>= 16;
if ( !((unsigned __int16)~(_WORD)v4 & (unsigned __int16)(v4 - 257) & 0x8080) )
v3 = (int *)((char *)v3 + 2);
v6 = (char *)v3 - __CFADD__((_BYTE)v5, (_BYTE)v5) - 3 - (char *)&_0; // strlen
v7 = (char *)malloc(v6 + 9);
v8 = v7;
v9 = (const char *)_memcpy_chk(v7 + 8, &_0, v6 + 1, 8LL); // Forth argument is 8!!
puts(v9);
free(v8);
return 0;
}
Your struct pack makes GCC believe that the element c
is exactly 8 bytes long.
And memcpy_chk
will fail if the copying length is larger than the forth argument!
So there are 2 solutions:
Modify your structure
Using compile options
-D_FORTIFY_SOURCE=0
(likesgcc test.c -O3 -D_FORTIFY_SOURCE=0 -o ./test
) to turn off fortify functions.Caution: This will fully disable buffer overflow checking in the whole program!!
回答2:
What you are doing is undefined behavior.
The compiler is allowed to assume that you will never use more than sizeof int64_t
for the variable member int64_t c
. So if you try to write more than sizeof int64_t
(aka sizeof c
) on c
, you will have an out-of-bounds problem in your code. This is the case because sizeof "aaaaaaaa"
> sizeof int64_t
.
The point is, even if you allocate the correct memory size using malloc()
, the compiler is allowed to assume you will never use more than sizeof int64_t
in your strcpy()
or memcpy()
call. Because you send the address of c
(aka int64_t c
).
TL;DR: You are trying to copy 9 bytes to a type consisting of 8 bytes (we suppose that a byte is an octet). (From @Kcvin)
If you want something similar use flexible array members from C99:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct {
size_t size;
char str[];
} string;
int main(void) {
char str[] = "aaaaaaaa";
size_t len_str = strlen(str);
string *p = malloc(sizeof *p + len_str + 1);
if (!p) {
return 1;
}
p->size = len_str;
strcpy(p->str, str);
puts(p->str);
strncpy(p->str, str, len_str + 1);
puts(p->str);
memcpy(p->str, str, len_str + 1);
puts(p->str);
free(p);
}
Note: For standard quote please refer to this answer.
回答3:
No answer has yet talked in detail about why this code may or may not be undefined behaviour.
The standard is underspecified in this area, and there is a proposal active to fix it. Under that proposal, this code would NOT be undefined behaviour, and the compilers generating code that crashes would fail to comply with the updated standard. (I revisit this in my concluding paragraph below).
But note that based on the discussion of -D_FORTIFY_SOURCE=2
in other answers, it seems this behaviour is intentional on the part of the developers involved.
I'll talk based on the following snippet:
char *x = malloc(9);
pack *y = (pack *)x;
char *z = (char *)&y->c;
char *w = (char *)y;
Now, all three of x
z
w
refer to the same memory location, and would have the same value and the same representation. But the compiler treats z
differently to x
. (The compiler also treats w
differently to one of those two, although we don't know which as OP didn't explore that case).
This topic is called pointer provenance. It means the restriction on which object a pointer value may range over. The compiler is taking z
as having a provenance only over y->c
, whereas x
has provenance over the entire 9-byte allocation.
The current C Standard does not specify provenance very well. The rules such as pointer subtraction may only occur between two pointers to the same array object is an example of a provenance rule. Another provenance rule is the one that applies to the code we are discussing, C 6.5.6/8:
When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression. In other words, if the expression
P
points to thei
-th element of an array object, the expressions(P)+N
(equivalently,N+(P)
) and(P)-N
(whereN
has the valuen
) point to, respectively, thei+n
-th andi−n
-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression(P)+1
points one past the last element of the array object, and if the expressionQ
points one past the last element of an array object, the expression(Q)-1
points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary*
operator that is evaluated.
The justification for bounds-checking of strcpy
, memcpy
always comes back to this rule - those functions are defined to behave as if they were a series of character assignments from a base pointer that's incremented to get to the next character, and the increment of a pointer is covered by (P)+1
as discussed in this rule.
Note that the term "the array object" may apply to an object that wasn't declared as an array. This is spelled out in 6.5.6/7:
For the purposes of these operators, a pointer to an object that is not an element of an array behaves the same as a pointer to the first element of an array of length one with the type of the object as its element type.
The big question here is: what is "the array object"? In this code, is it y->c
, *y
, or the actual 9-byte object returned by malloc?
Crucially, the standard sheds no light whatsoever on this matter. Whenever we have objects with subobjects, the standard does not say whether 6.5.6/8 is referring to the object or the subobject.
A further complicating factor is that the standard does not provide a definition for "array", nor for "array object". But to cut a long story short, the object allocated by malloc
is described as "an array" in various places in the standard, so it does seem that the 9-byte object here is a valid candidate for "the array object". (In fact this is the only such candidate for the case of using x
to iterate over the 9-byte allocation, which I think everyone would agree is legal).
Note: this section is very speculative and I attempt to provide an argument as to why the solution chosen by the compilers here is not self-consistent
An argument could be made that &y->c
means the provenance is the int64_t
subobject. But this does immediately lead to difficulty. For example, does y
have the provenance of *y
? If so, (char *)y
should have the the provenance *y
still, but then this contradicts the rule of 6.3.2.3/7 that casting a pointer to another type and back should return the original pointer (as long as alignment is not violated).
Another thing it doesn't cover is overlapping provenance. Can a pointer compare unequal to a pointer of the same value but a smaller provenance (which is a subset of the larger provenance) ?
Further, if we apply that same principle to the case where the subobject is an array:
char arr[2][2];
char *r = (char *)arr;
++r; ++r; ++r; // undefined behavior - exceeds bounds of arr[0]
arr
is defined as meaning &arr[0]
in this context, so if the provenance of &X
is X
, then r
is actually bounded to just the first row of the array -- perhaps a surprising result.
It would be possible to say that char *r = (char *)arr;
leads to UB here, but char *r = (char *)&arr;
does not. In fact I used to promote this view in my posts many years ago. But I no longer do: in my experience of trying to defend this position, it just can't be made self-consistent, there are too many problem scenarios. And even if it could be made self-consistent, the fact remains that the standard doesn't specify it. At best, this view should have the status of a proposal.
To finish up, I would recommend reading N2090: Clarifying Pointer Provenance (Draft Defect Report or Proposal for C2x).
Their proposal is that provenance always applies to an allocation. This renders moot all the intricacies of objects and subobjects. There are no sub-allocations. In this proposal, all of x
z
w
are identical and may be used to range over the whole 9-byte allocation. IMHO the simplicity of this is appealing, compared to what was discussed in my previous section.
回答4:
This is all because of -D_FORTIFY_SOURCE=2
intentionally crashing on what it decides is unsafe.
Some distros build gcc with -D_FORTIFY_SOURCE=2
enabled by default. Some don't. This explains all the differences between different compilers. Probably the ones that don't crash normally will if you build your code with -O3 -D_FORTIFY_SOURCE=2
.
Why does it fail only if optimization is on?
_FORTIFY_SOURCE
requires compiling with optimization (-O
) to keep track of object sizes through pointer casts / assignments. See the slides from this talk for more about _FORTIFY_SOURCE
.
Why does strcpy() fail? How can it?
gcc calls __memcpy_chk
for strcpy
only with -D_FORTIFY_SOURCE=2
. It passes 8
as the size of the target object, because that's what it thinks you mean / what it can figure out from the source code you gave it. Same deal for strncpy
calling __strncpy_chk
.
__memcpy_chk
aborts on purpose. _FORTIFY_SOURCE
may be going beyond things that are UB in C and disallowing things that look potentially dangerous. This gives it license to decide that your code is unsafe. (As others have pointed out, a flexible array member as the last member of your struct, and/or a union with a flexible-array member, is how you should express what you're doing in C.)
gcc even warns that the check will always fail:
In function 'strcpy',
inlined from 'main' at <source>:18:9:
/usr/include/x86_64-linux-gnu/bits/string3.h:110:10: warning: call to __builtin___memcpy_chk will always overflow destination buffer
return __builtin___strcpy_chk (__dest, __src, __bos (__dest));
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
(from gcc7.2 -O3 -Wall on the Godbolt compiler explorer).
Why doesn't
memcpy()
fail regardless of-O
level?
IDK.
gcc fully inlines it just an 8B load/store + a 1B load/store. (Seems like a missed optimization; it should know that malloc didn't modify it on the stack, so it could just store it from immediates again instead of reloading. (Or better keep the 8B value in a register.)
回答5:
why making things complicated? Overcomplexifying like you're doing gives just more space for undefined behaviour, in that part:
memcpy((char*)&p->c, str, strlen(str)+1);
puts((char*)&p->c);
warning: passing argument 1 of 'puts' from incompatible pointer ty pe [-Wincompatible-pointer-types] puts(&p->c);
you're clearly ending up in an unallocated memory area or somewhere writable if you're lucky...
Optimizing or not may change the values of the addresses, and it may work (since the addresses match), or not. You just cannot do what you want to do (basically lying to the compiler)
I would:
- allocate just what's needed for the struct, don't take the length of the string inside into account, it's useless
- don't use
gets
as it's unsafe and obsolescent - use
strdup
instead of the bug-pronememcpy
code you're using since you're handling strings.strdup
won't forget to allocate the nul-terminator, and will set it in the target for you. - don't forget to free the duplicated string
- read the warnings,
put(&p->c)
is undefined behaviour
test.c:19:10: warning: passing argument 1 of 'puts' from incompatible pointer ty pe [-Wincompatible-pointer-types] puts(&p->c);
My proposal
int main(){
pack *p = malloc(sizeof(pack));
char str[1024];
fgets(str,sizeof(str),stdin);
p->c = strdup(str);
puts(p->c);
free(p->c);
free(p);
return 0;
}
回答6:
Your pointer p->c is the cause of crash.
First initialize struct with size of "unsigned long long" plus size of "*p".
Second initialize pointer p->c with the required area size.
Make operation copy: strcpy(p->c, str);
Finally free first free(p->c) and free(p).
I think it was this.
[EDIT]
I'll insist.
The cause of the error is that its structure only reserves space for the pointer but does not allocate the pointer to contain the data that will be copied.
Take a look
int main() { pack *p; char str[1024]; gets(str); size_t len_struc = sizeof(*p) + sizeof(unsigned long long); p = malloc(len_struc); p->c = malloc(strlen(str)); strcpy(p->c, str); // This do not crashes! puts(&p->c); free(p->c); free(p); return 0; }
[EDIT2]
This is not a traditional way to store data but this works:
pack2 *p; char str[9] = "aaaaaaaa"; // Input size_t len = sizeof(pack) + (strlen(str) + 1); p = malloc(len); // Version 1: crash strcpy((char*)p + sizeof(pack), str); free(p);
来源:https://stackoverflow.com/questions/47220212/strcpy-strncpy-crashes-on-structure-member-with-extra-space-when-optimizatio