How can dereferencing a NULL pointer in C not crash a program?

Submitted by 这一生的挚爱 on 2019-12-21 03:37:10

Question


I need help of a real C guru to analyze a crash in my code. Not for fixing the crash; I can easily fix it, but before doing so I'd like to understand how this crash is even possible, as it seems totally impossible to me.

This crash only happens on a customer machine and I cannot reproduce it locally (so I cannot step through the code using a debugger), as I cannot obtain a copy of this user's database. My company also won't allow me to just change a few lines in the code and make a custom build for this customer (so I cannot add some printf lines and have him run the code again), and of course the customer has a build without debug symbols. In other words, my debugging abilities are very limited. Nonetheless I could nail down the crash and get some debugging information. However, when I look at that information and then at the code, I cannot understand how the program flow could ever reach the line in question. The code should have crashed long before getting to that line. I'm totally lost here.

Let's start with the relevant code. It's very little code:

// ... code above skipped, not relevant ...

if (data == NULL) return -1;

information = parseData(data);

if (information == NULL) return -1;

/* Check if name has been correctly \0 terminated */
if (information->kind.name->data[information->kind.name->length] != '\0') {
    freeParsedData(information);
    return -1;
}

/* Copy the name */
realLength = information->kind.name->length + 1;
*result = malloc(realLength);
if (*result == NULL) {
    freeParsedData(information);
    return -1;
}
strlcpy(*result, (char *)information->kind.name->data, realLength);

// ... code below skipped, not relevant ...

That's already it. It crashes in strlcpy. I can even tell you how strlcpy is really called at runtime. strlcpy is actually called with the following parameters:

strlcpy ( 0x341000, 0x0, 0x1 );

Knowing this it is rather obvious why strlcpy crashes. It tries to read one character from a NULL pointer and that will of course crash. And since the last parameter has a value of 1, the original length must have been 0. My code clearly has a bug here, it fails to check for the name data being NULL. I can fix this, no problem.

My question is:
How can this code ever get to the strlcpy in the first place?
Why does this code not crash at the if-statement?

I tried it locally on my machine:

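#include <stdio.h>
#include <stdlib.h>
#include <string.h>   /* declares strlcpy on OS X / BSD */

int main (
    int argc,
    char ** argv
) {
    char * nullString = malloc(10);
    free(nullString);
    nullString = NULL;

    /* Reads through a NULL pointer: crashes here on my machine */
    if (nullString[0] != '\0') {
        printf("Not terminated\n");
        exit(1);
    }
    printf("Can get past the if-clause\n");

    char xxx[10];
    strlcpy(xxx, nullString, 1);
    return 0;
}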

This code never gets past the if statement. It crashes in the if statement and that is definitely expected.

So can anyone think of any reason why the first code can get past that if-statement without crashing if name->data is really NULL? This is totally mysterious to me. It doesn't seem deterministic.

Important extra information:
The code between the two comments is really complete, nothing has been left out. Further the application is single threaded, so there is no other thread that could unexpectedly alter any memory in the background. The platform where this happens is a PPC CPU (a G4, in case that could play any role). And in case someone wonders about "kind.", this is because "information" contains a "union" named "kind" and name is a struct again (kind is a union, every possible union value is a different type of struct); but this all shouldn't really matter here.

I'm grateful for any idea here. I'm even more grateful if it's not just a theory, but if there is a way I can verify that this theory really holds true for the customer.

Solution

I accepted the right answer already, but just in case anyone finds this question on Google, here's what really happened:

The pointers were pointing to memory that had already been freed. Freeing memory won't make it all zero or cause the process to give it back to the system at once. So even though the memory had been erroneously freed, it still contained the correct values. The pointer in question was not NULL at the time the "if check" was performed.

After that check I allocate some new memory, calling malloc. Not sure what exactly malloc does here, but every call to malloc or free can have far-reaching consequences to all dynamic memory of the virtual address space of a process. After the malloc call, the pointer is in fact NULL. Somehow malloc (or some system call malloc uses) zeros the already freed memory where the pointer itself is located (not the data it points to, the pointer itself is in dynamic memory). Zeroing that memory, the pointer now has a value of 0x0, which is equal to NULL on my system and when strlcpy is called, it will of course crash.
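To make that sequence easier to picture, here is a toy model, not the real allocator. It assumes a one-slot "heap" that the program fully owns, so nothing in it touches memory that was really freed. Every name in it (name_rec, toy_alloc, toy_free) is made up for illustration, and the scrubbing on reuse is an assumption standing in for whatever malloc did on the customer machine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record holding the pointer that later reads as NULL. */
struct name_rec {
    const char *data;
    size_t length;
};

static struct name_rec slot;    /* stands in for one heap chunk */
static int slot_in_use = 0;

/* "Allocate" the slot. Like a real allocator may do when it reuses a
   chunk, this overwrites whatever was stored there before. */
static struct name_rec *toy_alloc(void) {
    slot = (struct name_rec){0};    /* data becomes NULL, length 0 */
    slot_in_use = 1;
    return &slot;
}

/* "Free" the slot: only the bookkeeping changes, the bytes stay put. */
static void toy_free(struct name_rec *p) {
    assert(p == &slot && slot_in_use);
    slot_in_use = 0;
}
```

In this model, a stale pointer to the slot still sees the old data and length after toy_free, so a terminator check through it passes. The next toy_alloc hands out the same slot and clears it, and only then does the stale pointer see data == NULL, which matches the order of events described above.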

So the real bug causing this strange behavior was at a completely different location in my code. Never forget: Freed memory keeps its values, but it is beyond your control for how long. To check if your app has a bug where it accesses already freed memory, just make sure the memory is always zeroed before it is freed. In OS X you can do this by setting an environment variable at runtime (no need to recompile anything). Of course this slows down the program quite a bit, but you will catch those bugs much earlier.
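If an environment variable is not an option, the same idea can be approximated in code. The following is only a sketch under assumptions: scrub_and_free is a hypothetical helper, not something from my code base or from any library, and the caller has to pass the block size because free() does not know it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: zero a block, release it, and clear the caller's
   pointer so that a later use fails fast instead of reading stale data. */
static void scrub_and_free(void **pp, size_t size) {
    assert(pp != NULL);
    if (*pp != NULL) {
        /* Note: an optimizing compiler is allowed to drop this store,
           because the memory is freed right afterwards. */
        memset(*pp, 0, size);
        free(*pp);
        *pp = NULL;
    }
}
```

This only clears the one pointer variable that is passed in. Any other copy of the address, like the one that bit me, still dangles, so it complements a memory checker rather than replacing one.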


Answer 1:


It is possible that the structure is located in memory that has been free()'d, or the heap is corrupted. In that case, malloc() could be modifying the memory, thinking that it is free.

You might try running your program under a memory checker. One memory checker that supports Mac OS X is valgrind, although it supports Mac OS X only on Intel, not on PowerPC.




Answer 2:


First, dereferencing a null pointer is undefined behavior. It can crash, not crash, or set your wallpaper to a picture of SpongeBob Squarepants.

That said, dereferencing a null pointer will usually result in a crash. So your problem is probably memory corruption-related, e.g. from writing past the end of one of your strings. This can cause a delayed-effect crash. I'm particularly suspicious because it's highly unlikely that malloc(1) will fail unless your program is butting up against the end of its available virtual memory, and you would probably notice if that were the case.

Edit: OP pointed out that it isn't result that is null but information->kind.name->data. Here's a potential issue then:

There is no check for whether information->kind.name->data is null. The only check on that is

if (information->kind.name->data[information->kind.name->length] != '\0') {

Let's assume that information->kind.name->data is null, but information->kind.name->length is, say, 100. Then this statement is equivalent to:

if (*(information->kind.name->data + 100) != '\0') {

Which does not dereference NULL but rather dereferences address 100. If this does not crash, and address 100 happens to contain 0, then this test will pass.




Answer 3:


The effect of dereferencing the null pointer is undefined by the standard, as far as I know.

According to C Standard 6.5.3.2/4:

If an invalid value has been assigned to the pointer, the behavior of the unary * operator is undefined.

So it may crash, or it may not.




Answer 4:


You may be experiencing stack corruption. The line of code you are referring to may not be executed at all.




Answer 5:


My theory is that information->kind.name->length is a very large value so that information->kind.name->data[information->kind.name->length] is actually referring to a valid memory address.




Answer 6:


The act of dereferencing a NULL pointer is undefined by the standard. It is not guaranteed to crash and often times won't unless you actually try and write to the memory.




Answer 7:


As an FYI, when I see this line:

if (information->kind.name->data[information->kind.name->length] != '\0') {

I see up to three different pointer dereferences:

  1. information
  2. name
  3. data (if it's a pointer and not a fixed array)

You check information for non-null, but not name and not data. What makes you so sure that they're correct?
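A minimal sketch of checking each link in that chain before reading through it could look like this. The struct layout is a guess based on the expression in the question (the real types were not posted), and name_is_terminated is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout, guessed from the expression in the question. */
struct name {
    unsigned char *data;
    size_t length;
};

struct parsed {
    union {
        struct name *name;   /* the real union has more members */
    } kind;
};

/* Returns 1 only if every pointer on the path is non-NULL and the byte
   at data[length] is the terminator; returns 0 otherwise. */
static int name_is_terminated(const struct parsed *information) {
    if (information == NULL) return 0;
    if (information->kind.name == NULL) return 0;
    if (information->kind.name->data == NULL) return 0;
    return information->kind.name->data[information->kind.name->length] == '\0';
}
```

Checks like these only catch NULL. They cannot tell whether a non-NULL pointer still refers to live memory, so they would not have exposed a pointer into a block that was already freed.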

I also echo other sentiments here about something else possibly damaging your heap earlier. If you're running on Windows, consider using gflags to do things like page allocation, which can be used to detect if you or someone else is writing past the end of a buffer and stepping on your heap.

Saw that you're on a Mac - ignore the gflags comment - it might help someone else who reads this. If you're running on something earlier than OS X, there are a number of handy MacsBug tools to stress the heap (like the heap scramble command, 'hs').




Answer 8:


I'm interested in the char* cast in the call to strlcpy.

Could the type data* be different in size than the char* on your system? If char pointers are smaller you could get a subset of the data pointer which could be NULL.

Example:

int a = 0xffff0000;
short b = (short) a; //b could be 0 if lower bits are used

Edit: Spelling mistakes corrected.




Answer 9:


Here's one specific way you can get past the 'data' pointer being NULL in

if (information->kind.name->data[information->kind.name->length] != '\0') {

Say information->kind.name->length is large, at least larger than 4096. On a particular platform with a particular compiler (say, most *nixes with a stock gcc compiler), the code will result in a memory read of "address of kind.name->data + information->kind.name->length".

At a lower level, that read is "read memory at address (0 + 8653)" (or whatever the length was). It's common on *nixes to mark the first page in the address space as "not accessible", meaning dereferencing a NULL pointer that reads memory address 0 to 4096 will result in a hardware trap being propagated to the application and crash it.

Reading past that first page, you might happen to poke into valid mapped memory, e.g. a shared library or something else that happened to be mapped there - and the memory access will not fail. And that's ok. Dereferencing a NULL pointer is undefined behavior, nothing requires it to fail.
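The arithmetic can be sketched on plain integers, without dereferencing anything. The helper name hits_first_page is made up, and the page size is an assumption the caller passes in (4096 is common, but it is platform specific).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the byte at base + index lies inside the first page,
   which many systems leave unmapped so that NULL accesses trap.
   Works on integer addresses only; no memory is read. */
static int hits_first_page(uintptr_t base, size_t index, size_t page_size) {
    assert(page_size > 0);
    assert(index <= UINTPTR_MAX - base);   /* no wraparound in this sketch */
    return base + (uintptr_t)index < (uintptr_t)page_size;
}
```

With base 0 and a 4096-byte page, an index of 100 still lands in the protected page, while an index of 8653 does not. Whether the second read then succeeds depends entirely on what happens to be mapped at that address.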




Answer 10:


Missing '{' after last if statement means that something in the "// ... code above skipped, not relevant ..." section is controlling access to that entire fragment of code. Out of all the code pasted only the strlcpy is executed. Solution: never use if statements without curly brackets to clarify control.

Consider this...

if(false)
{
    if(something == stuff)
    {
        doStuff();

    .. snip ..

    if(monkey == blah)
        some->garbage= nothing;
        return -1;
    }
}
crash();

Only "crash();" gets executed.




Answer 11:


I would run your program under valgrind. You already know there's a problem with NULL pointers, so profile that code.

The advantage that valgrind brings here is that it checks every single pointer reference to see whether that memory location has been previously allocated, and it will tell you the line number, structure, and anything else you care to know about memory.

As every one else mentioned, referencing the 0 memory location is a "que sera, sera" kinda thing.

My C tinged spidey sense is telling me that you should break out those structure walks on the

if (information->kind.name->data[information->kind.name->length] != '\0') {

line like

    if (information == NULL) {
      return -1; 
    }
    if (information->kind == NULL) {
      return -1; 
    }

and so on.




Answer 12:


Wow, that's strange. One thing does look slightly suspicious to me, though it may not contribute:

What would happen if information and data were good pointers (non-null), but information.kind.name was null? You don't dereference this pointer until the strlcpy line, so if it was null, it might not crash until then. Of course, earlier than that you do dereference data[1] to set it to \0, which should also crash, but due to whatever fluke, your program may just happen to have write access to 0x01 but not 0x00.

Also, I see you use information->name.length in one place but information->kind.name.length in another; not sure if that's a typo or if that's intended.




Answer 13:


Despite the fact that dereferencing a null pointer leads to undefined behaviour and not necessarily to a crash, you should check the value of information->kind.name->data and not the contents of information->kind.name->data[1].




Answer 14:


char * p = NULL;

p[i] is like

p += i;

which is a valid operation, even on a null pointer. It then points at memory location 0x0000[...]i




Answer 15:


You should always check whether information->kind.name->data is null anyway, but in this case

in

if (*result == NULL) 
    freeParsedData(information);
    return -1;
}

you have missed a {

it should be

if (*result == NULL)
{ 
     freeParsedData(information);
     return -1;
}

This is a good reason for this coding style, instead of

if (*result == NULL) { 
    freeParsedData(information);
    return -1;
}

where you might not spot the missing brace because you are used to the shape of the code block without the brace separating it from the if clause.




Answer 16:


*result = malloc(realLength); // ???

Address of newly allocated memory segment is stored at the location referenced by the address contained in the variable "result".

Is this the intent? If so, the strlcpy may need modification.




Answer 17:


As per my understanding, the special case of this problem is an invalid access resulting from an attempt to read or write using a NULL pointer. Here the detection of the problem is very much hardware dependent. On some platforms, accessing memory for read or write using a NULL pointer will result in an exception.



Source: https://stackoverflow.com/questions/1334929/how-can-dereferencing-a-null-pointer-in-c-not-crash-a-program
