“life-time” of string literal in C

瘦欲@ submitted on 2019-12-16 22:12:10

Question


Wouldn't the pointer returned by the following function be inaccessible?

char *foo( int rc ) 
{
    switch (rc) 
    {
      case 1:           return("one");
      case 2:           return("two");
      default:           return("whatever");
    }
}

So the lifetime of a local variable in C/C++ is practically only within the function, right? Which means, after char* foo(int) terminates, the pointer it returns no longer means anything?

I'm a bit confused about lifetime of local var. Could anyone give me a good clarification?


Answer 1:


Yes, the lifetime of a local variable is limited to the scope ({ ... }) in which it is created.
Local variables have automatic (local) storage duration.
It is called automatic because they are automatically destroyed once the scope in which they were created ends.

However, what you have here is a string literal, which is stored in implementation-defined memory that is typically read-only. String literals are different from local variables: they remain alive throughout the lifetime of the program. They have static storage duration [Ref 1].
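As a minimal sketch of the point above, here is the question's foo with a const-qualified return type. The helper print_codes is hypothetical and only illustrates that the pointers are still usable after foo has returned.

```c
#include <stdio.h>

/* Same function as in the question, but const-qualified. */
const char *foo(int rc)
{
    switch (rc) {
    case 1:  return "one";
    case 2:  return "two";
    default: return "whatever";
    }
}

/* Hypothetical helper: stores the pointers after foo() has returned
   and only then reads through them. This is well defined because the
   arrays behind the literals have static storage duration. */
void print_codes(void)
{
    const char *a = foo(1);
    const char *b = foo(42);
    printf("%s %s\n", a, b);
}
```

No local variable is involved here: foo returns the address of an array that exists for the whole run of the program.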

A word of caution!
However, note that any attempt to modify the contents of a string literal is undefined behavior. User programs are not allowed to modify the contents of a string literal.
Hence, it is always advisable to use const when declaring a pointer to a string literal.

const char*p = "string"; 

instead of,

char*p = "string";    

In fact, in C++ initializing a non-const char* from a string literal is deprecated (and ill-formed since C++11), though not in C. Declaring the pointer with const has the advantage that compilers will usually warn you if you attempt to modify the string literal through it, which they do not do for the non-const declaration.

Sample program:

#include<string.h> 
int main() 
{ 
    char *str1 = "string Literal"; 
    const char *str2 = "string Literal"; 
    char source[]="Sample string"; 

    strcpy(str1,source);    //No warning or error, just undefined behavior 
    strcpy(str2,source);    //Compiler issues a warning 

    return 0; 
} 

Output:

cc1: warnings being treated as errors
prog.c: In function ‘main’:
prog.c:9: error: passing argument 1 of ‘strcpy’ discards qualifiers from pointer target type

Notice the compiler warns for the second case but not for the first.
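If the text really has to be modified, one option is to work on an array initialized from the literal instead of on the literal itself. The sketch below assumes a hypothetical helper replace_text and an arbitrary buffer size of 32. It is an illustration, not part of the original sample program.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: overwrites a writable copy of the literal with
   source and stores the result in out (at most n bytes, n > 0). */
void replace_text(char *out, size_t n, const char *source)
{
    /* An array initialized from the literal: the program's own storage. */
    char str[32] = "string Literal";

    /* Writing to str is well defined; snprintf bounds the write to the
       size of the destination and always terminates it. */
    snprintf(str, sizeof str, "%s", source);
    snprintf(out, n, "%s", str);
}
```

Unlike str1 in the sample program, str here is an automatic array, so it must not be returned from the function. That is why the result is copied into a caller-supplied buffer.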


EDIT: To answer the question asked by a couple of users here:

What is the deal with integer literals?
In other words, is this code valid:

int *foo()
{
    return &(2);
} 

The answer is no: this code is not valid. It is ill-formed and will produce a compiler error,
something like:

prog.c:3: error: lvalue required as unary ‘&’ operand

String literals are lvalues, i.e. you can take the address of a string literal but cannot change its contents.
However, all other literals (int, float, char, etc.) are rvalues (the C standard uses the term "the value of an expression" for these), and their address cannot be taken at all.
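A small sketch of what taking the address of a string literal looks like in C. The function names literal_array_size and first_char are hypothetical and exist only for illustration.

```c
#include <stddef.h>

/* In C, "abc" has type char[4] (three characters plus the terminating
   zero), so &"abc" is a pointer to an array of 4 char. */
size_t literal_array_size(void)
{
    char (*p)[4] = &"abc";
    return sizeof *p;        /* size of the whole array */
}

/* Reading through the pointer is fine; writing through it would be
   undefined behavior. */
char first_char(void)
{
    char (*p)[4] = &"abc";
    return (*p)[0];
}
```

The same & applied to an integer literal, as in &(2), is rejected by the compiler because 2 is not an lvalue.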


[Ref 1] C99 standard 6.4.5/5 "String Literals - Semantics":

In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. For character string literals, the array elements have type char, and are initialized with the individual bytes of the multibyte character sequence; for wide string literals, the array elements have type wchar_t, and are initialized with the sequence of wide characters...

It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.
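One practical consequence of the last paragraph is that comparing the addresses of two identical literals is not reliable. The sketch below uses two hypothetical helpers, same_text and same_address, to contrast the two kinds of comparison.

```c
#include <stdbool.h>
#include <string.h>

/* Compares by content. This is well defined regardless of how the
   implementation lays out string literals. */
bool same_text(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}

/* Compares addresses. For two identical literals the result is
   unspecified: the implementation may merge them into one array or
   keep them as separate arrays. */
bool same_address(const char *a, const char *b)
{
    return a == b;
}
```

same_text("two", "two") is always true, while same_address("two", "two") may be true or false depending on the compiler and its options.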




Answer 2:


It's valid. String literals have static storage duration, so the pointer is not dangling.

For C, that is mandated in section 6.4.5, paragraph 6:

In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence.

And for C++ in section 2.14.5, paragraphs 8-11:

8 Ordinary string literals and UTF-8 string literals are also referred to as narrow string literals. A narrow string literal has type “array of n const char”, where n is the size of the string as defined below, and has static storage duration (3.7).

9 A string literal that begins with u, such as u"asdf", is a char16_t string literal. A char16_t string literal has type “array of n const char16_t”, where n is the size of the string as defined below; it has static storage duration and is initialized with the given characters. A single c-char may produce more than one char16_t character in the form of surrogate pairs.

10 A string literal that begins with U, such as U"asdf", is a char32_t string literal. A char32_t string literal has type “array of n const char32_t”, where n is the size of the string as defined below; it has static storage duration and is initialized with the given characters.

11 A string literal that begins with L, such as L"asdf", is a wide string literal. A wide string literal has type “array of n const wchar_t”, where n is the size of the string as defined below; it has static storage duration and is initialized with the given characters.




Answer 3:


String literals are valid for the whole program (and are not allocated on the stack), so the returned pointer will be valid.

Also, string literals are read-only, so (for good style) you should probably change foo to const char *foo(int).




Answer 4:


Yes, it is valid code, case 1 below. You can safely return C strings from a function in at least these ways:

  • const char* to a string literal. Can't be modified, must not be freed by caller. Rarely useful for the purpose of returning a default value, because of the freeing problem described below. Might make sense if you actually need to pass a function pointer somewhere, so you need a function returning a string.

  • char* or const char* to static char buffer. Must not be freed by caller. Can be modified (either by caller if not const, or by the function returning it), but a function returning this can't (easily) have multiple buffers, so not (easily) thread safe, and caller may need to copy the returned value before calling the function again.

  • char* to a buffer allocated with malloc. Can be modified, but must usually be explicitly freed by caller, and has the heap allocation overhead. strdup is of this type.

  • const char* or char* to a buffer, which was passed as an argument to the function (returned pointer does not need to point to the first element of argument buffer). Leaves responsibility of buffer/memory management to caller. Many standard string functions are of this type.
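The four options in the list above can be sketched as follows. All function names and the "item %d" format are hypothetical and chosen only for illustration. The buffer size of 32 is likewise an assumption.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* 1. Pointer to a string literal: read-only, never freed. */
const char *name_literal(void)
{
    return "one";
}

/* 2. Pointer to a static buffer: overwritten by the next call,
      not thread safe, never freed. */
const char *name_static(int n)
{
    static char buf[32];
    snprintf(buf, sizeof buf, "item %d", n);
    return buf;
}

/* 3. Pointer to malloc'ed memory: the caller must free() it.
      Returns NULL if the allocation fails. */
char *name_heap(int n)
{
    char *p = malloc(32);
    if (p != NULL)
        snprintf(p, 32, "item %d", n);
    return p;
}

/* 4. Pointer into a caller-supplied buffer (size must be > 0):
      the caller owns and manages the storage. */
char *name_into(char *buf, size_t size, int n)
{
    snprintf(buf, size, "item %d", n);
    return buf;
}
```

Each variant gives the caller a different obligation, and nothing in the returned pointer itself says which one applies.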

One problem is, mixing these in one function can get complicated. Caller needs to know how it should handle the returned pointer, how long it is valid, and if caller should free it, and there's no (nice) way of determining that at runtime. So you can't for example have a function, which sometimes returns a pointer to a heap-allocated buffer which caller needs to free, and sometimes a pointer to a default value from string literal, which caller must not free.




Answer 5:


Good question. In general, you would be right, but your example is the exception. The compiler statically allocates global memory for a string literal. Therefore, the address returned by your function is valid.

That this is so is a rather convenient feature of C, isn't it? It allows a function to return a precomposed message without forcing the programmer to worry about the memory in which the message is stored.

See also @asaelr's correct observation re const.




Answer 6:


Local variables are only valid within the scope in which they're declared; however, you don't declare any local variables in that function.

It's perfectly valid to return a pointer to a string literal from a function, as a string literal exists throughout the entire execution of the program, just as a static or a global variable would.

If you're worried that what you're doing might be invalid or undefined, you should turn up your compiler warnings to see whether there is in fact anything wrong.




Answer 7:


The returned pointer will never be a dangling pointer, because it points to a static address where the string literal resides. That memory will usually be read-only and global to the program once it is loaded. If you try to free or modify it, you will typically get a segmentation fault on platforms with memory protection.




Answer 8:


A local variable is allocated on the stack. After the function finishes, the variable goes out of scope and is no longer accessible in the code. However, if you have a global (or simply - not yet out of scope) pointer that you assigned to point to that variable, it will point to the place in the stack where that variable was. It could be a value used by another function, or a meaningless value.
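To contrast this with the question's code, here is a sketch of the case that really does dangle next to one that does not. The function names are hypothetical, and the broken variant is kept inside a comment so that the file compiles without diagnostics.

```c
/* WRONG, shown only as a comment:
 *
 *   const char *dangling(void)
 *   {
 *       char buf[] = "one";   // automatic array holding a copy of the literal
 *       return buf;           // buf's lifetime ends when the function returns
 *   }
 */

/* OK: p itself is a local variable and goes away on return, but the
   value returned is the address of the literal's array, which has
   static storage duration. */
const char *not_dangling(void)
{
    const char *p = "one";
    return p;
}
```

The difference is what the returned address refers to: an automatic array in the first case, the literal's own static array in the second.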




Answer 9:


In the example you show, you are actually returning the pointers to whatever function calls foo, so they do not become local pointers. Moreover, the memory for the strings that are returned is allocated in the global segment.

Thanking You,

Viharri P L V.



Source: https://stackoverflow.com/questions/9970295/life-time-of-string-literal-in-c
