Why does calling std::string.c_str() on a function that returns a string not work?

后端 未结 3 1862
说谎
说谎 2020-11-30 09:01

I have the following code:

std::string getString() {
    std::string str(\"hello\");
    return str;
}

int main() {
    const char* cStr = getString().c_str         


        
相关标签:
3条回答
  • 2020-11-30 09:25

    getString() would return a copy of str (getString() returns by value);

    It's right.

    thus, the copy of str would stay "alive" in main() until main() returns.

    No, the returned copy is a temporary std::string, which will be destroyed at the end of the statement in which it was created, i.e. before std::cout << cStr << std::endl;. Then cStr becomes dangled, dereference on it leads to UB, anything is possible.

    You can copy the returned temporary to a named variable, or bind it to a const lvalue-reference or rvalue-reference (the lifetime of the temporary will be extended until the reference goes out of scope). Such as:

    std::string s1 = getString();    // s1 will be copy initialized from the temporary
    const char* cStr1 = s1.c_str();
    std::cout << cStr1 << std::endl; // safe
    
    const std::string& s2 = getString(); // lifetime of temporary will be extended when bound to a const lvalue-reference
    const char* cStr2 = s2.c_str();
    std::cout << cStr2 << std::endl; // safe
    
    std::string&& s3 = getString();  // similar with above
    const char* cStr3 = s3.c_str();
    std::cout << cStr3 << std::endl; // safe
    

    Here is an explanation from [The.C++.Programming.Language.Special.Edition] 10.4.10 Temporary Objects [class.temp]]:

    Unless bound to a reference or used to initialize a named object, a temporary object is destroyed at the end of the full expression in which it was created. A full expression is an expression that is not a subexpression of some other expression.

    The standard string class has a member function c_str() that returns a C-style, zero-terminated array of characters (§3.5.1, §20.4.1). Also, the operator + is defined to mean string concatenation. These are very useful facilities for strings . However, in combination they can cause obscure problems. For example:

    void f(string& s1, string& s2, string& s3)
    {
    
        const char* cs = (s1 + s2).c_str();
        cout << cs ;
        if (strlen(cs=(s2+s3).c_str())<8 && cs[0]==´a´) {
            // cs used here
        }
    
    }
    

    Probably, your first reaction is "but don’t do that," and I agree. However, such code does get written, so it is worth knowing how it is interpreted.

    A temporary object of class string is created to hold s1 + s2 . Next, a pointer to a C-style string is extracted from that object. Then – at the end of the expression – the temporary object is deleted. Now, where was the C-style string allocated? Probably as part of the temporary object holding s1 + s2 , and that storage is not guaranteed to exist after that temporary is destroyed. Consequently, cs points to deallocated storage. The output operation cout << cs might work as expected, but that would be sheer luck. A compiler can detect and warn against many variants of this problem.

    0 讨论(0)
  • 2020-11-30 09:40

    Problem here is that you are returning a temporary variable and over that temporary variable you are doing c_str function.

    "c_str() function Returns a pointer to an array that contains a null-terminated sequence of characters (i.e., a C-string) representing the current value of the string object( [http://www.cplusplus.com/reference/string/string/c_str/][1]).

    In this case your pointer is pointing to memory location which is now not present.

    std::string getString() {
            std::string str("hello");
            return str; // Will create Temporary object as it's return by value}
    
        int main() {
             const char* cStr = getString().c_str(); // Temporary object is destroyed
            std::cout << cStr << std::endl; // this prints garbage }
    

    Solution is to copy your temporary object to memory location properly(by creating local copy) and then use c_str over that object.

    0 讨论(0)
  • 2020-11-30 09:46

    As mentioned by others you are using a pointer to temporary after it has already been deleted - this is a classic example of heap after free use.

    What I can add to others' answers is that you can easily detect such usage with gcc's or clang's address sanitizers.

    Example:

    #include <string>
    #include <iostream>
    
    std::string get()
    {
      return "hello";
    }
    
    int main()
    {
      const char* c = get().c_str();
      std::cout << c << std::endl;
    }
    

    sanitizer output:

    =================================================================
    ==2951==ERROR: AddressSanitizer: heap-use-after-free on address 0x60300000eff8 at pc 0x7f78e27869bb bp 0x7fffc483e670 sp 0x7fffc483de20
    READ of size 6 at 0x60300000eff8 thread T0
        #0 0x7f78e27869ba in strlen (/usr/lib64/libasan.so.2+0x6d9ba)
        #1 0x39b4892ba0 in std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*) (/usr/lib64/libstdc++.so.6+0x39b4892ba0)
        #2 0x400dd8 in main /tmp/tmep_string/main.cpp:12
        #3 0x39aa41ed5c in __libc_start_main (/lib64/libc.so.6+0x39aa41ed5c)
        #4 0x400c48  (/tmp/tmep_string/a.out+0x400c48)
    
    0x60300000eff8 is located 24 bytes inside of 30-byte region [0x60300000efe0,0x60300000effe)
    freed by thread T0 here:
        #0 0x7f78e27ae6ea in operator delete(void*) (/usr/lib64/libasan.so.2+0x956ea)
        #1 0x39b489d4c8 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string() (/usr/lib64/libstdc++.so.6+0x39b489d4c8)
        #2 0x39aa41ed5c in __libc_start_main (/lib64/libc.so.6+0x39aa41ed5c)
    
    previously allocated by thread T0 here:
        #0 0x7f78e27ae1aa in operator new(unsigned long) (/usr/lib64/libasan.so.2+0x951aa)
        #1 0x39b489c3c8 in std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (/usr/lib64/libstdc++.so.6+0x39b489c3c8)
        #2 0x400c1f  (/tmp/tmep_string/a.out+0x400c1f)
    
    SUMMARY: AddressSanitizer: heap-use-after-free ??:0 strlen
    Shadow bytes around the buggy address:
      0x0c067fff9da0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff9db0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff9dc0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff9dd0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff9de0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    =>0x0c067fff9df0: fa fa fa fa fa fa fa fa fa fa fa fa fd fd fd[fd]
      0x0c067fff9e00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff9e10: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff9e20: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff9e30: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff9e40: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    Shadow byte legend (one shadow byte represents 8 application bytes):
      Addressable:           00
      Partially addressable: 01 02 03 04 05 06 07 
      Heap left redzone:       fa
      Heap right redzone:      fb
      Freed heap region:       fd
      Stack left redzone:      f1
      Stack mid redzone:       f2
      Stack right redzone:     f3
      Stack partial redzone:   f4
      Stack after return:      f5
      Stack use after scope:   f8
      Global redzone:          f9
      Global init order:       f6
      Poisoned by user:        f7
      Container overflow:      fc
      Array cookie:            ac
      Intra object redzone:    bb
      ASan internal:           fe
    ==2951==ABORTING
    
    0 讨论(0)
提交回复
热议问题