What is “namespace cleanliness”, and how does glibc achieve it?

说谎 2021-02-09 00:40

I came across this paragraph from this answer by @zwol recently:

The __libc_ prefix on read is because there are actually three

3 Answers
  •  眼角桃花
    2021-02-09 01:39

    OK, first some basics about the C language as specified by the standard. In order that you can write C applications without concern that some of the identifiers you use might clash with external identifiers used in the implementation of the standard library or with macros, declarations, etc. used internally in the standard headers, the language standard splits up possible identifiers into namespaces reserved for the implementation and namespaces reserved for the application. The relevant text is:

    7.1.3 Reserved identifiers

    Each header declares or defines all identifiers listed in its associated subclause, and optionally declares or defines identifiers listed in its associated future library directions subclause and identifiers which are always reserved either for any use or for use as file scope identifiers.

    • All identifiers that begin with an underscore and either an uppercase letter or another underscore are always reserved for any use.
    • All identifiers that begin with an underscore are always reserved for use as identifiers with file scope in both the ordinary and tag name spaces.
    • Each macro name in any of the following subclauses (including the future library directions) is reserved for use as specified if any of its associated headers is included; unless explicitly stated otherwise (see 7.1.4).
    • All identifiers with external linkage in any of the following subclauses (including the future library directions) and errno are always reserved for use as identifiers with external linkage.
    • Each identifier with file scope listed in any of the following subclauses (including the future library directions) is reserved for use as a macro name and as an identifier with file scope in the same name space if any of its associated headers is included.

    No other identifiers are reserved. If the program declares or defines an identifier in a context in which it is reserved (other than as allowed by 7.1.4), or defines a reserved identifier as a macro name, the behavior is undefined.

    Emphasis here is mine. As examples, the identifier read is reserved for the application in all contexts ("no other..."), but the identifier __read is reserved for the implementation in all contexts (bullet point 1).
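
    The two underscore rules (bullet points 1 and 2) are mechanical enough to sketch in code. The following is only an illustration: the names classify_identifier and enum id_class are hypothetical, and the function deliberately ignores the other bullets (library names such as printf or errno), so "not reserved by the underscore rules" is not the same as "not reserved".

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical classification covering ONLY bullets 1 and 2 of 7.1.3. */
enum id_class {
    ID_NOT_UNDERSCORE_RESERVED, /* e.g. "read": untouched by bullets 1-2 */
    ID_RESERVED_FILE_SCOPE,     /* e.g. "_foo": bullet 2 */
    ID_RESERVED_ANY_USE         /* e.g. "__read", "_Exit": bullet 1 */
};

enum id_class classify_identifier(const char *id)
{
    /* No leading underscore: neither underscore rule applies. */
    if (id == NULL || id[0] != '_')
        return ID_NOT_UNDERSCORE_RESERVED;
    /* Underscore followed by an uppercase letter or another underscore. */
    if (id[1] == '_' || isupper((unsigned char)id[1]))
        return ID_RESERVED_ANY_USE;
    /* Any other leading underscore: reserved at file scope only. */
    return ID_RESERVED_FILE_SCOPE;
}
```

    Under this sketch, read falls in the first class and __read in the last, which matches the two examples above.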

    Now, POSIX defines a lot of interfaces that are not part of the standard C language, and libc implementations might have a good deal more not covered by any standards. That's okay so far, assuming the tooling (linker) handles it correctly. If the application doesn't include <unistd.h> (which is outside the scope of the language standard), it can safely use the identifier read for any purpose it wants, and nothing breaks even though libc contains an identifier named read.

    The problem is that a libc for a unix-like system is also going to want to use the function read to implement parts of the base C language's standard library, like fgetc (and all the other stdio functions built on top of it). This is a problem, because now you can have a strictly conforming C program such as:

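    #include <stdio.h>   /* getchar */
    #include <stdlib.h>  /* abort */
    /* read is not reserved by ISO C, so the application may define it. */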
    void read()
    {
        abort();
    }
    int main()
    {
        getchar();
        return 0;
    }
    

    and, if libc's stdio implementation is calling read as its backend, it will end up calling the application's function (not to mention, with the wrong signature, which could break/crash for other reasons), producing the wrong behavior for a simple, strictly conforming program.

    The solution here is for libc to have an internal function named __read (or any other name in the reserved namespace you like) that can be called to implement stdio, and to have the public read function call that. Alternatively, read can be a weak alias for __read, which is a more efficient and more flexible mechanism to achieve the same thing with traditional unix linker semantics. Note that there are some namespace issues more complex than read that can't be solved without weak aliases.
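
    A minimal sketch of that pattern follows. It is not glibc's actual source: the names __mylib_read, mylib_read and mylib_getchar are hypothetical stand-ins chosen so the example does not collide with the real libc, the "read" is faked, and the weak/alias attributes assume GCC or Clang targeting ELF (for example Linux).

```c
#include <stddef.h>

/* Internal implementation under a reserved-namespace name.
   Only the implementation may define such a name; an ordinary
   application must not. This stand-in pretends n bytes were read. */
long __mylib_read(int fd, void *buf, size_t n)
{
    (void)fd;
    (void)buf;
    return (long)n;
}

/* The public name is only a weak alias for the internal symbol
   (GCC/Clang extension, assumed here), so both names refer to
   the same function. */
long mylib_read(int fd, void *buf, size_t n)
    __attribute__((weak, alias("__mylib_read")));

/* Library-internal code calls the reserved name, never the public one,
   so it does not depend on what the public name resolves to. */
int mylib_getchar(void)
{
    unsigned char c = 'x'; /* placeholder byte, since nothing is really read */
    return __mylib_read(0, &c, 1) == 1 ? c : -1;
}
```

    In this sketch mylib_getchar never refers to mylib_read, which is the property the answer is after: the stdio-style caller depends only on a name the application is not allowed to define.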
