Slashes and dots in function names and prototypes?

迷失自我 2020-12-30 03:48

I'm new to C, and while looking at Go's source tree I found this:

https://code.google.com/p/go/source/browse/src/pkg/runtime/race.c

void runtime∕race·Read         


        
4 answers
  • 2020-12-30 04:04

    The Go compiler and runtime are compiled using the C compilers originally developed for Plan 9. When you build Go from source, it first builds the Plan 9 compilers, then uses those to build Go.

    The Plan 9 compilers support Unicode function names [1], and the Go developers use Unicode characters in their function names as pseudo-namespaces.

    [1] It looks like this might actually be standards compliant (see "g++ unicode variable name"), but gcc doesn't support Unicode function/variable names.

  • 2020-12-30 04:09

    IMPORTANT UPDATE:

    The ultimate answer is certainly the one you got from Russ Cox, one of Go's authors, on the golang-nuts mailing list. That said, I'm leaving some of my earlier notes below, as they might help in understanding some things.

    Also, from reading the answer linked above, I believe the "pseudo-slash" may now be translated to a regular slash (/) too, just as the middot is translated to a dot, in versions of the Go C compiler newer than the one I tested below. I don't have time to verify this.


    The file is compiled by the Go Language Suite's internal C compiler, which originates from the Plan 9 C compiler(1)(2) and differs from the C standard in some ways (mostly extensions, AFAIK).

    One of the extensions is that it allows UTF-8 characters in identifiers.

    Now, in the Go Language Suite's C compiler, the middot character (·) is treated specially: it is translated to a regular dot (.) in object files, which the Go Language Suite's internal linker interprets as the namespace separator.

    Example

    For the following file example.c (note: it must be saved as UTF-8 without BOM):

    void ·Bar1() {}
    void foo·bar2() {}
    void foo∕baz·bar3() {}
    

    the internal C compiler produces the following symbols:

    $ go tool 8c example.c
    $ go tool nm example.8
     T "".Bar1
     T foo.bar2
     T foo∕baz.bar3
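
The mapping visible in the nm output above can be mimicked with a few lines of Go. This is only a sketch: objectSymbol is a hypothetical helper written for illustration, not code from the Go toolchain, and it assumes the behaviour of the compiler version tested here (middot becomes a dot, the division slash is left alone, and a name with no package prefix gets the "" placeholder).

```go
package main

import (
	"fmt"
	"strings"
)

// objectSymbol is a hypothetical helper, not part of the Go toolchain.
// It mimics the mapping seen in the nm output: the middle dot (U+00B7)
// becomes an ASCII dot, and a name with no package prefix is given the
// "" placeholder. The division slash (U+2215) is left untouched, which
// matches the compiler version tested in this answer.
func objectSymbol(cName string) string {
	s := strings.ReplaceAll(cName, "·", ".")
	if strings.HasPrefix(s, ".") {
		s = `""` + s
	}
	return s
}

func main() {
	for _, name := range []string{"·Bar1", "foo·bar2", "foo∕baz·bar3"} {
		fmt.Printf("%s -> %s\n", name, objectSymbol(name))
	}
}
```

A newer compiler that also rewrites the pseudo-slash would need one more replacement step; that case is deliberately left out because it was not verified.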
    

    Note that I've given ·Bar1() a capital B. That makes it visible to regular Go code, because it is translated to the exact same symbol as would result from compiling the following Go code:

    package example
    func Bar1() {}  // nm will show:  T "".Bar1
    

    Now, regarding the functions you named in the question, the story goes further down the rabbit hole. I'm a bit less sure that I'm right here, but I'll try to explain based on what I know. Thus, each sentence below this point should be read as if it had "AFAIK" appended.

    The next missing piece of this puzzle is the strange "" namespace and how the Go suite's linker handles it. The "" namespace is what we might call an "empty" namespace (because "" means "an empty string" to a programmer) or, maybe better, a "placeholder" namespace. When the linker sees an import like this:

    import examp "path/to/package/example"
    //...
    func main() {
        examp.Bar1()
    }
    

    then it takes the $GOPATH/pkg/.../example.a library file and, during the import phase, substitutes on the fly each "" with path/to/package/example. So now, in the linked program, we will see a symbol like this:

     T path/to/package/example.Bar1
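
As a rough illustration of that substitution, here is a minimal Go sketch. expandPlaceholder is a hypothetical name, and the real linker operates on object files rather than plain strings, so treat this as a model of the idea only.

```go
package main

import (
	"fmt"
	"strings"
)

// expandPlaceholder is a hypothetical helper that models the substitution
// described above: a leading "" placeholder is replaced by the import path
// of the package being linked. Symbols that already carry a package prefix
// are returned unchanged.
func expandPlaceholder(symbol, importPath string) string {
	if strings.HasPrefix(symbol, `"".`) {
		return importPath + symbol[len(`""`):]
	}
	return symbol
}

func main() {
	fmt.Println(expandPlaceholder(`"".Bar1`, "path/to/package/example"))
	fmt.Println(expandPlaceholder("foo.bar2", "path/to/package/example"))
}
```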
    
  • 2020-12-30 04:15

    The "·" character is \xB7 (U+00B7, MIDDLE DOT) according to my JavaScript console. The "∕" character is \u2215 (U+2215, DIVISION SLASH).

    The dot falls within Annex D of the C99 standard, which lists the special characters that are valid in identifiers in C source. The slash doesn't seem to, so I suspect it's used as something else (perhaps namespacing) via a #define or preprocessor magic.
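
The code points can also be checked from Go itself. The snippet below is a small sketch; describe is a hypothetical helper that wraps the standard fmt verb %U.

```go
package main

import "fmt"

// describe formats a rune as its Unicode code point in U+XXXX notation.
func describe(r rune) string {
	return fmt.Sprintf("%U", r)
}

func main() {
	// Ranging over a string yields runes (code points), not bytes.
	for _, r := range "·∕" {
		fmt.Printf("%q is %s\n", r, describe(r))
	}
}
```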

    That would explain why the dot is present in the actual function definition, but the slash is not.

    Edit: See this answer for some additional information. It's possible that the Unicode slash is just allowed by GCC's implementation.

  • 2020-12-30 04:15

    It appears this is not standard C, not even C99. In particular, both gcc and clang complain about the dot, even in C99 mode.

    This source code is compiled by the Plan 9 compiler suite (in particular, ./pkg/tool/darwin_amd64/6c on OS X), which is bootstrapped by the Go build system. According to this document (bottom of page 8), Plan 9 and its compiler do not use ASCII at all, but use Unicode instead. At the bottom of page 9, it is stated that any character with a sufficiently high code point is considered valid in an identifier name.

    There's no preprocessing magic at all: the function definitions do not match the function declarations simply because they are different functions. For example, void runtime∕race·Initialize(); is an external function whose definition appears in ./src/pkg/runtime/race/race.go; likewise for void runtime∕race·MapShadow(…).

    The function that appears later, void runtime·raceinit(void), is a completely different function, which is apparent from the fact that it actually calls runtime∕race·Initialize();.
