Slashes and dots in function names and prototypes?

Submitted by 天涯浪子 on 2019-12-18 12:15:02

Question


I'm new to C and looking at Go's source tree I found this:

https://code.google.com/p/go/source/browse/src/pkg/runtime/race.c

void runtime∕race·Read(int32 goid, void *addr, void *pc);
void runtime∕race·Write(int32 goid, void *addr, void *pc);

void
runtime·raceinit(void)
{
    // ...
}

What do the slashes and dots (·) mean? Is this valid C?


Answer 1:


IMPORTANT UPDATE:

The definitive answer is the one you got from Russ Cox, one of the Go authors, on the golang-nuts mailing list. That said, I'm leaving some of my earlier notes below; they may help in understanding some of the details.

Also, judging from that answer, I believe the "pseudo-slash" may now be translated to a regular slash (/) as well, just as the middot is translated to a dot, in versions of the Go C compiler newer than the one I tested below. I have not had time to verify this.


The file is compiled by the Go Language Suite's internal C compiler, which is derived from the Plan 9 C compiler and differs from the C standard in some ways (mostly extensions, AFAIK).

One of the extensions is that it allows UTF-8 characters in identifiers.

In the Go Language Suite's C compiler, the middot character (·) is treated specially: it is translated to a regular dot (.) in object files, and the Go Language Suite's internal linker interprets that dot as a namespace separator.

Example

For the following file example.c (note: it must be saved as UTF-8 without BOM):

void ·Bar1() {}
void foo·bar2() {}
void foo∕baz·bar3() {}

the internal C compiler produces the following symbols:

$ go tool 8c example.c
$ go tool nm example.8
 T "".Bar1
 T foo.bar2
 T foo∕baz.bar3

Note that I've given ·Bar1() a capital B. That makes it visible to regular Go code, because it is translated to exactly the same symbol that results from compiling the following Go code:

package example
func Bar1() {}  // nm will show:  T "".Bar1
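
The renaming visible in the nm output above can be mimicked with a few lines of Go. This is only a sketch of the observed behaviour, not the compiler's actual code: objectSymbol is a hypothetical helper, and it assumes (as in the tested compiler version) that the middot becomes a dot, that a leading middot gets the "" placeholder in front of it, and that the division slash is left untouched.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	middot   = "\u00B7" // the "·" between package and function name
	divSlash = "\u2215" // the "∕" used inside package paths
)

// objectSymbol is a hypothetical helper, not part of the Go toolchain.
// It reproduces the mapping seen in the nm output: every middot becomes
// a plain dot, and a name that starts with a middot is prefixed with the
// "" placeholder package. The division slash is deliberately left alone
// (an assumption based on the compiler version tested above).
func objectSymbol(cName string) string {
	sym := strings.ReplaceAll(cName, middot, ".")
	if strings.HasPrefix(sym, ".") {
		sym = `""` + sym
	}
	return sym
}

func main() {
	names := []string{
		middot + "Bar1",
		"foo" + middot + "bar2",
		"foo" + divSlash + "baz" + middot + "bar3",
	}
	for _, name := range names {
		fmt.Printf("%s -> %s\n", name, objectSymbol(name))
	}
}
```

The identifiers are built from escape sequences so that the example does not depend on how the editor saves the file.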

Now, regarding the functions you named in the question, the story goes further down the rabbit hole. I'm a bit less sure if I'm right here, but I'll try to explain based on what I know. Thus, each sentence below this point should be read as if it had "AFAIK" written just at the end.

So, the next missing piece needed to better understand this puzzle, is to know something more about the strange "" namespace, and how the Go suite's linker handles it. The "" namespace is what we might want to call an "empty" (because "" for a programmer means "an empty string") namespace, or maybe better, a "placeholder" namespace. And when the linker sees an import going like this:

import examp "path/to/package/example"
//...
func main() {
    examp.Bar1()
}

then it takes the $GOPATH/pkg/.../example.a library file and, during the import phase, replaces each "" on the fly with path/to/package/example. So now, in the linked program, we will see a symbol like this:

 T path/to/package/example.Bar1
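
The substitution described above amounts to a simple string replacement. The following is a minimal sketch under that assumption; expandPlaceholder is a hypothetical name chosen for illustration and is not claimed to be how the linker is actually written.

```go
package main

import (
	"fmt"
	"strings"
)

// expandPlaceholder is a hypothetical helper. It replaces every `"".`
// placeholder in a symbol name with the import path of the package the
// symbol was loaded from, followed by a dot.
func expandPlaceholder(sym, importPath string) string {
	return strings.ReplaceAll(sym, `"".`, importPath+".")
}

func main() {
	const importPath = "path/to/package/example"
	for _, sym := range []string{`"".Bar1`, "foo.bar2"} {
		fmt.Printf("%s -> %s\n", sym, expandPlaceholder(sym, importPath))
	}
}
```

Symbols that already carry an explicit package name, such as foo.bar2, pass through unchanged.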



Answer 2:


The "·" character is U+00B7 (MIDDLE DOT) according to my JavaScript console. The "∕" character is U+2215 (DIVISION SLASH).

The dot appears in Annex D of the C99 standard, which lists the special characters that are valid in identifiers in C source. The slash does not seem to, so I suspect it's used as something else (perhaps namespacing) via a #define or preprocessor magic.
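
The code points can also be checked without a browser console. Here is a small Go sketch; describe is a helper name made up for this example, and the two characters are written as escapes to rule out copy-and-paste mix-ups with look-alike characters.

```go
package main

import "fmt"

// describe returns a rune's code point in U+XXXX form, followed by the
// bytes of its UTF-8 encoding in hexadecimal.
func describe(r rune) string {
	return fmt.Sprintf("%U % X", r, []byte(string(r)))
}

func main() {
	fmt.Println(describe('\u00B7')) // U+00B7 C2 B7
	fmt.Println(describe('\u2215')) // U+2215 E2 88 95
}
```

Both characters lie outside ASCII, so they occupy two and three bytes respectively in a UTF-8 source file.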

That would explain why the dot is present in the actual function definition, but the slash is not.

Edit: See this answer for some additional information. It's possible that the Unicode slash is simply allowed by GCC's implementation.




Answer 3:


It appears this is not standard C, nor C99. In particular, both gcc and clang complain about the dot, even in C99 mode.

This source code is compiled by the Plan 9 compiler suite (in particular, ./pkg/tool/darwin_amd64/6c on OS X), which is bootstrapped by the Go build system. According to this document, bottom of page 8, Plan 9 and its compiler do not use ASCII at all, but use Unicode instead. At the bottom of page 9, it is stated that any character with a sufficiently high code point is considered valid for use in an identifier name.

There is no preprocessing magic at all: the function definitions do not match the function declarations simply because they are different functions. For example, void runtime∕race·Initialize(); is an external function whose definition appears in ./src/pkg/runtime/race/race.go; likewise for void runtime∕race·MapShadow(…).

The function that appears later, void runtime·raceinit(void), is a completely different function, as is apparent from the fact that it actually calls runtime∕race·Initialize();.
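
To make the identifier rule concrete, here is a rough Go sketch of such a check. The threshold of 0x80 (the first code point outside ASCII) is an assumption made for this example; the cited document only says "sufficiently high". The function names are made up as well and are not taken from the Plan 9 sources.

```go
package main

import "fmt"

// runeSelf is an ASSUMED threshold: every code point at or above it is
// treated as a letter. 0x80 is the first value outside ASCII.
const runeSelf = 0x80

// isIdentStart reports whether r may begin an identifier.
func isIdentStart(r rune) bool {
	return r == '_' ||
		(r >= 'a' && r <= 'z') ||
		(r >= 'A' && r <= 'Z') ||
		r >= runeSelf
}

// isIdentPart reports whether r may appear after the first character.
func isIdentPart(r rune) bool {
	return isIdentStart(r) || (r >= '0' && r <= '9')
}

// isIdentifier applies the two rules above to a whole name.
func isIdentifier(name string) bool {
	if name == "" {
		return false
	}
	for i, r := range name {
		if i == 0 && !isIdentStart(r) {
			return false
		}
		if i > 0 && !isIdentPart(r) {
			return false
		}
	}
	return true
}

func main() {
	// Written with escapes: U+2215 is the division slash, U+00B7 the middot.
	fmt.Println(isIdentifier("runtime\u2215race\u00B7Read"))
	// The ASCII look-alikes are operators, not identifier characters.
	fmt.Println(isIdentifier("runtime/race.Read"))
}
```

Under this rule the Unicode look-alikes pass as ordinary letters, while the ASCII slash and dot still end the identifier.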




Answer 4:


The Go compiler/runtime is compiled using the C compilers originally developed for Plan 9. When you build Go from source, it first builds the Plan 9 compilers, then uses those to build Go.

The Plan 9 compilers support Unicode function names [1], and the Go developers use Unicode characters in their function names as pseudo namespaces.

[1] It looks like this might actually be standards-compliant (see "g++ unicode variable name"), but gcc doesn't support Unicode function/variable names.



Source: https://stackoverflow.com/questions/13475908/slashes-and-dots-in-function-names-and-prototypes
