Why do C compilers prepend underscores to external names?

Submitted by 浪子不回头ぞ on 2019-11-26 16:33:59

It was common practice for C compilers to prepend a leading underscore to all external scope program identifiers to avert clashes with contributions from runtime language support.

If the runtime support is provided by the compiler, you would think it would make more sense to prepend an underscore to the few external identifiers in the runtime support instead!

When C compilers first appeared, the basic alternative to programming in C on those platforms was programming in assembly language, and it was (and occasionally still is) useful to link together object files written in assembler and C. So really (IMHO) the leading underscore added to external C identifiers was to avoid clashes with the identifiers in your own assembly code.

(See also GCC's asm label extension; and note that this prepended underscore can be considered a simple form of name mangling. More complicated languages like C++ use more complicated name mangling, but this is where it started.)
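To illustrate the asm label extension mentioned above, here is a minimal sketch. It assumes GCC or Clang (this is an extension, not standard C), and the names c_side_name and plain_name are made up for the example. The label goes on a declaration and sets the exact symbol name used in the object file, so any prefix the compiler would normally add is bypassed.

```c
/* GCC/Clang extension, not standard C. Both names are hypothetical.
   C code refers to the function as c_side_name, but the symbol emitted
   into the object file is exactly plain_name, with no prefix added. */
int c_side_name(void) __asm__("plain_name");

int c_side_name(void)
{
    return 42;
}
```

Running nm on the resulting object file should list plain_name rather than c_side_name (or _c_side_name).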

If the C compiler always prepends an underscore to every symbol, then the startup/C runtime code (which is usually written in assembly) can safely use labels and symbols that do not start with an underscore (such as the symbol 'start').

Even if you write a start() function in your C code, it is emitted as _start in the object/asm output. (Note that in this case there is no way for C code to generate a symbol that does not start with an underscore.) So the author of the startup code doesn't have to worry about inventing obscure, improbable symbols (like $_dontuse42%$) for each global variable or label.

So the linker won't complain about a name clash, and the programmer is happy. :)
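As a rough sketch of the start / _start mapping described above: GCC and Clang predefine the macro __USER_LABEL_PREFIX__, which expands to whatever prefix the target adds to C identifiers (an underscore on platforms that follow this convention, nothing on typical ELF systems). The helper external_symbol_for is hypothetical and only builds the name you would expect to see in the object file.

```c
#include <stdio.h>

/* Two-step stringification so the macro is expanded before being quoted. */
#define STR2(x) #x
#define STR(x) STR2(x)

/* Returns the prefix this compiler adds to C identifiers. */
const char *user_label_prefix(void)
{
#ifdef __USER_LABEL_PREFIX__
    return STR(__USER_LABEL_PREFIX__);   /* "_" or "" on common targets */
#else
    return "";   /* assumption: compiler gives no information, treat as no prefix */
#endif
}

/* Hypothetical helper: writes the expected linker-level name of a C
   identifier into buf and returns its length (as snprintf does). */
int external_symbol_for(const char *ident, char *buf, size_t size)
{
    return snprintf(buf, size, "%s%s", user_label_prefix(), ident);
}
```

On a target that prepends underscores, external_symbol_for("start", ...) yields "_start", which cannot collide with an assembly label named start. On a target with an empty prefix the name comes back unchanged.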

The following is different from the practice of the compiler prepending an underscore in its output format:

This practice was later codified as part of the C and C++ language standards, in which the use of leading underscores was reserved for the implementation.

That is a convention followed for the C system libraries and other system components (and for things such as __FILE__).

(Note that such a symbol (e.g. _time) may end up with two leading underscores (__time) in the generated output.)

From what I have always heard, it is to avoid naming conflicts: not so much with other extern variables, but so that a library you use will hopefully not conflict with the variable names in user code.

The main function is not the real entry point of an executable. Some statically linked object files contain the real entry point, which eventually calls main, and those files own the namespace of names that do not start with an underscore. On my system, in /usr/lib, there are gcrt1.o, crt1.o and dylib1.o, among others. Each of those has a "start" function without an underscore that will eventually call the "_main" entry point. Everything else besides those files has external scope. The history has to do with mixing assembler and C in a project, where all C was considered external.

From Wikipedia:

It was common practice for C compilers to prepend a leading underscore to all external scope program identifiers to avert clashes with contributions from runtime language support. Furthermore, when the C/C++ compiler needed to introduce names into external linkage as part of the translation process, these names were often distinguished with some combination of multiple leading or trailing underscores.

This practice was later codified as part of the C and C++ language standards, in which the use of leading underscores was reserved for the implementation.
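For reference, the reservation rule in the C standard can be sketched as a small classifier. This is only an illustration: classify_identifier and the enum names are hypothetical, it covers only the underscore rules in C (C++ additionally reserves any name containing a double underscore), and it ignores names reserved for other reasons, such as standard library functions.

```c
#include <ctype.h>

enum reservation {
    NOT_RESERVED,            /* not reserved by the underscore rules */
    RESERVED_AT_FILE_SCOPE,  /* _lowercase: reserved at file scope */
    RESERVED_ALWAYS          /* __anything or _Uppercase: reserved for any use */
};

/* Classifies an identifier by the leading-underscore rules only. */
enum reservation classify_identifier(const char *name)
{
    if (name[0] != '_')
        return NOT_RESERVED;
    if (name[1] == '_' || isupper((unsigned char)name[1]))
        return RESERVED_ALWAYS;
    return RESERVED_AT_FILE_SCOPE;
}
```

So __FILE__ and _Bool belong to the implementation everywhere, while a name like _time is reserved only as a file-scope identifier.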