Why should the shebang line always be the first line?

Submitted by 这一生的挚爱 on 2019-11-28 07:32:28

The shebang must be the first line because it is interpreted by the kernel, which looks at the two bytes at the start of an executable file. If these are #!, the rest of the line is interpreted as the executable to run, and the path of the script file is passed to that program as an argument. (Details vary slightly, but that is the picture.)

Since the kernel will only look at the first two characters and has no notion of further lines, you must place the hash bang in line 1.
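
As a rough illustration of that check, here is a minimal shell sketch that looks only at the first two bytes of a file and, if they are #!, prints the rest of line 1. The name shebang_of is made up for this example, and the real check happens inside the kernel, not in shell code.

```shell
# shebang_of FILE (hypothetical helper, not a standard command):
# print what follows "#!" on line 1, or fail if the file does not
# begin with exactly those two bytes.
shebang_of() {
    # Dump the first two bytes as hex; "#!" is 23 21.
    magic=$(dd if="$1" bs=1 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n')
    [ "$magic" = 2321 ] || return 1
    # Print line 1 without the "#!" and any spaces after it, then stop.
    sed -n -e '1s/^#! *//p' -e '1q' "$1"
}
```

A file whose #! sits on line 2 (even after a single blank line) fails the two-byte test, which is the situation the question is about.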

Now what happens if the kernel can't execute a file beginning with #!whatever? The shell, attempting to fork an executable and being informed by the kernel that it can't execute the program, as a last resort attempts to interpret the file contents as a shell script. Since the shell is not perl, you get a bunch of errors, exactly the same as if you attempted to run

 sh < temp.pl
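
The fallback is easy to see with a small experiment. The sketch below assumes mktemp is available and that the temporary directory allows execution (it is not mounted noexec). The name demo_fallback is just for illustration. It creates an executable text file with no #! line and runs it from the current shell.

```shell
# demo_fallback (hypothetical name): run an executable text file that has
# no "#!" line. The kernel refuses to exec it, so the calling shell falls
# back to interpreting the contents as a shell script.
demo_fallback() {
    dir=$(mktemp -d) || return 1
    printf 'echo "run by the shell fallback"\n' > "$dir/noshebang"
    chmod +x "$dir/noshebang"
    "$dir/noshebang"
    status=$?
    rm -rf "$dir"
    return "$status"
}
```

Put Perl code in that file instead of an echo and the same fallback gives you shell syntax errors.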

In addition to the explanations above, there are some special things about the #! line and Perl which haven't been mentioned yet.

Perl reads the #! line and does two things. First, if the path doesn't look like perl, it will re-execute the program using that interpreter instead! For example...

#!/bin/sh

echo "Hello world!"

This will run correctly if executed as perl /path/to/that/program. I don't know for what historical reason Perl does this, but it comes in handy when you're testing multiple languages with Test::Harness.

The second thing is that Perl finds any switches in the #! line and applies them just as if they were on the command line. This is why #!/usr/bin/perl -w works to turn on warnings.

It's worth mentioning that unlike the other parts of the shebang processing, this is all done inside Perl, not Unix, and so is portable to Windows.

Another Perl + shebang note is this madness you might find at the top of many Perl programs.

#!/usr/bin/perl

eval 'exec /usr/bin/perl -w -S $0 ${1+"$@"}'
    if 0; # not running under some shell

Sometimes, on very, very, very old systems, #! does not work and the Perl program is executed by the shell. The eval makes the shell immediately re-execute the file with Perl. Since shell statements end at a newline, the shell never sees the if 0. Perl does see the if 0, so it doesn't execute the eval. Both Perl and the shell have syntactically compatible eval operators, which is what makes the hack work.

It's not just that it has to be the first line; the characters #! have to be the first two bytes in the file. That this can run scripts is a shell feature, not an OS one, and it's not specific to any particular scripting language.

When the system is told to execute the contents of a file, either with something like .../path/to/bin/program, or via the analogous route through the PATH, it examines the first few bytes of the file to look for the 'magic numbers' which reveal what type of file it is (you can peek at that process using the file(1) command). If it's a compiled binary, then it'll load and execute it in an appropriate manner, and if those first two bytes are #! it'll do the 'shebang-hack'.
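
To make the magic-number idea concrete, here is a small sketch that sorts a file into one of three buckets by its leading bytes. It only knows about ELF (0x7f followed by "ELF") and #!. The name classify_magic is invented for this example, and file(1) knows far more signatures than this.

```shell
# classify_magic FILE (hypothetical helper): report the file type based
# only on its first four bytes.
classify_magic() {
    # Hex-dump the first four bytes and squeeze out od's spacing.
    magic=$(dd if="$1" bs=1 count=4 2>/dev/null | od -An -tx1 | tr -d ' \n')
    case $magic in
        7f454c46) echo "ELF binary" ;;     # 0x7f 'E' 'L' 'F'
        2321*)    echo "script (#!)" ;;    # '#' '!'
        *)        echo "unknown" ;;
    esac
}
```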

The 'shebang-hack' is a special case that's employed by some shells (in fact, essentially every one, but it's convention rather than a requirement), in which the shell reads the remaining bytes up to a newline, interprets these as a filename, and then executes that file giving it the rest of the current file as input. Plus some details you can probably read about elsewhere.

Some (versions of) shells will allow quite long first lines, some allow only short ones; some allow multiple arguments, some allow only one.
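
The "only one argument" case is the one that tends to bite. As a sketch, assuming Linux-style behaviour (everything after the interpreter path is handed over as a single argument) and assuming only spaces are used as separators, the splitting looks roughly like this. split_shebang is a made-up name, not a real tool.

```shell
# split_shebang LINE (hypothetical helper): split a "#!" line the way a
# one-argument system does: interpreter path first, then everything
# else as ONE argument. Only spaces are treated as separators here.
split_shebang() {
    rest=${1#'#!'}                       # drop the leading "#!"
    rest=${rest#"${rest%%[! ]*}"}        # drop spaces after "#!"
    interp=${rest%%[ ]*}                 # text up to the first space
    arg=${rest#"$interp"}                # whatever follows the interpreter
    arg=${arg#"${arg%%[! ]*}"}           # drop the separating spaces
    printf 'interpreter=%s\n' "$interp"
    printf 'argument=%s\n' "$arg"
}
```

Calling split_shebang '#!/usr/bin/env perl -w' reports the argument as perl -w in one piece, which is why that kind of line fails on systems that don't split further.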

If the file doesn't start with #!, but does appear to be text, some shells will heuristically try to execute it anyway. Csh (if I recall correctly) takes a punt on it being a csh-script, and there's some complicated and arcane case to do with some shells' behaviour if the first line is blank, which life is too short to remember.

There are interesting and extensive details (and accurate ones, in the sense that they match my recollections!) at Sven Mascheck's #! page.

At least on POSIX compliant systems, the shebang is used to tell the executable loader what to do with text files having the executable bit set.

The loader knows what to do with binary files: they start with a "magic number", usually ELF related these days.

On the other hand, text files that do not have a shebang are executed by the POSIX-compliant shell available on the machine, which is why you get these shell error messages:

use: Command not found.
use: Command not found.
print: Command not found.

When your executable is not meant to be interpreted by the POSIX-compliant shell, you need to tell the loader which interpreter to use. Other OSes like Windows use the file extension to figure it out, but Unix doesn't use or care about extensions in this specific case. What it uses is the shebang on the first line, which states what command interpreter to use. The only drawback is that the scripting language has to ignore this first line. Fortunately this is usually the case, as # is the comment prefix in most scripting languages.

Despite popular belief, portable scripts should not have a shebang at all. In particular #!/bin/sh is not recommended for them.
