It seems to me like the files run the same without that line.
Considering the portability issues between python2
and python3
, you should always specify either version unless your program is compatible with both.
Some distributions are shipping python
symlinked to python3
for a while now - do not rely on python
being python2
.
This is emphasized by PEP 394:
In order to tolerate differences across platforms, all new code that needs to invoke the Python interpreter should not specify python, but rather should specify either python2 or python3 (or the more specific python2.x and python3.x versions; see the Migration Notes). This distinction should be made in shebangs, when invoking from a shell script, when invoking via the system() call, or when invoking in any other context.
The exec
system call of the Linux kernel understands shebangs (#!
) natively
When you do on bash:
./something
on Linux, this calls the exec
system call with the path ./something
.
This line of the kernel gets called on the file passed to exec
: https://github.com/torvalds/linux/blob/v4.8/fs/binfmt_script.c#L25
if ((bprm->buf[0] != '#') || (bprm->buf[1] != '!'))
It reads the very first bytes of the file, and compares them to #!
.
If the comparison is true, then the rest of the line is parsed by the Linux kernel, which makes another exec
call with path /usr/bin/env python
and current file as the first argument:
/usr/bin/env python /path/to/script.py
and this works for any scripting language that uses #
as a comment character.
And yes, you can make an infinite loop with:
printf '#!/a\n' | sudo tee /a
sudo chmod +x /a
/a
Bash recognizes the error:
-bash: /a: /a: bad interpreter: Too many levels of symbolic links
#!
just happens to be human readable, but that is not required.
If the file started with different bytes, then the exec
system call would use a different handler. The other most important built-in handler is for ELF executable files: https://github.com/torvalds/linux/blob/v4.8/fs/binfmt_elf.c#L1305 which checks for bytes 7f 45 4c 46
(which also happens to be human readable for .ELF
). Let's confirm that by reading the 4 first bytes of /bin/ls
, which is an ELF executable:
head -c 4 "$(which ls)" | hd
output:
00000000 7f 45 4c 46 |.ELF|
00000004
So when the kernel sees those bytes, it takes the ELF file, puts it into memory correctly, and starts a new process with it. See also: How does kernel get an executable binary file running under linux?
Finally, you can add your own shebang handlers with the binfmt_misc
mechanism. For example, you can add a custom handler for .jar files. This mechanism even supports handlers by file extension. Another application is to transparently run executables of a different architecture with QEMU.
I don't think POSIX specifies shebangs however: https://unix.stackexchange.com/a/346214/32558 , although it does mention in on rationale sections, and in the form "if executable scripts are supported by the system something may happen". macOS and FreeBSD also seem to implement it however.
PATH
search motivation
Likely, one big motivation for the existence of shebangs is the fact that in Linux, we often want to run commands from PATH
just as:
basename-of-command
instead of:
/full/path/to/basename-of-command
But then, without the shebang mechanism, how would Linux know how to launch each type of file?
Hardcoding the extension in commands:
basename-of-command.py
or implementing PATH search on every interpreter:
python basename-of-command
would be a possibility, but this has the major problem that everything breaks if we ever decide to refactor the command into another language.
Shebangs solve this problem beautifully.
It allows you to select the executable that you wish to use; which is very handy if perhaps you have multiple python installs, and different modules in each and wish to choose. e.g.
#!/bin/sh
#
# Choose the python we need. Explanation:
# a) '''\' translates to \ in shell, and starts a python multi-line string
# b) "" strings are treated as string concat by python, shell ignores them
# c) "true" command ignores its arguments
# c) exit before the ending ''' so the shell reads no further
# d) reset set docstrings to ignore the multiline comment code
#
"true" '''\'
PREFERRED_PYTHON=/Library/Frameworks/Python.framework/Versions/2.7/bin/python
ALTERNATIVE_PYTHON=/Library/Frameworks/Python.framework/Versions/3.6/bin/python3
FALLBACK_PYTHON=python3
if [ -x $PREFERRED_PYTHON ]; then
echo Using preferred python $ALTERNATIVE_PYTHON
exec $PREFERRED_PYTHON "$0" "$@"
elif [ -x $ALTERNATIVE_PYTHON ]; then
echo Using alternative python $ALTERNATIVE_PYTHON
exec $ALTERNATIVE_PYTHON "$0" "$@"
else
echo Using fallback python $FALLBACK_PYTHON
exec python3 "$0" "$@"
fi
exit 127
'''
__doc__ = """What this file does"""
print(__doc__)
import platform
print(platform.python_version())