Why do people write #!/usr/bin/env python on the first line of a Python script?

刺人心 2020-11-21 06:16

It seems to me like the files run the same without that line.

21 answers
  • 2020-11-21 07:16

    Considering the portability issues between python2 and python3, you should always specify either version unless your program is compatible with both.

    Some distributions have been shipping python symlinked to python3 for a while now, so do not rely on python being python2.

    This is emphasized by PEP 394:

    In order to tolerate differences across platforms, all new code that needs to invoke the Python interpreter should not specify python, but rather should specify either python2 or python3 (or the more specific python2.x and python3.x versions; see the Migration Notes). This distinction should be made in shebangs, when invoking from a shell script, when invoking via the system() call, or when invoking in any other context.
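    As a minimal sketch of this advice, a script can name python3 explicitly in its shebang and also check the running version at startup, in case someone launches it as "python script.py" with an older interpreter. The helper name require_python and the (3, 6) threshold are illustrative assumptions, not something PEP 394 prescribes.

```python
#!/usr/bin/env python3
"""Sketch: explicit python3 shebang plus a runtime version guard."""
import sys


def require_python(minimum, current=None):
    """Return True if `current` (default: the running interpreter) is at least `minimum`."""
    if current is None:
        current = sys.version_info[:2]
    return tuple(current) >= tuple(minimum)


if __name__ == "__main__":
    # (3, 6) is an arbitrary example threshold, chosen only for illustration.
    if not require_python((3, 6)):
        sys.exit("This script needs Python 3.6 or newer")
    print("Running under Python %d.%d" % sys.version_info[:2])
```

    The shebang only matters when the file is executed directly (./script.py); the version guard covers the case where the interpreter is chosen on the command line instead.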

  • 2020-11-21 07:17

    The exec system call of the Linux kernel understands shebangs (#!) natively

    When you run this in bash:

    ./something
    

    on Linux, this calls the exec system call with the path ./something.

    This line of the kernel gets called on the file passed to exec: https://github.com/torvalds/linux/blob/v4.8/fs/binfmt_script.c#L25

    if ((bprm->buf[0] != '#') || (bprm->buf[1] != '!'))
    

    It reads the very first bytes of the file, and compares them to #!.

    If the file does start with #!, the Linux kernel parses the rest of the line and makes another exec call, with /usr/bin/env as the executable, python as its first argument, and the path of the current file as the second argument:

    /usr/bin/env python /path/to/script.py
    

    and this works for any scripting language that uses # as a comment character.
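    The parsing step can be imitated in user space. The following is only a rough sketch of the idea, not the kernel's code: shebang_argv is a hypothetical helper, and it assumes the Linux convention that everything after the interpreter path is passed as one single argument.

```python
import os
import tempfile


def shebang_argv(script_path):
    """Return the argv a Linux-style loader would build for a script, or None."""
    with open(script_path, "rb") as f:
        first_line = f.readline()
    if not first_line.startswith(b"#!"):
        return None  # not a script; another handler (for example ELF) would be tried
    rest = first_line[2:].decode("utf-8", "replace").strip()
    if not rest:
        return None  # "#!" with no interpreter after it
    # Split only once: the interpreter path, then the remainder of the line
    # as ONE optional argument (assumed Linux behaviour).
    parts = rest.split(None, 1)
    interpreter = parts[0]
    args = [parts[1]] if len(parts) > 1 else []
    return [interpreter] + args + [script_path]


if __name__ == "__main__":
    # Write a throwaway script and show the command line that would be run.
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as tmp:
        tmp.write("#!/usr/bin/env python\nprint('hello')\n")
    print(shebang_argv(tmp.name))
    os.remove(tmp.name)
```

    For a file whose first line is #!/usr/bin/env python, the result has three elements: /usr/bin/env, python, and the script path.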

    And yes, you can make an infinite loop with:

    printf '#!/a\n' | sudo tee /a
    sudo chmod +x /a
    /a
    

    Bash recognizes the error:

    -bash: /a: /a: bad interpreter: Too many levels of symbolic links
    

    #! just happens to be human readable, but that is not required.

    If the file started with different bytes, then the exec system call would use a different handler. The other most important built-in handler is for ELF executable files: https://github.com/torvalds/linux/blob/v4.8/fs/binfmt_elf.c#L1305 which checks for bytes 7f 45 4c 46 (which happen to be human readable as .ELF). Let's confirm that by reading the first 4 bytes of /bin/ls, which is an ELF executable:

    head -c 4 "$(which ls)" | hd 
    

    output:

    00000000  7f 45 4c 46                                       |.ELF|
    00000004                                                                 
    

    So when the kernel sees those bytes, it takes the ELF file, puts it into memory correctly, and starts a new process with it. See also: How does kernel get an executable binary file running under linux?
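    The same magic-byte check can be sketched in Python. This is an illustration of the dispatch idea only: the names classify_executable and classify_file are made up for this example, and the real kernel supports more handlers than the two shown here.

```python
import sys


def classify_executable(header):
    """Guess the handler for a file from its first bytes."""
    if header[:4] == b"\x7fELF":
        return "elf"      # 7f 45 4c 46, handled by the ELF loader
    if header[:2] == b"#!":
        return "script"   # handled by the shebang (script) loader
    return "unknown"      # some other handler, or not executable at all


def classify_file(path):
    """Read just enough of `path` to classify it."""
    with open(path, "rb") as f:
        return classify_executable(f.read(4))


if __name__ == "__main__":
    # The running interpreter is a convenient file to inspect; the result
    # depends on the platform, so no particular output is assumed here.
    print(sys.executable, classify_file(sys.executable))
```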

    Finally, you can add your own shebang handlers with the binfmt_misc mechanism. For example, you can add a custom handler for .jar files. This mechanism even supports handlers by file extension. Another application is to transparently run executables of a different architecture with QEMU.

    I don't think POSIX specifies shebangs, however: https://unix.stackexchange.com/a/346214/32558 , although it does mention them in rationale sections, in the form "if executable scripts are supported by the system, something may happen". macOS and FreeBSD also seem to implement them.

    PATH search motivation

    Likely, one big motivation for the existence of shebangs is the fact that in Linux, we often want to run commands from PATH just as:

    basename-of-command
    

    instead of:

    /full/path/to/basename-of-command
    

    But then, without the shebang mechanism, how would Linux know how to launch each type of file?

    Hardcoding the extension in commands:

    basename-of-command.py
    

    or implementing PATH search on every interpreter:

    python basename-of-command
    

    would be a possibility, but this has the major problem that everything breaks if we ever decide to refactor the command into another language.

    Shebangs solve this problem beautifully.
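    The PATH lookup itself, which /usr/bin/env performs for the name given in the shebang, can be sketched as follows. find_on_path is a hypothetical helper written for this example; the standard library's shutil.which does the same job with more care.

```python
import os
import sys


def find_on_path(name, path=None):
    """Return the first executable called `name` in the PATH directories, or None."""
    if path is None:
        path = os.environ.get("PATH", os.defpath)
    for directory in path.split(os.pathsep):
        # An empty PATH entry conventionally means the current directory.
        candidate = os.path.join(directory or os.curdir, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    # Look up the name given on the command line, defaulting to python3.
    wanted = sys.argv[1] if len(sys.argv) > 1 else "python3"
    print(find_on_path(wanted))
```

    Because the lookup happens at launch time, the same script picks up whichever interpreter comes first on the PATH of the user running it.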

  • 2020-11-21 07:18

    A shebang lets you select which interpreter runs the script, which is handy if you have several Python installations with different modules in each and want to choose between them. For example:

    #!/bin/sh
    #
    # Choose the python we need. Explanation:
    # a) '''\' translates to \ in shell, and starts a python multi-line string
    # b) "true" is an ordinary quoted word to the shell; python concatenates it
    #    with the multi-line string that follows
    # c) the "true" command ignores its arguments
    # d) exec or exit before the ending ''' so the shell reads no further
    # e) reset __doc__ afterwards, because the shell code became the module docstring
    #
    "true" '''\'
    PREFERRED_PYTHON=/Library/Frameworks/Python.framework/Versions/2.7/bin/python
    ALTERNATIVE_PYTHON=/Library/Frameworks/Python.framework/Versions/3.6/bin/python3
    FALLBACK_PYTHON=python3
    
    if [ -x "$PREFERRED_PYTHON" ]; then
        echo "Using preferred python $PREFERRED_PYTHON"
        exec "$PREFERRED_PYTHON" "$0" "$@"
    elif [ -x "$ALTERNATIVE_PYTHON" ]; then
        echo "Using alternative python $ALTERNATIVE_PYTHON"
        exec "$ALTERNATIVE_PYTHON" "$0" "$@"
    else
        echo "Using fallback python $FALLBACK_PYTHON"
        exec "$FALLBACK_PYTHON" "$0" "$@"
    fi
    fi
    exit 127
    '''
    
    __doc__ = """What this file does"""
    print(__doc__)
    import platform
    print(platform.python_version())
    