Python asyncio deadlocks if multiple stdin inputs are needed

旧时难觅i 2021-02-20 10:06

I wrote a command-line tool that runs git pull for multiple git repos using Python asyncio. It works fine if all repos have password-less SSH login set up. It also

3 answers
  •  轻奢々 (original poster)
    2021-02-20 11:10

    In the default configuration, when a username or password is needed, git will directly access the /dev/tty synonym for better control over the 'controlling' terminal device, i.e. the device that lets you interact with the user. Since subprocesses by default inherit the controlling terminal from their parent, all the git processes you start are going to access the same TTY device. So yes, they'll hang when trying to read from and write to the same TTY, with processes clobbering each other's expected input.

    A simple way to prevent this from happening is to give each subprocess its own session; a process started in a new session has no controlling TTY to begin with, so the git processes no longer share one. Do so by setting start_new_session=True:

    process = await asyncio.create_subprocess_exec(
        *cmds, stdout=asyncio.subprocess.PIPE, cwd=path, start_new_session=True)
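
    A quick way to see the effect, as a minimal sketch assuming a POSIX system: the child below tries to open /dev/tty and reports through its exit status whether that worked. The name child_can_open_tty is made up for this illustration.

```python
import subprocess
import sys

# Tiny child program: exit status 0 if /dev/tty can be opened, 1 otherwise.
CHILD = (
    "import sys\n"
    "try:\n"
    "    open('/dev/tty').close()\n"
    "except OSError:\n"
    "    sys.exit(1)\n"
)

def child_can_open_tty(start_new_session):
    """Run the child and report whether it could open /dev/tty (hypothetical helper)."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        start_new_session=start_new_session,
    )
    return result.returncode == 0

if __name__ == "__main__":
    # The first result depends on whether this script itself has a terminal.
    print("same session:", child_can_open_tty(False))
    print("new session: ", child_can_open_tty(True))
```

    With start_new_session=True the child should report failure, which is the situation git and ssh find themselves in when they look for a terminal to prompt on.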
    

    You can't really determine up-front which git commands might require user credentials, because git can be configured to get credentials from a whole range of locations, and these are only used if the remote repository actually challenges for authentication.

    Even worse, for ssh:// remote URLs, git doesn't handle the authentication at all, but leaves it to the ssh client process it opens. More on that below.

    How Git asks for credentials (for anything but ssh) is configurable however; see the gitcredentials documentation. You could make use of this if your code must be able to forward credentials requests to an end-user. I'd not leave it to the git commands to do this via a terminal, because how will the user know what specific git command is going to receive what credentials, let alone the issues you'd have with making sure the prompts arrive in a logical order.

    Instead, I'd route all requests for credentials through your script. You have two options to do this with:

    • Set the GIT_ASKPASS environment variable, pointing to an executable that git should run for each prompt.

      This executable is called with a single argument, the prompt to show the user. It is called separately for each piece of information needed for a given credential, so for a username (if not already known), and a password. The prompt text should make it clear to the user what is being asked for (e.g. "Username for 'https://github.com': " or "Password for 'https://someusername@github.com': ").

    • Register a credential helper; this is executed as a shell command (so can have its own pre-configured command-line arguments), and one extra argument telling the helper what kind of operation is expected of it. If it is passed get as the last argument, then it is asked to provide credentials for a given host and protocol, or it can be told that certain credentials were successful with store, or were rejected with erase. In all cases it can read information from stdin to learn what host git is trying to authenticate to, in multi-line key=value format.

      So with a credential helper, you get to prompt for a username and password combination together as a single step, and you also get more information about the process; handling store and erase operations lets you cache credentials more effectively.
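
    To make the stdin format concrete, here is a minimal sketch of how a helper might parse the key=value lines it receives. The function name parse_credential_input is an assumption for this example, not part of git.

```python
def parse_credential_input(text):
    """Parse the key=value lines git writes to a helper's stdin into a dict."""
    fields = {}
    for line in text.splitlines():
        if not line:
            break  # a blank line ends the attribute list
        # Split on the first '=' only, so values may contain '=' themselves
        key, _, value = line.partition("=")
        fields[key] = value
    return fields

if __name__ == "__main__":
    print(parse_credential_input("protocol=https\nhost=example.com\n"))
```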

    Git will first ask each configured credential helper, in config order (see the FILES section of the git config documentation to understand how the 4 config file locations are processed in order). You can add a new one-off helper configuration on the git command line with the -c credential.helper=... command-line switch, which is added to the end. If no credential helper was able to fill in a missing username or password, then the user is prompted with GIT_ASKPASS or the other prompting options.
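
    As a rough sketch, the one-off helper switch could be added to the argument list you pass to asyncio.create_subprocess_exec like this. The function git_with_helper is hypothetical, and it insists on an absolute path because git prepends "git credential-" to helper names that are not absolute paths.

```python
import os

def git_with_helper(helper_path, *git_args):
    """Build a git argv that appends a one-off credential helper (hypothetical helper)."""
    if not os.path.isabs(helper_path):
        # git would treat a non-absolute name as "git credential-<name>"
        raise ValueError("helper_path must be an absolute path")
    return ["git", "-c", f"credential.helper={helper_path}", *git_args]

if __name__ == "__main__":
    print(git_with_helper("/full/path/to/credhelper.py", "pull"))
```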

    For SSH connections, git creates a new ssh child process. SSH will then handle authentication, and could ask the user for credentials or, for ssh keys, ask the user for a passphrase. This again will be done via /dev/tty, and SSH is more stubborn about this. While you can set an SSH_ASKPASS environment variable to a binary to be used for prompting, SSH will only use this if there is no TTY session and DISPLAY is also set.

    SSH_ASKPASS must be an executable (so no passing in arguments), and you won't be notified of the success or failure of the prompted credentials.

    I'd also make sure to copy the current environment variables to the child processes, because if the user has set up an SSH key agent to cache ssh keys, you'd want the SSH processes that git starts to make use of them; a key agent is discovered through environment variables.

    So, to create the connection for a credential helper, and one that also works for SSH_ASKPASS, you can use a simple synchronous script that takes the socket path from an environment variable:

    #!/path/to/python3
    import os, socket, sys
    # Path of the UNIX domain socket the prompting server listens on
    path = os.environ['PROMPTING_SOCKET_PATH']
    operation = sys.argv[1]
    if operation not in {'get', 'store', 'erase'}:
        # Called as an askpass program: the argument is the prompt text
        operation, params = 'prompt', f'prompt={operation}\n'
    else:
        # Called as a credential helper: git sends key=value lines on stdin
        params = sys.stdin.read()
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(path)
        s.sendall(f'''operation={operation}\n{params}'''.encode())
        # Relay the server's answer to git or ssh via stdout
        print(s.recv(2048).decode())
    

    This should have the executable bit set.
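
    If your tool writes the helper script out itself (for example to a temporary file), a small sketch like the following could set the executable bit from Python. The function name make_executable is an assumption for this example.

```python
import os
import stat

def make_executable(path):
    """Add the execute bits for user, group and others to an existing file."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
```

    This has the same effect as running chmod +x on the file.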

    This then could be passed to a git command as a temporary file or included pre-built, and you add a Unix domain socket path in the PROMPTING_SOCKET_PATH environment variable. It can double as an SSH_ASKPASS prompter, setting the operation to prompt.

    This script then makes both SSH and git ask your UNIX domain socket server for user credentials, in a separate connection per user. I've used a generous receiving buffer size; I don't think you'll ever run into an exchange with this protocol that'll exceed it, nor do I see any reason for it to be under-filled. It keeps the script nice and simple.
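
    If you would rather not rely on a single recv() call, one alternative sketch is to read until the server closes the connection, which the demo server below does after each response. The function recv_all is hypothetical, and the socket pair only stands in for the helper and server ends.

```python
import socket

def recv_all(sock, chunk_size=2048):
    """Read from sock until the peer closes the connection, then return all bytes."""
    chunks = []
    while True:
        chunk = sock.recv(chunk_size)
        if not chunk:  # empty bytes means the other side closed
            break
        chunks.append(chunk)
    return b"".join(chunks)

if __name__ == "__main__":
    # A connected socket pair stands in for the server and helper ends.
    server_side, client_side = socket.socketpair()
    server_side.sendall(b"password=" + b"x" * 3000 + b"\n")
    server_side.close()
    print(len(recv_all(client_side)))
    client_side.close()
```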

    You could instead use it as the GIT_ASKPASS command, but then you wouldn't get valuable information on the success of credentials for non-ssh connections.
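
    If you do go the GIT_ASKPASS route, your server only sees the prompt text. A minimal sketch for recovering what is being asked, assuming the prompts look like the examples quoted earlier (the wording may differ between git versions and locales, so treat the pattern as an assumption):

```python
import re

# Matches prompts such as "Username for 'https://github.com': "
PROMPT_RE = re.compile(r"^(Username|Password) for '(?P<url>[^']*)':\s*$")

def classify_prompt(prompt):
    """Return (kind, url) for a recognised prompt, or ('unknown', None)."""
    match = PROMPT_RE.match(prompt)
    if match is None:
        return "unknown", None
    return match.group(1).lower(), match.group("url")

if __name__ == "__main__":
    print(classify_prompt("Username for 'https://github.com': "))
```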

    Here is a demo implementation of a UNIX domain socket server that handles git and SSH credential requests from the above helper script, one that just generates random hex values rather than asking a user:

    import asyncio
    import os
    import secrets
    import tempfile
    
    async def handle_git_prompt(reader, writer):
        data = await reader.read(2048)
        info = dict(line.split('=', 1) for line in data.decode().splitlines())
        print(f"Received credentials request: {info!r}")
    
        response = []
        operation = info.pop('operation', 'get')
    
        if operation == 'prompt':
            # new prompt for a username or password or pass phrase for SSH
            password = secrets.token_hex(10)
            print(f"Sending prompt response: {password!r}")
            response.append(password)
    
        elif operation == 'get':
            # new request for credentials, for a username (optional) and password
            if 'username' not in info:
                username = secrets.token_hex(10)
                print(f"Sending username: {username!r}")
                response.append(f'username={username}\n')
    
            password = secrets.token_hex(10)
            print(f"Sending password: {password!r}")
            response.append(f'password={password}\n')
    
        elif operation == 'store':
            # credentials were used successfully, perhaps store these for re-use
            print(f"Credentials for {info['username']} were approved")
    
        elif operation == 'erase':
            # credentials were rejected, if we cached anything, clear this now.
            print(f"Credentials for {info['username']} were rejected")
    
        writer.write(''.join(response).encode())
        await writer.drain()
    
        print("Closing the connection")
        writer.close()
        await writer.wait_closed()
    
    async def main():
        with tempfile.TemporaryDirectory() as dirname:
            socket_path = os.path.join(dirname, 'credential.helper.sock')
            server = await asyncio.start_unix_server(handle_git_prompt, socket_path)
    
            print(f'Starting a domain socket at {server.sockets[0].getsockname()}')
    
            async with server:
                await server.serve_forever()
    
    asyncio.run(main())
    

    Note that a credential helper could also add quit=true or quit=1 to the output to tell git to not look for any other credential helpers and no further prompting.
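
    A small sketch of how the server could build such a response, including the optional quit attribute. The function name format_credential_response is made up for this example.

```python
def format_credential_response(username=None, password=None, quit=False):
    """Build the key=value lines a helper prints in answer to a 'get' request."""
    lines = []
    if username is not None:
        lines.append(f"username={username}")
    if password is not None:
        lines.append(f"password={password}")
    if quit:
        # Tell git to stop asking other helpers and not to prompt further
        lines.append("quit=true")
    return "".join(line + "\n" for line in lines)

if __name__ == "__main__":
    print(format_credential_response("someuser", "secret"), end="")
```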

    You can use the git credential command to test out that the credential helper works, by passing in the helper script (/full/path/to/credhelper.py) with the git -c credential.helper=... command-line option. git credential can take a url=... string on standard input, it'll parse this out just like git would to contact the credential helpers; see the documentation for the full exchange format specification.

    First, start the above demo script in a separate terminal:

    $ /usr/local/bin/python3.7 git-credentials-demo.py
    Starting a domain socket at /var/folders/vh/80414gbd6p1cs28cfjtql3l80000gn/T/tmprxgyvecj/credential.helper.sock
    

    and then try to get credentials from it; I included a demonstration of the store and erase operations too:

    $ export PROMPTING_SOCKET_PATH="/var/folders/vh/80414gbd6p1cs28cfjtql3l80000gn/T/tmprxgyvecj/credential.helper.sock"
    $ CREDHELPER="/tmp/credhelper.py"
    $ echo "url=https://example.com:4242/some/path.git" | git -c "credential.helper=$CREDHELPER" credential fill
    protocol=https
    host=example.com:4242
    username=5b5b0b9609c1a4f94119
    password=e259f5be2c96fed718e6
    $ echo "url=https://someuser@example.com/some/path.git" | git -c "credential.helper=$CREDHELPER" credential fill
    protocol=https
    host=example.com
    username=someuser
    password=766df0fba1de153c3e99
    $ printf "protocol=https\nhost=example.com:4242\nusername=5b5b0b9609c1a4f94119\npassword=e259f5be2c96fed718e6" | git -c "credential.helper=$CREDHELPER" credential approve
    $ printf "protocol=https\nhost=example.com\nusername=someuser\npassword=e259f5be2c96fed718e6" | git -c "credential.helper=$CREDHELPER" credential reject
    

    and when you then look at the output from the example script, you'll see:

    Received credentials request: {'operation': 'get', 'protocol': 'https', 'host': 'example.com:4242'}
    Sending username: '5b5b0b9609c1a4f94119'
    Sending password: 'e259f5be2c96fed718e6'
    Closing the connection
    Received credentials request: {'operation': 'get', 'protocol': 'https', 'host': 'example.com', 'username': 'someuser'}
    Sending password: '766df0fba1de153c3e99'
    Closing the connection
    Received credentials request: {'operation': 'store', 'protocol': 'https', 'host': 'example.com:4242', 'username': '5b5b0b9609c1a4f94119', 'password': 'e259f5be2c96fed718e6'}
    Credentials for 5b5b0b9609c1a4f94119 were approved
    Closing the connection
    Received credentials request: {'operation': 'erase', 'protocol': 'https', 'host': 'example.com', 'username': 'someuser', 'password': 'e259f5be2c96fed718e6'}
    Credentials for someuser were rejected
    Closing the connection
    

    Note how the helper is given a parsed-out set of fields, for protocol and host, and the path is omitted; if you set the git config option credential.useHttpPath=true (or it has already been set for you) then path=some/path.git will be added to the information being passed in.
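
    For testing your server without git, the field split can be approximated in Python. This is only a sketch that mirrors the fields seen in the output above using urllib.parse; it is not git's actual parser, and credential_fields is a hypothetical name.

```python
from urllib.parse import urlsplit

def credential_fields(url, use_http_path=False):
    """Approximate the fields git derives from a URL for a credential helper."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"  # the port stays part of the host field
    fields = {"protocol": parts.scheme, "host": host}
    if parts.username:
        fields["username"] = parts.username
    if use_http_path:
        # Mirrors credential.useHttpPath=true: path without the leading slash
        fields["path"] = parts.path.lstrip("/")
    return fields

if __name__ == "__main__":
    print(credential_fields("https://someuser@example.com/some/path.git"))
```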

    For SSH, the executable is simply called with a prompt to display:

    $ $CREDHELPER "Please enter a super-secret passphrase: "
    30b5978210f46bb968b2
    

    and the demo server has printed:

    Received credentials request: {'operation': 'prompt', 'prompt': 'Please enter a super-secret passphrase: '}
    Sending prompt response: '30b5978210f46bb968b2'
    Closing the connection
    

    Just make sure to still set start_new_session=True when starting the git processes to ensure that SSH is forced to use SSH_ASKPASS.

    # Copy the current environment and add the prompting settings
    env = dict(
        os.environ,
        SSH_ASKPASS='../path/to/credhelper.py',
        DISPLAY='dummy value',
        PROMPTING_SOCKET_PATH='../path/to/domain/socket',
    )
    process = await asyncio.create_subprocess_exec(
        *cmds, stdout=asyncio.subprocess.PIPE, cwd=path,
        start_new_session=True, env=env)
    

    Of course, how you then handle prompting your users is a separate issue, but your script now has full control (each git command will wait patiently for the credential helper to return the requested information) and you can queue up requests for the user to fill in, and you can cache credentials as needed (in case multiple commands are all waiting for credentials for the same host).
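
    One possible shape for that caching, as a sketch only: a per-host lock makes concurrent requests for the same host wait for a single prompt. CredentialCache, ask_user and fake_prompt are names invented for this example, and the prompt here is faked rather than shown to a user.

```python
import asyncio

class CredentialCache:
    """Ask for each host's credentials once, even with concurrent requests."""

    def __init__(self, ask_user):
        self._ask_user = ask_user  # coroutine function: host -> (username, password)
        self._cache = {}
        self._locks = {}

    async def get(self, host):
        lock = self._locks.setdefault(host, asyncio.Lock())
        async with lock:  # later requests for this host wait here
            if host not in self._cache:
                self._cache[host] = await self._ask_user(host)
            return self._cache[host]

    def forget(self, host):
        """Drop cached credentials, e.g. after an 'erase' operation."""
        self._cache.pop(host, None)

async def demo():
    calls = []

    async def fake_prompt(host):  # stands in for a real user prompt
        calls.append(host)
        await asyncio.sleep(0.01)
        return ("someuser", "secret")

    cache = CredentialCache(fake_prompt)
    results = await asyncio.gather(*(cache.get("example.com") for _ in range(3)))
    return calls, results

if __name__ == "__main__":
    print(asyncio.run(demo()))
```

    In the demo, three concurrent requests for the same host should trigger only one call to the prompt function; the 'get' branch of the server could call cache.get() and the 'erase' branch cache.forget().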
