Formatting a command in python subprocess popen

无人及你 2020-12-07 00:00

I am trying to format the following awk command:

awk -v OFS="\t" '{printf "chr%s\t%s\t%s\n", $1, $2-1, $2}' file1.txt > file2.txt
2 Answers
  • 2020-12-07 00:34
    1. The simplest method, especially if you want to keep the output redirection, is to call subprocess with shell=True. Then you only need to escape the characters that are special to Python; the line as a whole is interpreted by the default shell.

      • WARNING: do not use this with untrusted input without sanitizing it first!
    2. Alternatively, replace the command line with an argv-style sequence and pass that to subprocess. You then have to supply the arguments exactly as the program would see them:

      • remove all the shell-level escaping
      • remove the output redirection and do the redirection yourself
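The two options above can be sketched roughly as follows. This is a minimal illustration, not the only way to do it: the names AWK_PROGRAM, run_with_shell and run_with_argv are hypothetical, awk is assumed to be on PATH, and shlex.quote is used in the shell variant as one way of sanitizing the file names.

```python
import shlex
import subprocess

# The awk program exactly as awk should see it. The raw string keeps \t and \n
# as backslash sequences for awk to interpret, instead of Python.
AWK_PROGRAM = r'{printf "chr%s\t%s\t%s\n", $1, $2-1, $2}'


def run_with_shell(src, dst):
    """Option 1: hand the whole line, redirection included, to the shell."""
    # shlex.quote protects the file names from being reinterpreted by the shell.
    command = r"""awk -v OFS="\t" '%s' %s > %s""" % (
        AWK_PROGRAM, shlex.quote(src), shlex.quote(dst))
    subprocess.check_call(command, shell=True)


def run_with_argv(src, dst):
    """Option 2: pass an argument list and redirect stdout ourselves."""
    # No shell-level quoting here: each list item reaches awk unchanged.
    argv = ["awk", "-v", r"OFS=\t", AWK_PROGRAM, src]
    with open(dst, "wb") as output_file:
        subprocess.check_call(argv, stdout=output_file)
```

Calling run_with_shell("file1.txt", "file2.txt") or run_with_argv("file1.txt", "file2.txt") should produce the same file2.txt; the second form never starts a shell.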

    Regarding the specific problems:

    • You didn't escape the Python special characters in the string, so \t and \n became a literal tab and newline (try printing awk_command).
    • Using shlex.split is no different from shell=True, with added unreliability: it cannot guarantee that it will parse the string the same way your shell would in every case (not to mention that it performs none of the transformations the shell makes).

      • Specifically, it doesn't know or care about the special meaning of the redirection part:

        >>> import shlex
        >>> awk_command = """awk -v OFS="\\t" '{printf "chr%s\\t%s\\t%s\\n", $1, $2-1, $2}' file1.txt > file2.txt"""
        >>> shlex.split(awk_command)
        ['awk', '-v', 'OFS=\\t', '{printf "chr%s\\t%s\\t%s\\n", $1, $2-1, $2}', 'file1.txt', '>', 'file2.txt']
        

    So, if you wish to use shell=False, do construct the argument list yourself.
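On the first point above (unescaped Python special characters), a small check makes the difference visible. The variable names are only illustrative:

```python
# An ordinary literal: Python turns \t into one tab character before awk
# ever sees the string.
plain = "OFS=\t"
# Doubling the backslash, or using a raw string, keeps the two characters
# backslash and t, which awk then interprets itself.
escaped = "OFS=\\t"
raw = r"OFS=\t"

print(len(plain))      # 5: 'O', 'F', 'S', '=', and one tab
print(len(escaped))    # 6: 'O', 'F', 'S', '=', '\', 't'
print(escaped == raw)  # True: both spellings give the same string
```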

  • 2020-12-07 00:42

    > is the shell redirection operator. To implement it in Python, use the stdout parameter:

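    #!/usr/bin/env python
    import shlex
    import subprocess
    
    # Raw string: \t and \n reach awk as backslash sequences, not as tab/newline.
    cmd = r"""awk -v OFS="\t" '{printf "chr%s\t%s\t%s\n", $1, $2-1, $2}'"""
    # Open the target file ourselves and pass it as stdout instead of using ">".
    with open('file2.txt', 'wb', 0) as output_file:
        subprocess.check_call(shlex.split(cmd) + ["file1.txt"], stdout=output_file)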
    

    To avoid starting a separate process, you could implement this particular awk command in pure Python.
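A pure-Python version might look roughly like the sketch below. The function name convert is hypothetical, and it makes two assumptions that the awk one-liner does not: the second column is an integer, and lines with fewer than two whitespace-separated fields are skipped rather than printed with empty values.

```python
def convert(src_path, dst_path):
    """Write 'chr<col1> TAB <col2 - 1> TAB <col2>' for each input line."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            # Like awk's default field splitting: split on runs of whitespace.
            fields = line.split()
            if len(fields) < 2:
                continue  # assumption: ignore blank or short lines
            name, position = fields[0], fields[1]
            # Assumption: the second column holds an integer.
            dst.write("chr%s\t%d\t%s\n" % (name, int(position) - 1, position))
```

convert("file1.txt", "file2.txt") would then stand in for the whole awk command, including the redirection.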
