use generator as subprocess input; got “I/O operation on closed file” exception

Submitted by 。_饼干妹妹 on 2020-01-05 02:52:27

Question


I have a large file that needs to be processed before being fed to another command. I could save the processed data as a temporary file, but I would like to avoid that. I wrote a generator that processes one line at a time, and then the following script to feed its output to the external command as input. However, I get an "I/O operation on closed file" exception on the second iteration of the loop:

cmd = ['intersectBed', '-a', 'stdin', '-b', bedfile]
p = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
for entry in my_entry_generator: # <- this is my generator
    output = p.communicate(input='\t'.join(entry) + '\n')[0]
    print output

I read a similar question that uses p.stdin.write, but I still had the same problem.

What did I do wrong?

[edit] I replaced the last two statements with the following (thanks SpliFF):

    output = p.communicate(input='\t'.join(entry) + '\n')
    if output[1]: print "error:", output[1]
    else: print output[0]

to see whether the external program reported any error. It did not; I still get the same exception at the p.communicate line.


Answer 1:


The communicate method of subprocess.Popen objects can only be called once. It sends the input you give it to the process while reading all of the stdout and stderr output. And by "all", I mean it waits for the process to exit so that it knows it has all the output. Once communicate returns, the pipes are closed and the process no longer exists, which is why a second call in the loop fails with "I/O operation on closed file".
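A small sketch of the behavior described above, written for Python 3 rather than the Python 2 of the question. The `cmd` here is a hypothetical stand-in that just copies stdin to stdout, since intersectBed may not be installed; the variable names are illustrative only.

```python
import subprocess
import sys

# Hypothetical stand-in for the external command: copies stdin to stdout.
cmd = [sys.executable, '-c', 'import sys; sys.stdout.write(sys.stdin.read())']
p = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                     stderr=subprocess.PIPE, universal_newlines=True)
out, err = p.communicate(input='first\n')

stdin_closed = p.stdin.closed        # communicate() closed the pipe
exited = p.returncode is not None    # and waited for the child to exit

try:
    p.stdin.write('second\n')        # any further write hits the closed pipe
    write_error = None
except ValueError as exc:
    write_error = exc

print(stdin_closed, exited, write_error)
```

After the single communicate call, the child's stdin is already closed and its return code is set, so there is nothing left to send a second entry to.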

If you want to use communicate, you have to either restart the process in the loop, or give it a single string that contains all the input from the generator. If you want to do streaming communication, sending data bit by bit, then you cannot use communicate. Instead, you would need to write to p.stdin while reading from p.stdout and p.stderr. Doing this is tricky, because you can't tell which output is caused by which input, and because you can easily run into deadlocks. There are third-party libraries that can help you with this, like Twisted.
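The "single string" option could look roughly like the following sketch (Python 3). The helper name `run_once` and the echo command standing in for intersectBed are assumptions made for illustration, not part of the original question.

```python
import subprocess
import sys

def run_once(cmd, entries):
    """Join every generated entry into one string and call communicate() once."""
    data = ''.join('\t'.join(entry) + '\n' for entry in entries)
    p = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE, universal_newlines=True)
    out, err = p.communicate(input=data)
    return out, err, p.returncode

# Hypothetical stand-in for intersectBed: echoes stdin to stdout.
echo_cmd = [sys.executable, '-c', 'import sys; sys.stdout.write(sys.stdin.read())']
entries = (row for row in [('chr1', '10', '20'), ('chr2', '30', '40')])
out, err, code = run_once(echo_cmd, entries)
print(out, end='')
```

Note that this builds the whole input in memory before the process sees any of it, so it avoids the temporary file but not the memory cost of a large input.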

If you want to do this interactively, sending some data and then waiting for and processing the result before sending more data, things get even harder. You should probably use a third-party library like pexpect for that.

Of course, if you can get away with just starting the process inside the loop, that would be a lot easier:

cmd = ['intersectBed', '-a', 'stdin', '-b', bedfile]
for entry in my_entry_generator:
    p = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    output = p.communicate(input='\t'.join(entry) + '\n')[0]
    print output
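If starting one process per entry is too slow and the input is too large to join into one string, the streaming route mentioned earlier (writing to p.stdin while reading p.stdout) can be sketched with a background writer thread. This is only a sketch in Python 3 under a few assumptions: `feed` and `stream` are hypothetical helper names, the echo command stands in for intersectBed, and stderr is left unpiped so that it cannot fill up and block the child.

```python
import subprocess
import sys
import threading

def feed(pipe, entries):
    """Write each entry to the child's stdin, then close it to signal EOF."""
    try:
        for entry in entries:
            pipe.write('\t'.join(entry) + '\n')
        pipe.close()
    except BrokenPipeError:
        pass  # the child exited before reading all of the input

def stream(cmd, entries):
    """Yield the child's output lines while a background thread feeds its stdin."""
    p = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                         universal_newlines=True)
    writer = threading.Thread(target=feed, args=(p.stdin, entries))
    writer.start()
    for line in p.stdout:      # reading here keeps the stdout pipe drained
        yield line
    writer.join()
    p.stdout.close()
    p.wait()

# Hypothetical stand-in for intersectBed: echoes stdin line by line.
echo_cmd = [sys.executable, '-c',
            'import sys\nfor l in sys.stdin: sys.stdout.write(l)']
entries = (('chr%d' % i, str(i), str(i + 1)) for i in range(1, 4))
lines = list(stream(echo_cmd, entries))
print(lines)
```

Writing and reading on separate threads is what avoids the deadlock: the child can keep emitting output while the parent is still blocked on a write. As the answer notes, there is still no way to tell which output line was caused by which input line.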



Answer 2:


Your intersectBed application is probably exiting with an error, but since you aren't printing any stderr data, you can't see it. Try:

result = p.communicate(input='\t'.join(entry) + '\n')
if result[1]:
  print "error:", result[1]
else:
  print result[0]


Source: https://stackoverflow.com/questions/9880552/use-generator-as-subprocess-input-got-i-o-operation-on-closed-file-exception
