How can I process a tarfile with a Python multiprocessing pool?

半城伤御伤魂 提交于 2019-12-03 00:28:00

You're not passing a TarInfo object into the other process, you're passing the result of tar.extractfile(member) into the other process where member is a TarInfo object. The extractfile(...) method returns a file-like object which has, among other things, a read() method which operates upon the original tar file you opened with tar = tarfile.open('test.tar').

However, you can't use an open file from one process in another process, you have to re-open the file. I replaced your test_multiproc() with this:

def test_multiproc():
    tar   = tarfile.open('test.tar')
    files = [name for name in tar.getnames()]
    pool  = Pool(processes=1)
    result = pool.map(read_file2, files)
    tar.close()

And added this:

def read_file2(name):
    t2 = tarfile.open('test.tar')
    print t2.extractfile(name).read()
    t2.close()

and was able to get your code working.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!