Python multiprocessing.Pool map() “TypeError: string indices must be integers, not str”

Submitted by 和自甴很熟 on 2020-01-14 09:23:07

Question


I am attempting to use multiprocessing.Pool to do parallel processing on a list of dictionaries. An example is below.

(Please note: this is a toy example. My actual code will be doing CPU-intensive processing on the values in the actual dictionary.)

import multiprocessing

my_list = [{'letter': 'a'}, {'letter': 'b'}, {'letter': 'c'}]

def process_list(list_elements):
    ret_list = []
    for my_dict in list_elements:
        ret_list.append(my_dict['letter'])
    return ret_list

if __name__ == "__main__":
    pool = multiprocessing.Pool()
    letters = pool.map(process_list, my_list)
    print letters

If I run the code above, I get the following error:

Traceback (most recent call last):
  File "multiprocess_fail.py", line 13, in <module>
    letters = pool.map(process_list, my_list)
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 250, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 554, in get
    raise self._value
TypeError: string indices must be integers, not str

I don't know what string indices it is referring to. Shouldn't pool.map just be iterating over the items in my_list (i.e. the dictionaries)? Do I have to alter how the data is being passed to the map function to get it to run?


Answer 1:


pool.map() takes a callable and an iterable, then applies the callable to each element of the iterable. It divides the work across the pool workers in chunks, but the function is only ever passed one element at a time.
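To see what a single call receives, here is a minimal sketch. It uses the built-in map() as a stand-in, on the assumption that only the calling contract matters here (one element per call, which pool.map() shares), so no worker processes are needed. The helper name describe is made up for illustration.

```python
my_list = [{'letter': 'a'}, {'letter': 'b'}, {'letter': 'c'}]

def describe(element):
    # Hypothetical helper: report the type of whatever one call receives.
    return type(element).__name__

# The built-in map() calls describe() once per element of my_list,
# the same per-element contract that pool.map() follows.
seen = list(map(describe, my_list))
print(seen)  # ['dict', 'dict', 'dict']
```

Each call gets one dictionary, never the whole list.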

You passed in a list of dictionaries, which means that each process_list() is passed one dictionary:

process_list({'letter': 'a'})
process_list({'letter': 'b'})
# etc.

Your code, however, treats list_elements as a list. The for loop:

for my_dict in list_elements:

instead iterates over the dictionary's keys, so my_dict is bound to one key at a time. For your dictionaries, that means there is a single iteration per call, and my_dict is set to the string 'letter' each time. The line:

my_dict['letter']

then tries to index into that string, and 'letter'['letter'] raises the exception you saw.
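The failure can be reproduced without multiprocessing at all. The sketch below is only an illustration; the variable names are made up, and the exact wording of the error message varies between Python versions (Python 2.7 prints "string indices must be integers, not str").

```python
element = {'letter': 'a'}

# Iterating over a dict yields its keys, not nested dictionaries.
keys = [key for key in element]
print(keys)  # ['letter']

# Each key is a plain string, so indexing it with another string fails.
my_dict = keys[0]
try:
    my_dict['letter']
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```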

The following works:

def process_list(list_element):
    return list_element['letter']

The function returns one result per call; map() gathers all results into a new list and returns it once all workers are done.
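Putting it together, a complete script might look like the sketch below. The names process_item and run, and the choice of two worker processes, are assumptions made for this example rather than part of the original code; print() is used so the script runs on both Python 2.7 and Python 3.

```python
import multiprocessing

my_list = [{'letter': 'a'}, {'letter': 'b'}, {'letter': 'c'}]

def process_item(item):
    # Receives exactly one dictionary from my_list per call.
    return item['letter']

def run():
    # processes=2 is an arbitrary choice for this sketch;
    # Pool() with no argument uses one worker per CPU.
    pool = multiprocessing.Pool(processes=2)
    try:
        # map() blocks until every element is processed and
        # returns the results in the same order as the input.
        return pool.map(process_item, my_list)
    finally:
        pool.close()
        pool.join()

if __name__ == "__main__":
    letters = run()
    print(letters)  # ['a', 'b', 'c']
```

If the real work is better done on batches, one option is to split my_list into sub-lists first and pass the list of sub-lists to map(), so that each call does receive a list.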



Source: https://stackoverflow.com/questions/22411424/python-multiprocessing-pool-map-typeerror-string-indices-must-be-integers-n
