Multiprocessing an array in chunks

半城伤御伤魂 提交于 2021-01-28 21:53:20

问题


I have broken my standard deviation function into small chunks of std_1, std_2, std_3 etc. to optimize my code to make it run faster. Since I have over 2 million arrays on my main numpy array PC_list. I have used the numba, numpy arrays and multi processing to make the code run faster however I do not see any performance difference in the way even doe the code is broken into pieces from the main function. It takes about 57 seconds for the main function to process and the divided function to compute which makes me believe that the multi processing does not work. Although it takes about 12 to 13 seconds to go through each divided function std1, std2, std3, std4, std5 std6. The PC_list array posted below is a short bit of the actual array. I have also tried running the main function solely on numpy but it hasn't worked and has given me a memory issue, link to previouse issue prev issue.

PC_list array:

number = 5
PC_list= np.array([457.334015,424.440002,394.795990,408.903992,398.821014,402.152008,435.790985,423.204987,411.574005,
404.424988,399.519989,377.181000,375.467010,386.944000,383.614990,375.071991,359.511993,328.865997,
320.510010,330.079010,336.187012,352.940002,365.026001,361.562012,362.299011,378.549011,390.414001,
400.869995,394.773010,382.556000])

Main function(entire list length used) takes 58 seconds to run:

@njit
def std():
    std1 = np.array([PC_list[i:i+number].std() for i in range(0, (len(PC_list)-number))])
    return(std1)

Time it takes for each chunk to run:

multi processing functions(list split into chunks) takes 58 seconds to run

divide = (len(PC_list)-number)/6
@njit
def std_1():
    print("std1")
    std1 = np.array([PC_list[i:i+number].std() for i in range(0, round(divide))])
    return(std1)
@njit
def std_2():
    print("std2")
    std2 = np.array([PC_list[i:i+number].std() for i in range(round(divide), round(divide*2))])
    return(std2)
@njit
def std_3():
    print("std3")
    std3 = np.array([PC_list[i:i+number].std() for i in range(round(divide*2), round(divide*3))])
    return(std3)
@njit
def std_4():
    print("std4")
    std4 = np.array([PC_list[i:i+number].std() for i in range(round(divide*3), round(divide*4))])
    return(std4)
@njit
def std_5():
    print("std5")
    std5 = np.array([PC_list[i:i+number].std() for i in range(round(divide*4), round(divide*5))])
    return(std5)
@njit
def std_6():
    print("std6")
    std6 = np.array([PC_list[i:i+number].std() for i in range(round(divide*5), round(divide*6))])
    return(std6) 
#setting up the multi processing vars     
p1 = multiprocessing.Process(target=std_1)
p2 = multiprocessing.Process(target=std_2)
p3 = multiprocessing.Process(target=std_3)
p4 = multiprocessing.Process(target=std_4)
p5 = multiprocessing.Process(target=std_5)
p6 = multiprocessing.Process(target=std_6)
#running the multi processes 
if __name__ == "__main__":
    p1.start()
    p2.start()
    p3.start()
    p4.start()
    p5.start()
    p6.start()
std1 = std_1()
std2 = std_2()
std3 = std_3()
std4 = std_4()
std5 = std_5()
std6 = std_6()
#Combinning all the standard deviation chunks together
std = np.concatenate((std1, std2, std3, std4, std5, std6))

来源:https://stackoverflow.com/questions/65854842/multiprocessing-an-array-in-chunks

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!