Question
Here's an MWE of a much larger code I'm using. It performs a Monte Carlo integration over a KDE (kernel density estimate) for all values located below a certain threshold (the integration method was suggested over at this question: Integrate 2D kernel density estimate). It does this iteratively for each point in a list and returns a list made of these results.
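To make the estimated quantity explicit, here is a sketch in notation that does not appear in the code itself: $\hat f$ stands for the KDE (`kernel`), $p$ for the current point, and $x_1, \dots, x_N$ for the $N = 1000$ draws returned by `kernel.resample`. The integral for each point is approximated by the fraction of samples whose density falls below the density at $p$ (`iso`):

```latex
I(p) \;=\; \int_{\{x \,:\, \hat f(x) < \hat f(p)\}} \hat f(x)\, dx
\;\approx\; \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}\!\left[\hat f(x_i) < \hat f(p)\right],
\qquad x_i \sim \hat f .
```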
import numpy as np
from scipy import stats
from multiprocessing import Pool
import threading

# Define KDE integration function.
def kde_integration(m_list):

    # Put some of the values from the m_list into two new lists.
    m1, m2 = [], []
    for item in m_list:
        # x data.
        m1.append(item[0])
        # y data.
        m2.append(item[1])

    # Define limits.
    xmin, xmax = min(m1), max(m1)
    ymin, ymax = min(m2), max(m2)

    # Perform a kernel density estimate on the data:
    x, y = np.mgrid[xmin:xmax:100j, ymin:ymax:100j]
    values = np.vstack([m1, m2])
    kernel = stats.gaussian_kde(values)

    # This list will be returned at the end of this function.
    out_list = []

    # Iterate through all points in the list and calculate for each the integral
    # of the KDE for the domain of points located below the value of that point
    # in the KDE.
    for point in m_list:

        # Compute the point below which to integrate.
        iso = kernel((point[0], point[1]))

        # Sample KDE distribution.
        sample = kernel.resample(size=1000)

        # Choose number of cores and split input array.
        cores = 4
        torun = np.array_split(sample, cores, axis=1)

        # Print number of active threads.
        print threading.active_count()

        # Calculate.
        pool = Pool(processes=cores)
        results = pool.map(kernel, torun)

        # Reintegrate and calculate results.
        insample_mp = np.concatenate(results) < iso

        # Integrate for all values below iso.
        integral = insample_mp.sum() / float(insample_mp.shape[0])

        # Append integral value for this point to list that will return.
        out_list.append(integral)

    return out_list


# Generate some random two-dimensional data:
def measure(n):
    "Measurement model, return two coupled measurements."
    m1 = np.random.normal(size=n)
    m2 = np.random.normal(scale=0.5, size=n)
    return m1+m2, m1-m2

# Create list to pass to KDE integral function.
m_list = []
for i in range(100):
    m1, m2 = measure(5)
    m_list.append(m1.tolist())
    m_list.append(m2.tolist())

# Call KDE integration function.
print 'Integral result: ', kde_integration(m_list)
The multiprocessing in the code was suggested over at this question, Speed up sampling of kernel estimate, to speed up the code (which it does, by up to ~3.4x).
The code works fine until I try to pass to the KDE function a list of more than ~62-63 elements (i.e., I set a value over 63 in the line for i in range(100)). If I do that, I get the following error:
Traceback (most recent call last):
  File "~/gauss_kde_temp.py", line 78, in <module>
    print 'Integral result: ', kde_integration(m_list)
  File "~/gauss_kde_temp.py", line 48, in kde_integration
    pool = Pool(processes=cores)
  File "/usr/lib/python2.7/multiprocessing/__init__.py", line 232, in Pool
    return Pool(processes, initializer, initargs, maxtasksperchild)
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 144, in __init__
    self._worker_handler.start()
  File "/usr/lib/python2.7/threading.py", line 494, in start
    _start_new_thread(self.__bootstrap, ())
thread.error: can't start new thread
This usually happens (9 out of 10 times) around active thread 374. I'm way out of my league in terms of Python coding here and I have no clue how to fix this issue. Any help will be much appreciated.
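One possible reading of the traceback, offered as a hedged sketch rather than a confirmed diagnosis: the loop builds a new Pool on every iteration and never closes it, and in CPython each Pool keeps a few helper threads alive for as long as it exists. Creating the pool once, outside the loop, and closing it at the end would avoid that accumulation. The example below assumes Python 3 and the "fork" start method (as on Linux). The names square_chunk and run_with_single_pool are hypothetical, and square_chunk merely stands in for the kernel evaluation.

```python
import multiprocessing as mp
import threading


def square_chunk(chunk):
    # Hypothetical stand-in for the expensive per-chunk work (kernel(...) in the MWE).
    return [v * v for v in chunk]


def run_with_single_pool(chunks_per_iteration, cores=2):
    # Assumption: the "fork" start method is available (Linux/macOS).
    ctx = mp.get_context("fork")
    results = []
    thread_counts = []
    # Create the pool ONCE, before the loop, instead of once per iteration.
    pool = ctx.Pool(processes=cores)
    try:
        for chunks in chunks_per_iteration:
            results.append(pool.map(square_chunk, chunks))
            # Record the thread count to check that it does not grow.
            thread_counts.append(threading.active_count())
    finally:
        # Release the worker processes and the pool's helper threads.
        pool.close()
        pool.join()
    return results, thread_counts
```

In the MWE this would correspond to moving pool = Pool(processes=cores) above the for point in m_list: loop and calling pool.close() and pool.join() once the loop has finished.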
Add
I tried adding a while loop to prevent the code from using too many threads. What I did was replace the print threading.active_count() line with this bit of code:
# Print number of active threads.
# (This snippet needs "import time" at the top of the script.)
exit_loop = True
while exit_loop:
    if threading.active_count() < 300:
        exit_loop = False
    else:
        # Pause for 10 seconds.
        time.sleep(10.)
        print 'waiting: ', threading.active_count()
The code halted (i.e., got stuck inside the loop) when it reached 302 active threads. I waited for more than 10 minutes, but the code never exited the loop and the number of active threads never dropped from 302. Shouldn't the number of active threads diminish after a while?
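A small experiment can show why the count might never drop on its own. The sketch below assumes Python 3 and the "fork" start method, and thread_counts_around_pool is a hypothetical helper. It records threading.active_count() before a Pool is created, while it is open, and after pool.close() and pool.join(). If the extra threads belong to pools that are never closed, sleeping in a loop would not release them.

```python
import multiprocessing as mp
import threading


def thread_counts_around_pool(cores=2):
    # Assumption: the "fork" start method is available (Linux/macOS).
    ctx = mp.get_context("fork")
    before = threading.active_count()

    # Opening a pool starts worker processes plus helper threads in this process.
    pool = ctx.Pool(processes=cores)
    during = threading.active_count()

    # close() stops new work from being submitted; join() waits for the
    # workers and the pool's helper threads to finish.
    pool.close()
    pool.join()
    after = threading.active_count()

    return before, during, after
```

Calling before, during, after = thread_counts_around_pool() and printing the three values should show the count rising while the pool is open and returning to its starting value only after the pool has been closed and joined.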
Source: https://stackoverflow.com/questions/18602236/thread-error-cant-start-new-thread