Question
I am deploying a Cloud Function with some intensive computing, using the following requirements:
requirements.txt
google-cloud-storage
google-cloud-datastore
numpy==1.16.2
pandas==0.24.2
scikit-image==0.16.1
psutil
memory-profiler==0.55.0
scikit-learn==0.20.3
opencv-python==4.0.0.21
I have set the following arguments for deployment:
[--memory: "2147483648", --runtime: "python37", --timeout: "540", --trigger-http: "True", --verbosity: "debug"]
As the function iterates over frames, the usage increases, but when it reaches 18% - 21% it stops with:
Error: memory limit exceeded. Function invocation was interrupted.
Using psutil to trace the code, at the beginning of the function call I have this output (from the function's logs):
svmem(total=2147483648, available=1882365952, percent=12.3, used=152969216, free=1882365952, active=221151232, inactive=43954176, buffers=0, cached=112148480, shared=24240128, slab=0)
This should mean, as I understand it, that only 12.3% is in use at the beginning. That makes sense: the code package itself (containing some binaries) plus the raw video chunks together use about 100 MB, and I assume the packages installed from the requirements above may use an extra 160 MB.
After about 15 iterations, this is the trace of psutil:
svmem(total=2147483648, available=1684045824, percent=21.6, used=351272960, free=1684045824, active=419463168, inactive=43962368, buffers=0, cached=112164864, shared=24240128, slab=0)
Then the function is aborted.
This is the function where the code stops:
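For context, a rough back-of-the-envelope estimate (my sketch, not part of the original post) of what each retained frame costs in memory, assuming uint8 frames at the resolutions used in the code below:

```python
import numpy as np

# Hypothetical frames matching the resolutions in the question's code.
bgr_1080p = np.zeros((1080, 1920, 3), dtype=np.uint8)   # as returned by capture.read()
luma_hd = np.zeros((1080, 1920), dtype=np.uint8)        # after cvtColor(...)[:, :, 2]
luma_small = np.zeros((270, 480), dtype=np.uint8)       # after resize to 480x270

print(bgr_1080p.nbytes)   # 6220800 bytes, ~5.9 MiB per decoded color frame
print(luma_hd.nbytes)     # 2073600 bytes, ~2.0 MiB per HD luminance frame
print(luma_small.nbytes)  # 129600 bytes, ~0.12 MiB per downscaled frame
```

So each sampled frame keeps roughly 2 MB resident in frame_list_hd, before any per-thread working copies are counted.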
def capture_to_array(self, capture):
    """
    Convert an OpenCV video capture to lists of numpy
    arrays for faster processing and analysis.
    """
    # Lists of numpy arrays
    frame_list = []
    frame_list_hd = []
    i = 0
    pixels = 0
    # Iterate through each frame in the video
    while capture.isOpened():
        # Read the next frame from the capture
        ret_frame, frame = capture.read()
        # If the read succeeded, append the retrieved numpy array to a Python list
        if ret_frame:
            i += 1
            # Count the number of pixels. Note: OpenCV's shape is (height, width, channels)
            height = frame.shape[0]
            width = frame.shape[1]
            pixels += height * width
            # Add the frame to the list if it belongs to the random sampling list
            if i in self.random_sampler:
                # Change color space to keep only the luminance channel
                frame = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)[:, :, 2]
                # Resize the frame if it is not already 1920 wide
                if frame.shape[1] != 1920:
                    frame_hd = cv2.resize(frame, (1920, 1080), interpolation=cv2.INTER_LINEAR)
                else:
                    frame_hd = frame
                frame_list_hd.append(frame_hd)
                frame = cv2.resize(frame, (480, 270), interpolation=cv2.INTER_LINEAR)
                frame_list.append(frame)
                print('Frame size: {}, HD frame size: {}'.format(sys.getsizeof(frame), sys.getsizeof(frame_hd)), i)
                print('Frame list size: {}, HD size: {}'.format(sys.getsizeof(frame_list), sys.getsizeof(frame_list_hd)), i)
                print(psutil.virtual_memory())
        # Break the loop when no more frames can be read from the original
        else:
            break
    # Clean up
    capture.release()
    return np.array(frame_list), np.array(frame_list_hd), pixels, height, width
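One caveat about the traces above (my note, not from the original post): sys.getsizeof on a Python list only measures the list object itself, i.e. its array of pointers, not the numpy buffers those pointers reference, so the printed "frame list size" badly understates real usage. A minimal illustration:

```python
import sys
import numpy as np

# 15 HD luminance frames, as in the question after ~15 iterations
frames = [np.zeros((1080, 1920), dtype=np.uint8) for _ in range(15)]

list_size = sys.getsizeof(frames)          # only the list's pointer storage
data_size = sum(f.nbytes for f in frames)  # the actual pixel buffers

print(list_size)   # a few hundred bytes
print(data_size)   # 31104000 bytes, ~30 MiB
```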
Answer 1:
Ok, got it solved. After this function, the frame lists it creates are used within the following function:
def compute(self, frame_list, frame_list_hd, path, dimensions, pixels):
    """
    Compare lists of numpy arrays, extracting their corresponding metrics.
    It basically takes the global original list of frames and the input
    frame_list of numpy arrays to extract the metrics defined in the
    constructor. frame_pos establishes the index of the frames to be
    compared. It is parallelized by means of the ThreadPoolExecutor of
    Python's concurrent.futures package for better performance.
    """
    # Dictionary of metrics
    rendition_metrics = {}
    # Position of the frame
    frame_pos = 0
    # Indices of the frames to be processed, passed to the ThreadPoolExecutor.
    frames_to_process = range(len(frame_list) - 1)
    print('computing')
    # Execute computations in parallel with a bounded number of workers.
    # future_list is a dictionary mapping each future to its frame index.
    with ThreadPoolExecutor(max_workers=3) as executor:
        # Compare the original asset against its renditions
        future_list = {executor.submit(self.compare_renditions_instant,
                                       i,
                                       frame_list,
                                       frame_list_hd,
                                       dimensions,
                                       pixels,
                                       path): i for i in frames_to_process}
        # Once all frames in frame_list have been submitted, retrieve their values
        for future in future_list:
            # Values are retrieved as a dict, as a result of the executor's process
            result_rendition_metrics, frame_pos = future.result()
            # Store the computed values for the given frame
            rendition_metrics[frame_pos] = result_rendition_metrics
    # Return the metrics for the currently processed rendition
    return rendition_metrics
The problem was that ThreadPoolExecutor() was originally called with no arguments, so it used Python 3.7's default number of workers: five times the number of available CPUs, which is 2 here, i.e. ten workers. That kept too many frames in flight at once, saturating the instance's memory. And because each thread was emitting its own psutil snapshot, I was being misled by my own traces. Capping the pool with max_workers=3 solved it.
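To make the fix concrete, here is a sketch (not from the original answer) of choosing the worker count from a memory budget instead of hard-coding it; memory_capped_workers and its per-task estimate are hypothetical names. On Python 3.5-3.7, ThreadPoolExecutor() defaults to 5 × os.cpu_count() workers (3.8 changed this to min(32, os.cpu_count() + 4)), which is why the unbounded pool ran ten workers on a 2-CPU instance:

```python
from concurrent.futures import ThreadPoolExecutor

def memory_capped_workers(per_task_bytes, budget_bytes, hard_cap=3):
    """Pick a worker count that keeps concurrent per-task buffers within budget.

    per_task_bytes: estimated peak memory of one task (hypothetical estimate)
    budget_bytes:   memory you are willing to spend on concurrent tasks
    hard_cap:       upper bound regardless of budget (3 here, as in the fix)
    """
    by_memory = max(1, budget_bytes // per_task_bytes)
    return min(hard_cap, by_memory)

# Example: ~100 MB peak per task, ~250 MB budget -> 2 workers
workers = memory_capped_workers(100 * 1024**2, 250 * 1024**2)
print(workers)  # 2

with ThreadPoolExecutor(max_workers=workers) as executor:
    pass  # submit tasks here as in compute() above
```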
Source: https://stackoverflow.com/questions/58819497/error-memory-limit-exceeded-when-17-21-is-used-according-to-psutils