Question
Scenario:
I am running B* instances on App Engine. I have a background ETL task (written in Python) scheduled as a cron job on App Engine. At the scheduled time, cron issues an HTTP request to start the task, and the request does not return a response until the task completes. While the task was executing, it typically consumed "X" MB of RAM. After the task finished and returned 200 OK, App Engine instance monitoring still shows "X" MB of RAM in use.
Please help me understand the following:
- If an instance is running only one task, when will the memory consumed by that task be freed after it completes?
- Do I need to run gc.collect() to call the garbage collector explicitly to free up RAM?
- Is restarting the instance the only way to free up RAM?
PS: This is not at all related to NDB; my task takes input from BigQuery, performs some ETL operations, and then streams the result to BigQuery.
Answer 1:
From my observations with an app using lots of memory for StringIO operations:
- Explicitly calling gc.collect() didn't noticeably help. (I even suspected for a while that I actually had memory leaks, but that wasn't the case.)
- The memory is not freed after each and every request, but if the instance remains alive long enough without running out of memory, it does eventually appear to be freed now and then. This is easy to test: just increase the time between requests to reduce the rate at which free memory is drained. But I couldn't figure out a usable pattern. Note that I observed this only after upgrading to B2 instances; my B1 instances were running out of memory too fast, and I never noticed a freeing event with them.
- Using an instance class with more memory (which I tried as a workaround for my instances eventually running out of memory) helped: the memory appeared to be freed more often. It might be because these instances also have a faster CPU (but that's just guesswork).
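To check whether gc.collect() has anything to reclaim in a given task, a local experiment along these lines may help. This is only a sketch: it assumes a Python 3 runtime (tracemalloc is not available in Python 2.7), and Node, run_task and measure are hypothetical stand-ins, not part of any App Engine API. It measures Python-level allocations only, which is not the same figure the App Engine dashboard reports; even when the collector frees objects, the process may keep the memory instead of returning it to the operating system.

```python
import gc
import tracemalloc


class Node:
    """Hypothetical object that takes part in a reference cycle."""

    def __init__(self, payload):
        self.payload = payload
        self.partner = None


def run_task(n_pairs=1000):
    """Stand-in for the real task: leaves cyclic garbage behind."""
    for _ in range(n_pairs):
        a, b = Node(bytearray(1024)), Node(bytearray(1024))
        # Each node points at the other, so reference counting alone
        # cannot free the pair once the local names are rebound.
        a.partner, b.partner = b, a


def measure():
    """Return (bytes traced after task, bytes traced after collect, objects found)."""
    gc.disable()  # keep automatic collection out of the way for the demo
    tracemalloc.start()
    try:
        run_task()
        after_task, _peak = tracemalloc.get_traced_memory()
        unreachable = gc.collect()  # number of unreachable objects found
        after_collect, _peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
        gc.enable()
    return after_task, after_collect, unreachable


if __name__ == "__main__":
    print(measure())
```

If the traced size barely changes across gc.collect() for a real task, the memory is probably held by live references (caches, module-level lists, and so on) rather than by uncollected cycles, which would be consistent with gc.collect() not helping in my case.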
Answer 2:
There are a few questions on Stack Overflow describing similar memory issues for tasks when using ndb on App Engine. Here is one example.
The issue is that App Engine doesn't clear the ndb context cache when a task finishes, so the context cache continues to hog memory long after the task completes.
The solution is to either not use the context cache or clear it during your tasks. Here are a few ways:
- Bypass caching with key.get(use_cache=False).
- Call ndb.get_context().clear_cache() at appropriate times.
- Disable caching for all entities of a kind by adding _use_cache = False to your model definition.
Source: https://stackoverflow.com/questions/45026920/when-will-memory-get-freed-after-completing-the-request-on-app-engine-backend-in