Question
I'm working on some kind of system service (actually it's just a log parser) written in Python. This program should run continuously for a long time (I mean days and weeks without failures or the need to restart), which is why I am concerned about memory consumption.
I put together information about process memory usage from various sites into one simple function:
#!/usr/bin/env python
from datetime import datetime
import os
import re
import resource

from guppy import hpy


def debug_memory_leak():
    # Virtual memory size, read from /proc/<pid>/status (VmSize, in kB)
    pid = os.getpid()
    with open(os.path.join("/proc", str(pid), "status")) as f:
        lines = f.readlines()
    _vmsize = [l for l in lines if l.startswith("VmSize")][0]
    vmsize = int(_vmsize.split()[1])

    # Physical memory: peak resident set size, in kB on Linux
    pmsize = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

    # Dynamic memory segment: total number of objects and heap size, via guppy
    h = hpy().heap()
    if __debug__:
        print str(h)
    m = re.match(
        "Partition of a set of ([0-9]+) objects. Total size = ([0-9]+) bytes(.*)",
        str(h))
    objects = m.group(1)
    heap = int(m.group(2)) / 1024  # to kB

    current_time = datetime.now().strftime("%H:%M:%S")
    data = (current_time, objects, heap, pmsize, vmsize)
    print("\t".join([str(d) for d in data]))
This function has been used to study the memory consumption dynamics of my long-running process, and I still cannot explain its behavior. You can see that the heap size and the total number of objects did not change, while the physical and virtual memory grew by 11% and 1% respectively during these twenty minutes.
UPD: The process has now been running for almost 15 hours. The heap is still the same, but the physical memory has increased sixfold and the virtual memory by about 50%. The curve seems to be linear, except for the strange outliers at 3:00 AM:
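For reference, the sampling can be driven by a loop like the one below. This is only a minimal sketch; the 10-second interval is inferred from the timestamps in the table that follows, and do_some_parsing_work() is a hypothetical placeholder for the actual log-parsing step, not code from the original question.

#!/usr/bin/env python
import time

while True:
    do_some_parsing_work()   # hypothetical placeholder for the parser's real work
    debug_memory_leak()      # prints one tab-separated row of memory statistics
    time.sleep(10)           # sampling interval assumed from the table below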
Time      Obj    Heap (kB)  PhM (kB)  VM (kB)
19:04:19 31424 3928 5460 143732
19:04:29 30582 3704 10276 158240
19:04:39 30582 3704 10372 157772
19:04:50 30582 3709 10372 157772
19:05:00 30582 3704 10372 157772
(...)
19:25:00 30583 3704 11524 159900
09:53:23 30581 3704 62380 210756
I wonder what is going on with the address space of my process. The constant heap size suggests that all dynamically allocated objects are being deallocated correctly. But I have no doubt that the growing memory consumption will affect the sustainability of this life-critical process in the long run.
Could anyone clarify this issue please? Thank you.
(I use RHEL 6.4, kernel 2.6.32-358 with Python 2.6.6)
Answer 1:
Without knowing what your program is doing, this might help.
I came across this article while working on a project a while back: http://chase-seibert.github.io/blog/2013/08/03/diagnosing-memory-leaks-python.html It says, "Long running Python jobs that consume a lot of memory while running may not return that memory to the operating system until the process actually terminates, even if everything is garbage collected properly."
I ended up using the multiprocessing module to have my project fork a separate process whenever it needed to do the heavy work and return the result afterwards, and I haven't noticed any memory issues since; a rough sketch of that pattern is below.
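A minimal sketch of that fork-per-task pattern, compatible with Python 2.6 (function names and the Queue-based result passing are illustrative, not the answerer's actual code): the memory-hungry work runs in a child process, so whatever it allocates is returned to the OS when the child exits.

from multiprocessing import Process, Queue

def parse_chunk(path, q):
    # All the memory-hungry parsing happens inside the child process.
    with open(path) as f:
        q.put(len(f.readlines()))   # placeholder for the real parsing result

def run_in_child(path):
    q = Queue()
    p = Process(target=parse_chunk, args=(path, q))
    p.start()
    result = q.get()   # fetch the result before joining to avoid blocking
    p.join()           # child exits here, releasing its memory to the OS
    return result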
That, or try it on Python 3.3: http://bugs.python.org/issue11849
Source: https://stackoverflow.com/questions/23369937/python-memory-consumption-on-linux-physical-and-virtual-memory-are-growing-whil