I have large hash tables that I am writing to disk as an occasional backup. I am finding that as I map the hash tables and write to a file, the RAM usage skyrockets compared to …
Try loop to loop over the hash tables, something like:
(loop for k1 being the hash-keys of (customer-var1 cust-data)
        using (hash-value v1)
      do (format s "~A ~A~%" k1 v1))
Or if you don't need the values:
(loop for k being the hash-keys of (customer-var2 cust-data)
      do (format <whatever you need...>))
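Putting that together, here is a minimal sketch of a backup function. The names save-customers and the one-pair-per-line output format are assumptions of mine; customer-var1 and cust-data are the (hypothetical) accessors from your question:

;; Sketch only: CUST-DATA and CUSTOMER-VAR1 are assumed to be a
;; structure and an accessor returning a hash table.
(defun save-customers (cust-data file)
  (with-open-file (s file :direction :output :if-exists :supersede)
    (loop for k1 being the hash-keys of (customer-var1 cust-data)
            using (hash-value v1)
          do (format s "~A ~A~%" k1 v1))))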
Originally I thought maphash would collect values, but it does not, as @tfb pointed out. Beyond that I don't know.
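For the record, maphash calls its function purely for side effects and always returns nil, so the equivalent of the loop above would be (again assuming the hypothetical customer-var1 accessor):

;; MAPHASH returns NIL; the function is called only for its side
;; effects, here writing each key/value pair to the stream S.
(maphash (lambda (k1 v1) (format s "~A ~A~%" k1 v1))
         (customer-var1 cust-data))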
This isn't a complete answer. I think that whatever is causing the leakage is SBCL-specific, so probably your best bet is to find out where the SBCL people hang out (assuming it's not here) and ask them.
However, one thing you could do is instrument the GC to see whether you can work out what's going on. You can do this by, for instance:
(defun dribble-gc-info ()
  (format *debug-io* "~&GC: ~D bytes consed~%"
          (sb-ext:get-bytes-consed)))

(defun hook-gc (&optional (log-file nil))
  (pushnew 'dribble-gc-info sb-ext:*after-gc-hooks*)
  (when log-file
    (setf (sb-ext:gc-logfile) log-file)))

(defun unhook-gc ()
  (setf sb-ext:*after-gc-hooks*
        (delete 'dribble-gc-info sb-ext:*after-gc-hooks*))
  (if (sb-ext:gc-logfile)
      (prog1 (sb-ext:gc-logfile)
        (setf (sb-ext:gc-logfile) nil))
      nil))
Then (hook-gc "/tmp/x.out") will cause it both to tell you when GCs run and how much memory has been consumed in total, and to write copious information to /tmp/x.out. It may be that this would at least give you a start in working out what's happening.
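Typical use would then look like this; the middle step is a stand-in for whatever code you want to observe:

(hook-gc "/tmp/x.out")   ; print after each GC, log details to /tmp/x.out
;; ... run the backup code whose consing you want to watch ...
(unhook-gc)              ; remove the hook and stop logging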
Another thing which just conceivably might help would be to insert occasional calls to force-output on the stream you're writing to: it's possible (but I think unlikely) that some weird buffering is going on which is causing SBCL to make bad decisions about how big the Lisp-side buffer for the file should be.
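For example, something like this, where table and s are stand-ins for your hash table and output stream, and the flush interval of 10000 is arbitrary:

;; Sketch: flush the stream every 10000 entries while writing.
(loop for k being the hash-keys of table using (hash-value v)
      for i from 1
      do (format s "~A ~A~%" k v)
      when (zerop (mod i 10000))
        do (force-output s))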