How do I prevent the GC from provoking copy-on-write, when I fork my process ? I have recently been analyzing the garbage collector\'s behavior in Ruby, due to some memory
Suddenly figured out why all the memory is going private if you format the string -- you generate garbage during formatting, having GC disabled, then enable GC, and you've got holes of released objects in your generated data. Then you fork, and new garbage starts to occupy these holes, the more garbage - more private pages.
So i added a cleanup function to run GC each 2000 cycles (just enabling lazy GC didn't help):
count.times do |i|
cleanup(i)
result << "%20.18f" % rand
end
#......snip........#
def cleanup(i)
if ((i%2000).zero?)
GC.enable; GC.start; GC.disable
end
end
##### main #####
Which resulted in(with generating memory_object( 1000 * 1000 * 10)
after fork):
RUBY_GC_HEAP_INIT_SLOTS=600000 ruby gc-test.rb 0
ruby version 2.2.0
proces pid log priv_dirty shared_dirty
Parent 2501 post alloc 35 0
Parent 2501 4 fork 0 35
Child 2503 4 initial 0 35
Child 2503 8 empty GC 28 22
Yes, it affects performance, but only before forking, i.e. increase load time in your case.
Just found criteria by which ruby 2.2 sets old object bits, it's 3 GC's, so if you add following before forking:
GC.enable; 3.times {GC.start}; GC.disable
# start the forking
you will get(the option is 1
in command line):
$ RUBY_GC_HEAP_INIT_SLOTS=600000 ruby gc-test.rb 1
ruby version 2.2.0
proces pid log priv_dirty shared_dirty
Parent 2368 post alloc 31 0
Parent 2368 4 fork 1 34
Child 2370 4 initial 1 34
Child 2370 8 empty GC 2 32
But this needs to be further tested concerning the behavior of such objects on future GC's, at least after 100 GC's :old_objects
remains constant, so i suppose it should be OK
Log with GC.stat
is here
By the way there's also option RGENGC_OLD_NEWOBJ_CHECK to create old objects from the beginning, but i doubt it's a good idea, but may be useful for a particular case.
My proposition in the comment above was wrong, actually bitmap tables are the savior.
(option = 1)
ruby version 2.0.0
proces pid log priv_dirty shared_dirty
Parent 14807 post alloc 27 0
Parent 14807 4 fork 0 27
Child 14809 4 initial 0 27
Child 14809 8 empty GC 6 25 # << almost everything stays shared <<
Also had by hand and tested Ruby Enterprise Edition it's only half better than worst cases.
ruby version 1.8.7
proces pid log priv_dirty shared_dirty
Parent 15064 post alloc 86 0
Parent 15064 4 fork 2 84
Child 15065 4 initial 2 84
Child 15065 8 empty GC 40 46
(I made the script run strictly 1 GC, by increasing RUBY_GC_HEAP_INIT_SLOTS
to 600k)