Garbage collector in Ruby 2.2 provokes unexpected CoW

我寻月下人不归 2021-01-11 21:07

How do I prevent the GC from provoking copy-on-write when I fork my process? I have recently been analyzing the garbage collector's behavior in Ruby, due to some memory

1 answer
  • 2021-01-11 21:52

    UPD2

    I finally figured out why all the memory goes private when you format the string: formatting generates garbage while GC is disabled, and once you enable GC again, the released objects leave holes in your generated data. Then you fork, and new garbage starts to fill those holes. The more garbage, the more private pages.

    So I added a cleanup function that runs GC every 2000 cycles (just enabling lazy GC didn't help):

    count.times do |i|
      cleanup(i)
      result << "%20.18f" % rand
    end
    
    #......snip........#
    
    # Run a GC every 2000 iterations; GC stays disabled in between.
    def cleanup(i)
      if (i % 2000).zero?
        GC.enable; GC.start; GC.disable
      end
    end
    
    ##### main #####
    

    This resulted in the following (with memory_object(1000 * 1000 * 10) generated after the fork):

    RUBY_GC_HEAP_INIT_SLOTS=600000 ruby gc-test.rb 0
    ruby version 2.2.0
     proces   pid log          priv_dirty shared_dirty
     Parent  2501 post alloc           35          0
     Parent  2501 4 fork                0         35
     Child   2503 4 initial             0         35
     Child   2503 8 empty GC           28         22
    

    Yes, it affects performance, but only before forking, i.e. it increases the load time in your case.
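
    The periodic cleanup described above can be put together as a self-contained sketch. The `build_strings` name and the `interval` parameter are hypothetical, introduced here for illustration only; the real gc-test.rb is not shown in this answer, so treat this as an approximation of the idea rather than the original script.

```ruby
# Hypothetical helper: run a GC every `interval` iterations,
# leaving GC disabled in between (same pattern as cleanup above).
def cleanup(i, interval = 2000)
  return unless (i % interval).zero?
  GC.enable
  GC.start
  GC.disable
end

# Hypothetical helper: build `count` formatted strings while GC is disabled,
# collecting the formatting garbage in batches so that fewer freed slots
# are left scattered between the live strings.
def build_strings(count, interval = 2000)
  result = []
  GC.disable
  count.times do |i|
    cleanup(i, interval)
    result << format("%20.18f", rand)
  end
  result
ensure
  GC.enable # do not leave GC switched off if something raises
end

data = build_strings(10_000)
puts "built #{data.size} strings, GC.count=#{GC.count}"
# The fork would happen here, after the data has been built.
```

    A smaller `interval` means more GC runs and a longer load time, so the value 2000 is only the one used in this answer, not a recommendation.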


    UPD1

    I just found the criterion by which Ruby 2.2 sets the old-object bits: an object has to survive 3 GCs. So if you add the following before forking:

    GC.enable; 3.times {GC.start}; GC.disable
    # start the forking
    

    you will get (the option is 1 on the command line):

    $ RUBY_GC_HEAP_INIT_SLOTS=600000 ruby gc-test.rb 1
    ruby version 2.2.0
     proces   pid log          priv_dirty shared_dirty
     Parent  2368 post alloc           31          0
     Parent  2368 4 fork                1         34
     Child   2370 4 initial             1         34
     Child   2370 8 empty GC            2         32
    

    This still needs further testing of how such objects behave in later GCs. At least after 100 GCs, :old_objects remains constant, so I suppose it should be OK.

    Log with GC.stat is here
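
    A rough way to check the promotion yourself is to compare `GC.stat(:old_objects)` before and after the three GC runs. This is only a sketch, assuming MRI 2.2 or later; the `promote_survivors` name and the sample data are hypothetical and not part of the original script.

```ruby
# Hypothetical helper: run `rounds` full GCs so that surviving objects
# get marked as old, then leave GC disabled for the fork.
def promote_survivors(rounds = 3)
  GC.enable
  rounds.times { GC.start }
  GC.disable
end

old_before = GC.stat(:old_objects)

# Sample data standing in for whatever is loaded before forking.
data = Array.new(50_000) { |i| "item #{i}" }

promote_survivors
old_after = GC.stat(:old_objects)

# Run some more GCs to see whether the old-object count stays stable.
GC.enable
10.times { GC.start }
old_later = GC.stat(:old_objects)

puts "live strings:  #{data.size}"
puts "old_objects:   #{old_before} -> #{old_after} -> #{old_later}"
```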


    By the way, there is also the option RGENGC_OLD_NEWOBJ_CHECK, which creates objects as old from the beginning. I doubt it's a good idea in general, but it may be useful in a particular case.

    First answer

    My proposition in the comment above was wrong. Actually, bitmap tables are the savior.

    (option = 1)
    
    ruby version 2.0.0
     proces   pid log          priv_dirty shared_dirty
     Parent 14807 post alloc           27          0
     Parent 14807 4 fork                0         27
     Child  14809 4 initial             0         27
     Child  14809 8 empty GC            6         25 # << almost everything stays shared <<
    

    I also had Ruby Enterprise Edition at hand and tested it; it is only about half better than the worst cases.

    ruby version 1.8.7
     proces   pid log          priv_dirty shared_dirty
     Parent 15064 post alloc           86          0
     Parent 15064 4 fork                2         84
     Child  15065 4 initial             2         84
     Child  15065 8 empty GC           40         46
    

    (I made the script run exactly 1 GC by increasing RUBY_GC_HEAP_INIT_SLOTS to 600k.)
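
    Figures like the priv_dirty and shared_dirty columns in the tables above can be gathered on Linux by summing the `Private_Dirty` and `Shared_Dirty` fields of `/proc/<pid>/smaps`. The sketch below is one way to do that; it is an assumption about how such numbers can be collected, not the measuring code of the original gc-test.rb, and the `dirty_kb` and `report` names are hypothetical.

```ruby
# Sum the Private_Dirty and Shared_Dirty fields (in kB) of smaps-formatted text.
def dirty_kb(smaps_text)
  totals = { private_dirty: 0, shared_dirty: 0 }
  smaps_text.each_line do |line|
    case line
    when /\APrivate_Dirty:\s+(\d+) kB/ then totals[:private_dirty] += $1.to_i
    when /\AShared_Dirty:\s+(\d+) kB/  then totals[:shared_dirty]  += $1.to_i
    end
  end
  totals
end

# Print one line for a process; does nothing where /proc is unavailable.
def report(label, pid = Process.pid)
  path = "/proc/#{pid}/smaps"
  return unless File.readable?(path) # Linux only
  totals = dirty_kb(File.read(path))
  printf("%-8s %6d priv_dirty=%d kB shared_dirty=%d kB\n",
         label, pid, totals[:private_dirty], totals[:shared_dirty])
end

report("Parent")
```

    Calling `report` in the parent before the fork and in the child before and after a GC gives a comparison along the lines of the tables in this answer.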
