kubernetes pod memory - java gc logs

太阳男子 2020-12-09 13:38

On the Kubernetes dashboard, there's a pod whose Memory Usage (bytes) is displayed as 904.38Mi.

This pod holds the Java app that was run with

2 Answers
  • 2020-12-09 14:04

    GC deals with only a subset of the memory used by the process. There are regions of JVM memory which are not subject to garbage collection.

    Below are a few memory areas that are not included in heap / metaspace:

    • Thread Stack Space
    • Compressed Class Space
    • JIT compiled code
    • NIO direct buffer memory

    The list above is not exhaustive; these are just the largest memory consumers.
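
    Most of these areas can be capped with JVM flags of their own if you need a more predictable native footprint. A minimal sketch (the flag values are purely illustrative, app.jar stands in for your application, and -Xss is per thread, so the total stack cost scales with the thread count):

    java -Xmx1024m \
         -Xss1m \
         -XX:MaxMetaspaceSize=256m \
         -XX:CompressedClassSpaceSize=128m \
         -XX:ReservedCodeCacheSize=240m \
         -XX:MaxDirectMemorySize=256m \
         -jar app.jar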

    Here is a JVM memory hierarchy diagram with the related config options.

    In summary, the actual JVM memory appetite is always larger than the heap limit.

    How much larger depends on the nature of the application and can be established empirically.

    UPDATE

    Native Memory Tracking can be enabled in the JVM to provide detailed reports of memory usage across its different functional areas.
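
    For example, a minimal sketch of how that could look (app.jar and the pid are assumptions; inside a container the java process is often pid 1):

    java -XX:NativeMemoryTracking=summary -Xms512m -Xmx1024m -jar app.jar

    # then, against the running JVM:
    jcmd 1 VM.native_memory summary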

  • 2020-12-09 14:22

    There are a lot of things going on here. Let's take each at a time.

    It seems you are using a single container per pod (though you could have many containers per pod). requests.memory and limits.memory are specific to a container; Kubernetes computes the pod-level requests and limits as the sum over all of its containers, as sketched below.
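
    A hypothetical single-container pod spec, just to show where those fields live (all names and values here are made up):

    apiVersion: v1
    kind: Pod
    metadata:
      name: java-app
    spec:
      containers:
      - name: app
        image: my-java-app:1.0     # placeholder image
        resources:
          requests:
            memory: "512Mi"
          limits:
            memory: "1.5Gi"        # pod-level limit = sum over all containers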

    So think about it - you are saying that a pod shows 904.38Mi, but you show requests.memory and limits.memory, which are per container. That is why I assume you have one container per pod. This is a general intro that does not answer your question - but we will get there.

    Then comes the fact that the container is started by docker, which is driven by the kubelet and reads requests.memory and limits.memory. To make this a bit simpler: what you set in limits.memory will be passed to docker as -m. So, in your case, the total memory available to the docker process is 1.5GB. Remember that this is the limit for the entire process, not just the heap. A java process is a lot more than the heap, which you specify with -Xms512m -Xmx1024m. So to answer your question:
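
    In other words, that mapping boils down to roughly the following (image name and main class are placeholders, only meant to illustrate the idea):

    # limits.memory of 1.5Gi effectively becomes
    docker run -m 1536m my-java-image \
        java -Xms512m -Xmx1024m MyApp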

    How did kubernetes arrive at 904.38Mi usage?

    This is what the entire process is currently taking, not just the heap. From the very short log files that you posted - your application is just fine.
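
    If you want to cross-check that number from inside the container, you can look at the resident set of the java process directly (pid 1 is an assumption; adjust it to your java pid):

    # resident set size (RSS) of the process, in kB
    grep VmRSS /proc/1/status
    # or
    ps -o pid,rss,comm -p 1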

    EDIT

    I actually did not have the kubernetes dashboard in my environment to test this specifically, so I had to install it to really understand what is going on. I had a hunch about most of these things, but to make sure, I did some tests.

    First things first: what does that number in the dashboard mean? It took a while to find/understand, but it is the actual resident memory of the process, which is actually a very good thing.

    Any sane OS knows that when someone requests memory from it, they rarely need/utilize it all; as such, it hands the memory out in a lazy fashion. This is easy to prove in k8s. Suppose I have a jdk-13 JVM and start it with:

    kubectl run jdk-13 \
        --image=jdk-13 \
        --image-pull-policy=Never \
        --limits "memory=100Mi" \
        --requests "memory=10Mi" \
        --command -- /bin/sh -c "while true; do sleep 5; done"
    

    Notice requests.memory=10Mi and limits.memory=100Mi. Reading the answer from the beginning, you already know that the specific pod will be started with docker -m 100m... because limits.memory=100Mi. That is easy to prove: just sh into the pod:

     kubectl exec -it jdk-13-b8d656977-rpzrg -- /bin/sh
    

    and find out what the cgroup says:

     # cat /sys/fs/cgroup/memory/memory.limit_in_bytes
     104857600 // 100MB
    

    Perfect! So the pod's memory limit is 100 MB max, but what is the current memory utilization, that is, how much resident memory is taken?

    kubectl top pod
       NAME                          CPU(cores)   MEMORY(bytes)
       jdk-13-b8d656977-rpzrg           1m           4Mi
    

    OK, so the current memory utilization is only 4MB. You can "make sure" that this is indeed accurate if you do:

    kubectl exec -it jdk-13-b8d656977-rpzrg -- /bin/sh
    

    and inside that pod issue:

    top -o %MEM
    

    and notice that RES memory is on par with the one reported via dashboard or kubectl top pod.

    And now let's do a test. Suppose I have this very simple code in that pod:

    // run this with: java "-Xlog:gc*=debug" -Xmx100m -Xms20m  HeapTest
    import java.time.temporal.ChronoUnit;
    import java.util.Arrays;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.locks.LockSupport;
    
    public class HeapTest {
    
        public static void main(String[] args) throws Exception {
    
            // allocate 1 MB every 3 seconds
            for (int i = 0; i < 40; ++i) {
                byte[] b = new byte[1024 * 1024 * 1];
                b[i] = 1;
                System.out.println(Arrays.hashCode(b));
                LockSupport.parkNanos(TimeUnit.of(ChronoUnit.SECONDS).toNanos(3));
            }
        }
    }
    
        
    

    I allocate 1MB every 3 seconds for around 2 minutes. When I look at this process in the dashboard, I do see that at some point in time the memory grows. After the program ends, the dashboard reports that the memory drops back. Good! That means the memory is given back and RSS drops. This is how it looks in the dashboard:

    Now let's change this code just a bit. Let's add some GC in there and let's never finish this process (you know like typical spring-boot applications do):

    import java.time.temporal.ChronoUnit;
    import java.util.Arrays;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.locks.LockSupport;
    
    public class HeapTest {
    
        public static void main(String[] args) throws Exception {
    
            // allocate 1 MB every 3 seconds
            for (int i = 0; i < 40; ++i) {
                byte[] b = new byte[1024 * 1024 * 1];
                b[i] = 1;
                System.out.println(Arrays.hashCode(b));
                LockSupport.parkNanos(TimeUnit.of(ChronoUnit.SECONDS).toNanos(3));
            }
            for (int i = 0; i < 10; i++) {
                Thread.sleep(500);
                System.gc();
            }
            
            while (true) {
                try {
                    Thread.sleep(TimeUnit.of(ChronoUnit.SECONDS).toMillis(5));
                    Thread.onSpinWait();
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            }
    
        }
    }
    

    I run this with:

    java "-Xlog:heap*=debug" 
         "-Xlog:gc*=debug" 
         "-Xlog:ergo*=debug" 
         -Xmx100m 
         -Xms20m
         HeapTest
    

    On inspecting the logs (just like in your example), I do see that the heap is collected just fine. But when I look at the dashboard, the memory does not drop (unlike the previous example).

    Once G1GC takes memory, it is not very eager to give it back to the OS. It can do that in rare cases, and you can instruct it to do so; a sketch of the relevant knobs follows.
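
    For example (the values are only illustrative): MinHeapFreeRatio / MaxHeapFreeRatio make G1 shrink the heap more aggressively after a full GC, and on JDK 12+ G1PeriodicGCInterval (JEP 346) triggers collections during idle phases so unused memory can be uncommitted.

    java -XX:+UseG1GC \
         -XX:MinHeapFreeRatio=10 \
         -XX:MaxHeapFreeRatio=20 \
         -XX:G1PeriodicGCInterval=60000 \
         -Xmx100m \
         -Xms20m \
         HeapTest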

    Both ways are rather painful; instead, there are GC algorithms that are smarter (and in general a lot better). My personal love goes to Shenandoah; let's see what it does. If I slightly change the code (so that I can prove my point better):

    import java.time.temporal.ChronoUnit;
    import java.util.Arrays;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.locks.LockSupport;
    
    public class HeapTest {
    
        public static void main(String[] args) throws Exception {
    
            // allocate 1/4 MB every 100 ms
            for (int i = 0; i < 6000; ++i) {
                byte[] b = new byte[1024 * 256];
                b[i] = 1;
                System.out.println(Arrays.hashCode(b));
                LockSupport.parkNanos(TimeUnit.of(ChronoUnit.MILLIS).toNanos(100));
            }
    
            while (true) {
                try {
                    Thread.sleep(TimeUnit.of(ChronoUnit.SECONDS).toMillis(5));
                    Thread.onSpinWait();
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            }
    
        }
    }
    

    And run it with:

     java "-Xlog:gc*=debug" 
          "-Xlog:ergo*=debug" 
          "-Xlog:heap*=debug" 
           -XX:+UnlockExperimentalVMOptions 
           -XX:+UseShenandoahGC 
           -XX:+ShenandoahUncommit 
           -XX:ShenandoahGCHeuristics=compact  
           -Xmx1g 
           -Xms1m  
           HeapTest
    

    Here is what you are going to see:

    And you should, as a matter of fact, care about this:

    This behavior is particularly disadvantageous in container environments where resources are paid by use. Even during phases where the VM only uses a fraction of its assigned memory resources due to inactivity, G1 will retain all of the Java heap. This results in customers paying for all resources all the time, and cloud providers not being able to fully utilize their hardware.

    P.S. I would also add that other pods suffer too, because one pod decided to take as much memory as it could at a particular spike and never gave it back.
