Java Garbage Collector - Not running normally at regular intervals


终归单人心 2021-02-07 09:09

I have a program that is constantly running. Normally, it seems to garbage collect, and remain under about 8MB of memory usage. However, every weekend, it refuses to garbage col

3 answers

[愿得一人] (original poster)

2021-02-07 10:00
However the only reason this issue was noticed, is because it actually crashed from running out of memory on one weekend i.e. it must have reached the maximum heap size, and not run the garbage collector.

I think your diagnosis is incorrect. Unless there is something seriously broken about your JVM, the application will only throw an OOME after it has just run a full garbage collection and discovered that it still doesn't have enough free heap to proceed*.

I suspect that what is going on here is one or more of the following:
- Your application has a slow memory leak. Each time you restart the application, the leaked memory gets reclaimed. So, if you restart the application regularly during the week, this could explain why it only crashes on the weekend.
- Your application is doing computations that require varying amounts of memory to complete. On that weekend, someone sent it a request that required more memory than was available.
Running the GC by hand is not actually going to solve the problem in either case. What you need to do is to investigate the possibility of memory leaks, and also look at the application memory size to see if it is large enough for the tasks that are being performed.
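
To make the first possibility concrete, here is a minimal sketch of how a slow leak typically arises and one way to cap it. The class name `LeakSketch`, the `UNBOUNDED` map, and the 100-entry limit are all hypothetical and are not taken from the application in the question. The bounded variant relies on the standard `LinkedHashMap.removeEldestEntry` hook.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

class LeakSketch {

    /**
     * Leaky pattern (hypothetical): a long-lived map that gains an entry per
     * request and never drops one. The GC cannot reclaim the entries because
     * they stay reachable, so the heap floor rises until an OOME.
     */
    static final Map<String, byte[]> UNBOUNDED = new HashMap<>();

    /**
     * Bounded alternative: an access-ordered LinkedHashMap that evicts its
     * least recently used entry once it holds more than maxEntries.
     */
    static <K, V> Map<K, V> boundedCache(final int maxEntries) {
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    public static void main(String[] args) {
        Map<Integer, byte[]> cache = boundedCache(100);
        for (int i = 0; i < 10_000; i++) {
            cache.put(i, new byte[1024]); // 1 KiB per simulated request
        }
        System.out.println("entries retained: " + cache.size());
    }
}
```

Whether a cache like this is the actual culprit can only be confirmed by inspecting the application, for example with a heap dump.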

If you can capture heap stats over a long period, a memory leak will show up as a downward trend over time in the amount of memory available after full garbage collections. (That is, the height of the longest "teeth" of the sawtooth pattern.) A workload-related memory shortage will probably show up as an occasional sharp downward trend in the same measure over a relatively short period, followed by a recovery. If you see both patterns, then both things could be happening.
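
One way to capture such stats from inside the process is sketched below, using the standard `java.lang.management` API. The class name `HeapAfterGcSampler`, the default sample count, and the sampling interval are assumptions for illustration. Note also that `getCollectionUsage()` reports usage after the most recent collection of each pool, which is not necessarily a full GC, so treat the numbers as an approximation of the measure described above. The sketch tracks memory *used* after GC, so a leak appears as a rising trend (the mirror image of falling free memory).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

class HeapAfterGcSampler {

    /** Sums the bytes still in use after the most recent collection of each heap pool. */
    static long usedAfterLastGc() {
        long total = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue;
            }
            MemoryUsage afterGc = pool.getCollectionUsage(); // null if the pool does not support it
            if (afterGc != null) {
                total += afterGc.getUsed();
            }
        }
        return total;
    }

    /** Least-squares slope in bytes per sample; a persistently positive value hints at a leak. */
    static double slope(long[] samples) {
        int n = samples.length;
        if (n < 2) {
            return 0.0;
        }
        double meanX = (n - 1) / 2.0;
        double meanY = 0;
        for (long s : samples) {
            meanY += s;
        }
        meanY /= n;
        double num = 0;
        double den = 0;
        for (int i = 0; i < n; i++) {
            num += (i - meanX) * (samples[i] - meanY);
            den += (i - meanX) * (i - meanX);
        }
        return num / den;
    }

    public static void main(String[] args) throws InterruptedException {
        // Hypothetical defaults; a real run would sample for days at a much longer interval.
        int count = args.length > 0 ? Integer.parseInt(args[0]) : 3;
        long intervalMillis = args.length > 1 ? Long.parseLong(args[1]) : 1000;
        long[] samples = new long[count];
        for (int i = 0; i < count; i++) {
            samples[i] = usedAfterLastGc();
            System.out.println(System.currentTimeMillis() + "," + samples[i]);
            Thread.sleep(intervalMillis);
        }
        System.out.println("slope (bytes/sample): " + slope(samples));
    }
}
```

GC logging enabled through JVM options gives the same information without code changes; the in-process approach is only convenient when restarting with new options is not possible.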

* Actually, the criteria for deciding when to give up with an OOME are a bit more complicated than this. They depend on certain JVM tuning options, and can include the percentage of time spent running the GC.

FOLLOWUP

@Ogre - I'd need a lot more information about your application to be able to answer that question (about memory leaks) with any specificity.

With your new evidence, there are two further possibilities:
- Your application may be getting stuck in a loop that leaks memory as a result of the clock time-warping.
- The clock time-warping may cause the GC to think that it is taking too large a percentage of run time and trigger an OOME as a result. This behaviour depends on your JVM settings.
Either way, you should lean hard on your client to get them to stop adjusting the system clock like that. (A 32 minute timewarp is way too much!!) Get them to install a system service to keep the clock in sync with network time hour by hour (or more frequently). Critically, get them to use a service with an option to adjust the clock in small increments.

(Re the 2nd bullet: there is a GC monitoring mechanism in the JVM that measures the percentage of overall time that the JVM is spending running the GC, relative to doing useful work. This is designed to prevent the JVM from grinding to a halt when your application is really running out of memory.

This mechanism would be implemented by sampling the wall-clock time at various points. But if the wall-clock time is timewarped at a critical point, it is easy to see how the JVM may think that a particular GC run took much longer than it actually did ... and trigger the OOME.)
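
The effect of a clock jump on this kind of percentage can be illustrated from application code. The sketch below does not show how the JVM implements its own check; it only computes a comparable "GC time as a percentage of elapsed time" figure using `GarbageCollectorMXBean`, once against wall-clock time (`System.currentTimeMillis()`) and once against `System.nanoTime()`, which is not tied to the system clock. The class name `GcTimeFraction` and the one-second interval are assumptions.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcTimeFraction {

    /** Accumulated GC time in milliseconds across all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Percentage of an interval spent in GC; 0 for an empty or negative interval. */
    static double gcPercent(long gcMillis, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            return 0.0;
        }
        return 100.0 * gcMillis / elapsedMillis;
    }

    public static void main(String[] args) throws InterruptedException {
        long gcStart = totalGcMillis();
        long monoStart = System.nanoTime();
        long wallStart = System.currentTimeMillis();

        Thread.sleep(1000); // stand-in for a real monitoring interval

        long gcSpent = totalGcMillis() - gcStart;
        long monoElapsed = (System.nanoTime() - monoStart) / 1_000_000;
        long wallElapsed = System.currentTimeMillis() - wallStart;

        // If the system clock is adjusted during the interval, wallElapsed is
        // distorted (it can even be negative) while monoElapsed is unaffected.
        System.out.println("GC % (monotonic):  " + gcPercent(gcSpent, monoElapsed));
        System.out.println("GC % (wall clock): " + gcPercent(gcSpent, wallElapsed));
    }
}
```

If the two figures diverge sharply in production logs, that would be consistent with the clock being adjusted mid-interval.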