Question
I'm currently testing a proof-of-concept prototype that deals with XML schemas and is built around a very memory-hungry external library for tree automata (for which I have the sources). I'd like to plot the "real peak" (heap) memory consumption of different runs with increasing schema sizes (the metric used fits my purpose and does not affect the question), or at least a reasonable approximation of it.
To give an order of magnitude: a run with a real peak of 100MB occupies around 1.1GB. (I established the real peak by running exactly the same configuration of input/parameters several times while forcing the JVM memory with -Xmx and -Xms to decreasing values; below 100MB I get Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded, with stable and repeatable results.) That is why it is extremely important for me to get the real number: the two differ a lot!
I've spent the last 10 days reading questions on the web and on Stack Overflow. What I know so far is:
System.gc() "suggests" a GC run but does not force it in any way, so it is not possible to rely on it for detecting memory usage peaks
What is usually suggested is to count object occupation (I saw the SizeOf project for this; I tried it and it works fine, even if it does not fit my needs). That is not feasible for me, because the heavy memory allocation comes from creating a lot of collection (set, list and map) iterators in different methods that are called a very high number of times (millions each for a 10-minute run, as far as I remember). It would be extremely difficult to detect all the objects involved and sum them up (I spent days debugging many runs with memory consumption graphs without being able to identify even a single bottleneck)
There is no way to easily obtain the memory occupation of a method (expressed as the peak of object memory allocation)
The fact is that I have seen for myself that System.gc() calls are not reliable (e.g. different runs of the same configuration give different memory readings after a System.gc(), depending on whether the GC was really invoked or not), but when I press the "GC button" in JVisualVM or JConsole it never fails to run the GC or refuses to do so.
So my question is: will calling their implementation of that button (I haven't tried it yet, but from what I've read so far it seems feasible using jconsole.jar with the Attach API) differ from calling System.gc() directly from my code, thus solving my problem? If not, how do you explain the "deterministic behaviour" of that button?
So far I have run some manual tests of the real memory peak for 10 increasing schema sizes (for this kind of measurement the schemas are generated automatically from a single "complexity parameter") and plotted the expected curve. If I cannot find a better solution, I want to run my code as an external jar with -Xmx/-Xms set slightly below my prediction of the expected memory peak, catch the OutOfMemoryError on the external process's error stream, and relaunch with increased memory until a complete run is achieved. (If the naive memory prediction is not robust enough, I will apply appropriate machine learning techniques.) I know this is not an elegant solution, but in my scenario (academia) I can afford to spend some extra time on these measurements. If you have other suggestions or improvements to this brute-force method, you are (extremely) welcome to share them.
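The relaunch loop described above could be sketched roughly as follows. This is only an illustration under several assumptions: Java 8 or later, a hypothetical jar name (prototype.jar), hypothetical start/step/limit values, and a prototype that lets the OutOfMemoryError propagate so that the child JVM exits with a non-zero status. Note that it checks the exit status instead of parsing the error stream, which is a simplification of the plan in the question.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.function.IntPredicate;

class HeapSearch {
    /** Returns the first heap size (MB) tried for which the run succeeds, or -1 if none up to maxMb does. */
    static int findMinHeapMb(int startMb, int stepMb, int maxMb, IntPredicate runSucceeds) {
        for (int mb = startMb; mb <= maxMb; mb += stepMb) {
            if (runSucceeds.test(mb)) {
                return mb;
            }
        }
        return -1;
    }

    /** Launches "java -Xms<mb>m -Xmx<mb>m -jar <jarPath>" and reports whether it exited with status 0. */
    static boolean runWithHeap(String jarPath, int mb) {
        List<String> cmd = Arrays.asList(
                "java", "-Xms" + mb + "m", "-Xmx" + mb + "m", "-jar", jarPath);
        try {
            Process process = new ProcessBuilder(cmd).inheritIO().start();
            return process.waitFor() == 0;
        } catch (IOException e) {
            return false; // the child JVM could not be started
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // "prototype.jar" is a hypothetical default; pass the real jar as the first argument.
        String jar = args.length > 0 ? args[0] : "prototype.jar";
        // Start at 64 MB, grow by 16 MB, give up at 2048 MB (all values are assumptions).
        int found = findMinHeapMb(64, 16, 2048, mb -> runWithHeap(jar, mb));
        System.out.println(found < 0 ? "no successful run" : "smallest successful heap: " + found + " MB");
    }
}
```

The search is kept separate from the process launch so that the step size or the search strategy can be changed without touching the launching code. The resolution of the result is the step size, so a smaller step gives a tighter estimate at the cost of more runs.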
System info (machine is a Fedora 17, 64 bit):
java version "1.7.0_04"
Java(TM) SE Runtime Environment (build 1.7.0_04-b20)
Java HotSpot(TM) 64-Bit Server VM (build 23.0-b21, mixed mode)
Thanks in advance, Alessandro
Answer 1:
I have quite a bit of positive experience with this trivial approach:
System.gc();
Thread.sleep(500);
System.gc();
One GC run is often not enough due to object finalization issues, where an object may be resurrected in finalization. Therefore additional memory is released in the second GC run.
Do note that this, like any other seemingly "smarter" approach, is a heuristic and quite dependent on the exact JVM version, including its GC configuration. But in many cases you will not be that interested in generality: if it works right now and allows you to do your measurements, it is the way to go.
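To turn the double-GC trick into a number, one option is to read the used heap right after it. The sketch below assumes the 500 ms pause from the snippet above and uses a hypothetical helper name (usedHeapAfterGc); the 32 MB allocation in main is only there to give the second reading something to show.

```java
class HeapProbe {
    /** Requests two GC runs separated by a pause, then returns the heap in use, in bytes. */
    static long usedHeapAfterGc(long pauseMillis) {
        System.gc();
        try {
            Thread.sleep(pauseMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        System.gc();
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long before = usedHeapAfterGc(500);

        // Keep 32 MB reachable so it survives the GC requests.
        byte[][] retained = new byte[32][];
        for (int i = 0; i < retained.length; i++) {
            retained[i] = new byte[1 << 20];
        }
        long during = usedHeapAfterGc(500);

        System.out.println("used before: " + before + " bytes");
        System.out.println("used while retaining " + retained.length + " MB: " + during + " bytes");
    }
}
```

A reading like this is a sample at one point in time, not a peak, so it would have to be taken at the points of the run where the live set is expected to be largest.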
Answer 2:
As far as I know, JConsole and the other tools use System.gc() only; there is no other option. As everyone knows, Java tells everyone not to rely on System.gc(), but that doesn't mean it doesn't work at all.
So, coming to your query: you seem to be puzzled that pressing that button appears to invoke the GC directly, while Java says System.gc() only "suggests" a GC. I say that the button also calls System.gc() and so only "suggests" that Java try a GC; it just so happens that Java decides to perform the GC right then (it's not guaranteed, but somehow Java does it).
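For reference, the monitoring tools talk to the target JVM over JMX, and to my understanding the button maps to the gc operation of the platform MemoryMXBean, which its Javadoc describes as effectively equivalent to calling System.gc(). A minimal sketch of invoking the same operation from inside the program follows; the class and method names are made up for the example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

class MxBeanGc {
    /** Invokes the gc operation of the platform MemoryMXBean and returns the used heap in bytes. */
    static long usedHeapAfterMxBeanGc() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        memory.gc(); // documented as effectively equivalent to System.gc()
        MemoryUsage heap = memory.getHeapMemoryUsage();
        return heap.getUsed();
    }

    public static void main(String[] args) {
        System.out.println("used heap after MemoryMXBean.gc(): " + usedHeapAfterMxBeanGc() + " bytes");
    }
}
```

If that equivalence holds on the JVM in use, calling the button's implementation would not be expected to behave differently from calling System.gc() in the code.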
To demonstrate this, I created a simple program that just creates loads of objects. It contains a commented-out "System.gc()" line. Run the program first with System.gc() commented out and then with it uncommented. Make sure to pass the VM arguments -verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails.
package ravi.tutorial.java.gc;

/**
 * Just to test GC. Run with the VM arguments below.
 *
 * -verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails
 *
 * @author ravi.k
 */
public class TestGC {

    // Static field: each new A replaces the previous one, which becomes garbage.
    public static A a;

    /**
     * @param args
     * @throws InterruptedException
     */
    public static void main(String[] args) throws InterruptedException {
        for (int i = 0; i < 100; i++) {
            populateObjects();
            System.out.println("population done for batch: " + i);
        }
    }

    public static void populateObjects() {
        for (int i = 0; i < 100000; i++) {
            a = new A("A");
        }
        // Uncomment to request a GC after every batch.
        //System.gc();
    }
}

class A {
    String s;

    public A(String s) {
        this.s = s;
    }
}
Here are partial outputs from my machine.
With System.gc() commented out: here the GC runs at the will of the JRE.
population done for batch: 0
population done for batch: 1
population done for batch: 2
population done for batch: 3
population done for batch: 4
population done for batch: 5
population done for batch: 6
population done for batch: 7
population done for batch: 8
population done for batch: 9
0.332: [GC 0.332: [ParNew: 17024K->410K(19136K), 0.0024479 secs] 17024K->410K(83008K), 0.0025219 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
population done for batch: 10
population done for batch: 11
population done for batch: 12
population done for batch: 13
population done for batch: 14
population done for batch: 15
population done for batch: 16
population done for batch: 17
population done for batch: 18
population done for batch: 19
0.344: [GC 0.344: [ParNew: 17434K->592K(19136K), 0.0011238 secs] 17434K->592K(83008K), 0.0011645 secs] [Times: user=0.00 sys=0.01, real=0.00 secs]
population done for batch: 20
population done for batch: 21
population done for batch: 22
population done for batch: 23
population done for batch: 24
population done for batch: 25
population done for batch: 26
population done for batch: 27
population done for batch: 28
population done for batch: 29
population done for batch: 30
0.353: [GC 0.353: [ParNew: 17616K->543K(19136K), 0.0011398 secs] 17616K->543K(83008K), 0.0011770 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
population done for batch: 31
population done for batch: 32
population done for batch: 33
With System.gc() uncommented: here a GC runs for every batch. System.gc() is still only suggesting a GC, but Java chooses to run it right then. The same applies to that magic GC button in the other tools :)
0.337: [Full GC (System) 0.337: [CMS: 0K->400K(63872K), 0.0219250 secs] 3296K->400K(83008K), [CMS Perm : 4423K->4422K(21248K)], 0.0220152 secs] [Times: user=0.04 sys=0.00, real=0.02 secs]
population done for batch: 0
0.364: [Full GC (System) 0.364: [CMS: 400K->394K(63872K), 0.0161792 secs] 2492K->394K(83008K), [CMS Perm : 4425K->4425K(21248K)], 0.0162336 secs] [Times: user=0.01 sys=0.00, real=0.02 secs]
population done for batch: 1
0.382: [Full GC (System) 0.382: [CMS: 394K->394K(63872K), 0.0160193 secs] 2096K->394K(83008K), [CMS Perm : 4425K->4425K(21248K)], 0.0160834 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
population done for batch: 2
0.399: [Full GC (System) 0.399: [CMS: 394K->394K(63872K), 0.0160866 secs] 2096K->394K(83008K), [CMS Perm : 4425K->4425K(21248K)], 0.0161489 secs] [Times: user=0.02 sys=0.00, real=0.02 secs]
population done for batch: 3
0.417: [Full GC (System) 0.417: [CMS: 394K->394K(63872K), 0.0156326 secs] 2096K->394K(83008K), [CMS Perm : 4425K->4425K(21248K)], 0.0156924 secs] [Times: user=0.02 sys=0.00, real=0.02 secs]
population done for batch: 4
0.434: [Full GC (System) 0.434: [CMS: 394K->394K(63872K), 0.0157274 secs] 2096K->394K(83008K), [CMS Perm : 4425K->4425K(21248K)], 0.0157897 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
population done for batch: 5
To add to this, it's just like threads. There is no guarantee about when a thread runs, but whenever we write a sample thread program, the thread runs right away. We shouldn't blame Java for running it as soon as the thread was started :). Java is only saying not to rely on these things, but they do work. Also, the fact that they work in some cases doesn't mean they will work every time. Even tools like JConsole may fail to execute a GC; we have just never seen it happen.
Answer 3:
1) System.gc() "suggests" a GC run but does not force it in any way, so it is not possible to rely on it for detecting memory usage peaks
That's what the spec says, but if you use OpenJDK or HotSpot it will always perform a Full GC unless you turn it off (the HotSpot switch for that is -XX:+DisableExplicitGC).
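On HotSpot, explicit GC requests are turned off with the -XX:+DisableExplicitGC flag. A hedged way to check from inside the program whether that flag is set is to query the HotSpot diagnostic MXBean. The sketch below assumes a HotSpot-based JVM (the com.sun.management API is not part of the Java SE specification), and the class and method names are made up for the example.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

class ExplicitGcCheck {
    /** Returns the current value of the DisableExplicitGC VM option as a string ("true" or "false"). */
    static String disableExplicitGcValue() {
        HotSpotDiagnosticMXBean hotspot =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return hotspot.getVMOption("DisableExplicitGC").getValue();
    }

    public static void main(String[] args) {
        System.out.println("DisableExplicitGC = " + disableExplicitGcValue());
    }
}
```

If this prints true, System.gc() calls are ignored and any measurement that depends on them would be meaningless.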
What is usually suggested is to count object occupation
I would suggest using a commercial memory profiler. I would have the JVM start with a maximum of 8 GB and see how much it tries to use. After that I would increase or decrease it based on your judgement on whether it would like more or doesn't appear to be using it.
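Without a profiler, a rough way to "see how much it tries to use" is to read the peak usage the JVM records for each heap memory pool. The sketch below uses only the standard java.lang.management API, and the class and method names are made up for the example. Two caveats: the recorded peaks include garbage that had not yet been collected, and the pools may peak at different moments, so the sum is an upper bound on heap occupancy and not the peak of live objects.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

class HeapPeak {
    /** Sums the recorded peak usage (bytes) of all heap memory pools. */
    static long peakHeapPoolBytes() {
        long total = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip non-heap pools such as the code cache
            }
            MemoryUsage peak = pool.getPeakUsage();
            if (peak != null) { // null if the pool is no longer valid
                total += peak.getUsed();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("sum of heap pool peaks: " + peakHeapPoolBytes() + " bytes");
    }
}
```

Calling this at the end of a run gives one number per run that can be plotted against the schema size, keeping the caveats above in mind.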
There is no way to easily obtain the memory occupation of a method (expressed as the peak of object memory allocation)
The only memory a method uses is on the stack. You can trace how many objects (count, classes, size) were created in a method, but those objects don't belong to that method and can be used anywhere, even after the method has returned.
If not, how do you explain the "deterministic behaviour" of that button?
I would put that down to subjective analysis. ;)
Ideally you should be running the JVM with 2-3x the minimum memory it needs, so that it runs efficiently. Trying to save a few hundred MB, which costs less than $1, is not always useful. ;)
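Answer 4:
You can kind of force a GC like this:
import java.lang.ref.WeakReference;

// Log.d and LOGTAG are presumably android.util.Log and a tag constant
// defined in the surrounding class.
private static void force_gc()
{
    Object obj = new Object();
    WeakReference<Object> ref = new WeakReference<Object>(obj);
    obj = null; // drop the only strong reference
    // Keep requesting a GC until the weak reference has been cleared.
    while (ref.get() != null)
    {
        Log.d(LOGTAG, "Forcing gc() ...");
        System.gc();
    }
}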
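For a plain desktop JVM, the same weak-reference idea might look like the sketch below. It is a variant, not the code above: the loop is bounded so that it cannot spin forever if explicit GC requests are ignored, and the class and method names are made up for the example.

```java
import java.lang.ref.WeakReference;

class GcForcer {
    /**
     * Requests GC runs until a weakly referenced sentinel object has been cleared,
     * or until maxAttempts requests have been made. Returns true if it was cleared.
     */
    static boolean tryForceGc(int maxAttempts) {
        WeakReference<Object> sentinel = new WeakReference<Object>(new Object());
        for (int i = 0; i < maxAttempts && sentinel.get() != null; i++) {
            System.gc();
        }
        return sentinel.get() == null;
    }

    public static void main(String[] args) {
        System.out.println("sentinel collected: " + tryForceGc(50));
    }
}
```

A cleared sentinel only shows that some collection processed that object. It does not prove that every unreachable object in the heap was reclaimed.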
Apart from that, I'm interested to see where this question goes.
Source: https://stackoverflow.com/questions/13305813/system-gc-vs-gc-button-in-jvisualvm-jconsole