I am doing some tests with escape analysis in Java 7 in order to better understand which objects are eligible for stack allocation.
Here is the code I wrote to test st
EA is an analysis that the C2 compiler performs on the IR it generates, so the method has to be compiled before you can enjoy the benefits. Each test is called only once, so there is no chance for it to compile. Details on EA and the C2 IR are in the HotSpot internals wiki (https://wiki.openjdk.java.net/display/HotSpot/Overview+of+Ideal,+C2%27s+high+level+intermediate+representation and https://wiki.openjdk.java.net/display/HotSpot/EscapeAnalysis).
Here's a version that attempts to show the impact:
import com.sun.management.ThreadMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Iterator;

public class EscapeAnalysisTest {

    private static final long TIME_TO_TEST = 10L * 1000L; // 10s

    static class Timestamp {
        private long millis;

        public Timestamp(long millis) {
            this.millis = millis;
        }

        public long getTime() {
            return millis;
        }

        public void setTime(long time) {
            millis = time;
        }
    }

    public static void main(String[] args) {
        // Run the whole suite several times so that later rounds execute compiled code.
        System.out.println("****");
        doIt();
        System.out.println("****");
        doIt();
        System.out.println("****");
        doIt();
        System.out.println("****");
        doIt();
        System.out.println("****");
    }

    // Prints the number of bytes the current thread allocated during each test.
    private static void doIt() {
        final ThreadMXBean mxbean = (ThreadMXBean) ManagementFactory.getThreadMXBean();
        final long tid = Thread.currentThread().getId();
        long r = 0;
        final long allocPre = mxbean.getThreadAllocatedBytes(tid);
        r += test1();
        long alloc1 = mxbean.getThreadAllocatedBytes(tid);
        System.out.println("test1 - " + (alloc1 - allocPre));
        r += test2();
        final long alloc2 = mxbean.getThreadAllocatedBytes(tid);
        System.out.println("test2 - " + (alloc2 - alloc1));
        r += test3();
        final long alloc3 = mxbean.getThreadAllocatedBytes(tid);
        System.out.println("test3 - " + (alloc3 - alloc2));
        r += test4();
        final long alloc4 = mxbean.getThreadAllocatedBytes(tid);
        System.out.println("test4 - " + (alloc4 - alloc3));
        r += test5();
        final long alloc5 = mxbean.getThreadAllocatedBytes(tid);
        System.out.println("test5 - " + (alloc5 - alloc4));
        r += test6();
        final long alloc6 = mxbean.getThreadAllocatedBytes(tid);
        System.out.println("test6 - " + (alloc6 - alloc5));
        // Print the accumulated result so the work cannot be discarded as dead code.
        System.out.println(r);
    }

    // A temporary object that is created and read in a single expression.
    public static long test1() {
        long r = 0;
        long start = System.currentTimeMillis();
        while (System.currentTimeMillis() - start < TIME_TO_TEST) {
            r += new Timestamp(System.currentTimeMillis()).getTime();
        }
        return r;
    }

    // An iterator that walks the whole list in a for loop.
    public static long test2() {
        ArrayList<Integer> l = new ArrayList<Integer>(1000);
        for (int i = 0; i < 1000; ++i) {
            l.add(i);
        }
        long r = 0;
        long start = System.currentTimeMillis();
        while (System.currentTimeMillis() - start < TIME_TO_TEST) {
            for (Iterator<Integer> it = l.iterator(); it.hasNext(); ) {
                r += it.next().longValue();
            }
        }
        return r;
    }

    // A local object that is mutated before it is read.
    public static long test3() {
        long r = 0;
        long start = System.currentTimeMillis();
        while (System.currentTimeMillis() - start < TIME_TO_TEST) {
            Timestamp ts = new Timestamp(System.currentTimeMillis());
            ts.setTime(42);
            r += ts.getTime();
        }
        return r;
    }

    // An iterator that is advanced a fixed number of times without a loop.
    public static long test4() {
        ArrayList<Integer> l = new ArrayList<Integer>(1000);
        for (int i = 0; i < 1000; ++i) {
            l.add(i);
        }
        long r = 0;
        long start = System.currentTimeMillis();
        while (System.currentTimeMillis() - start < TIME_TO_TEST) {
            Iterator<Integer> it = l.iterator();
            r += it.next().longValue();
            r += it.next().longValue();
            r += it.next().longValue();
            r += it.next().longValue();
        }
        return r;
    }

    // An iterator that is advanced by a counted loop instead of hasNext().
    public static long test5() {
        ArrayList<Integer> l = new ArrayList<Integer>(1000);
        for (int i = 0; i < 1000; ++i) {
            l.add(i);
        }
        long r = 0;
        long start = System.currentTimeMillis();
        while (System.currentTimeMillis() - start < TIME_TO_TEST) {
            Iterator<Integer> it = l.iterator();
            for (int i = 0; i < l.size(); ++i) {
                r += it.next().longValue();
            }
        }
        return r;
    }

    // An object used as the loop variable; the loop ends once the sum overflows to a non-positive value.
    public static long test6() {
        long r = 0;
        long start = System.currentTimeMillis();
        while (System.currentTimeMillis() - start < TIME_TO_TEST) {
            for (Timestamp ts = new Timestamp(System.currentTimeMillis());
                    ts.getTime() > 0;
                    ts.setTime(ts.getTime() + System.currentTimeMillis())) {
                r += ts.getTime();
            }
        }
        return r;
    }
}
This generates the following output when run with -server -XX:CompileThreshold=1:
****
test1 - 109048
test2 - 89243416
test3 - 16664
test4 - 42840
test5 - 71982168
test6 - 1400
-5351026995119026839
****
test1 - 16432
test2 - 85921464
test3 - 16664
test4 - 42840
test5 - 66777600
test6 - 1368
7844020592566674506
****
test1 - 48
test2 - 18256
test3 - 272
test4 - 18264
test5 - 18264
test6 - 272
-2137858376905291730
****
test1 - 48
test2 - 18256
test3 - 272
test4 - 18264
test5 - 18264
test6 - 272
3273987624143297143
****
One danger here is that compiling a method may have changed it more fundamentally than just removing the allocations. I haven't attempted to guard against this, so some use of -XX:+LogCompilation or -XX:+PrintCompilation might be required to check.
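As a lightweight complement to the compilation logs, the standard CompilationMXBean can at least show whether the JIT was still busy while a measurement ran. This is only a rough sketch: the class and method names (JitActivityCheck, compilationMillis, jitMillisDuring) are made up for illustration, it assumes Java 8 or later for the lambda, and it does not tell you which methods were compiled or how.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of the original test.
final class JitActivityCheck {

    // Total JIT compilation time in milliseconds, or -1 if the JVM does not report it.
    static long compilationMillis() {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit == null || !jit.isCompilationTimeMonitoringSupported()) {
            return -1L;
        }
        return jit.getTotalCompilationTime();
    }

    // Runs the task and returns how much JIT time accumulated meanwhile (-1 if unknown).
    static long jitMillisDuring(Runnable task) {
        long before = compilationMillis();
        task.run();
        long after = compilationMillis();
        return (before < 0 || after < 0) ? -1L : after - before;
    }

    public static void main(String[] args) {
        final long[] sink = new long[1];
        long jit = jitMillisDuring(() -> {
            for (int i = 0; i < 1_000_000; i++) {
                sink[0] += i;
            }
        });
        System.out.println("sum = " + sink[0] + ", JIT ms during run = " + jit);
    }
}
```

A non-zero value for a test round suggests that the round mixed interpreted and compiled code, so its allocation figure should be read with care. The compilation logs are still needed to see what the compiler actually did.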
I just investigated the same thing, but for Java 8. I put my answer in a duplicate question as I didn't find this one in time.
Summary from the full answer:
First of all, it's implementation dependent. This answer applies to OpenJDK 1.8 and probably also the Oracle JVM 1.8.
Secondly, as others have stated, stack allocation only happens when a method is compiled by the C2 compiler, which only happens once a method has been called enough times.
If so, objects can be stack allocated if
Inlining in particular is hard to predict if you don't know some of the specific quirks of HotSpot. See the linked answer for some details.
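To make the notion of "escaping" concrete, here is a minimal sketch contrasting an object that stays local with one that is published through a static field. The names (EscapeShapes, Point, leaked) are invented for this illustration and are not taken from the linked answer. Whether the local case really avoids a heap allocation depends on the JVM, on the method having been compiled by C2, and on the constructor and sum() being inlined. Also note that HotSpot implements this as scalar replacement of the object's fields rather than literally placing the object on the stack.

```java
// Hypothetical illustration; names are not from the original answer.
final class EscapeShapes {

    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        int sum() {
            return x + y;
        }
    }

    // Anything stored here becomes reachable from everywhere.
    static Point leaked;

    // The Point never leaves this method, so it is a candidate for
    // allocation elimination once the calls on it are inlined.
    static int local(int a, int b) {
        Point p = new Point(a, b);
        return p.sum();
    }

    // The Point is stored in a static field, so it escapes and has to
    // be a real heap object.
    static int escaping(int a, int b) {
        Point p = new Point(a, b);
        leaked = p;
        return p.sum();
    }

    public static void main(String[] args) {
        System.out.println(local(1, 2));    // prints 3
        System.out.println(escaping(3, 4)); // prints 7
    }
}
```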
Edit: I tried running your test on Java 8 (OpenJDK), and everything is inlined there. So there are differences in stack allocation between Java 7 and 8.
Escape Analysis heavily relies on inlining of function calls.
As with any other microbenchmark, especially on the server VM, warmup is required. If you remove -XX:CompileThreshold=1 and execute your main test in a loop, you will notice that after 1-2 iterations it stops collecting garbage, because the compiler has gathered enough profiling information to inline the methods and then perform escape analysis.
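A minimal sketch of such a warmup loop, run with default compilation thresholds, might look like the following. The class WarmupAllocationProbe and its Box type are hypothetical stand-ins for the question's test code, and the round count and iteration count are arbitrary. It assumes a HotSpot-based JVM that exposes com.sun.management.ThreadMXBean, and falls back to -1 where it does not.

```java
import java.lang.management.ManagementFactory;

// Hypothetical stand-in for the question's test; names are illustrative only.
final class WarmupAllocationProbe {

    static final class Box {
        final long value;

        Box(long value) {
            this.value = value;
        }

        long get() {
            return value;
        }
    }

    // Creates one short-lived Box per iteration and sums the values.
    static long work(int iterations) {
        long r = 0;
        for (int i = 0; i < iterations; i++) {
            r += new Box(i).get();
        }
        return r;
    }

    // Bytes allocated so far by the current thread, or -1 if unavailable.
    static long allocatedBytes() {
        java.lang.management.ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        if (bean instanceof com.sun.management.ThreadMXBean) {
            com.sun.management.ThreadMXBean sun = (com.sun.management.ThreadMXBean) bean;
            if (sun.isThreadAllocatedMemorySupported() && sun.isThreadAllocatedMemoryEnabled()) {
                return sun.getThreadAllocatedBytes(Thread.currentThread().getId());
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        // Repeat the same work so that later rounds run compiled code.
        for (int round = 1; round <= 10; round++) {
            long before = allocatedBytes();
            long r = work(1_000_000);
            long after = allocatedBytes();
            String allocated = (before < 0 || after < 0) ? "n/a" : String.valueOf(after - before);
            System.out.println("round " + round + ": result=" + r + ", allocated=" + allocated);
        }
    }
}
```

If escape analysis kicks in, the allocated figure should drop sharply after the first few rounds. The exact round and the remaining baseline vary by JVM version and flags, so treat the numbers as indicative only.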