There are number of performance tips made obsolete by Java compiler and especially Profile-guided optimization. For example, these platform-provided optimizations can drasti
When using x64 JVM with RAM less than 32GB:
64bit JVM use 30%-50% more memory in comparision to 32bit JVM because of bigger ordinary object pointers. You can heavily reduce this factor by using JDK6+.
From JDK6u6p to JDK6u22 it is optional and can be enabled by adding JVM argument:
-XX:+UseCompressedOops
From JDK6u23 (JDK7 also) it is enabled by default. More info here.
People replacing String a = "this" + var1 + " is " + var2;
with multiple calls to StringBuilder or StringBuffer. It actually already uses StringBuilder behind the scenes.
In 2001 I made apps for a J2ME phone. It was the size of a brick. And very nearly the computational power of a brick.
Making Java apps run acceptably on it required writing them in as procedural fashion as possible. Furthermore, the very large performance improvement was to catch the ArrayIndexOutOfBoundsException
to exit for-loops over all items in a vector. Think about that!
Even on Android there are 'fast' loops through all items in an array and 'slow' ways of writing the same thing, as mentioned in the Google IO videos on dalvik VM internals.
However, in answer to your question, I would say that it is most unusual to have to micro-optimise this kind of thing these days, and I'd further expect that on a JIT VM (even the new Android 2.2 VM, which adds JIT) these optimisations are moot. In 2001 the phone ran KVM interpreter at 33MHz. Now it runs dalvik - a much faster VM than KVM - at 500MHz to 1500MHz, with a much faster ARM architecture (better processor even allowing for clock speed gains) with L1 e.t.c. and JIT arrives.
We are not yet in the realms where I'd be comfortable doing direct pixel manipulation in Java - either on-phone or on the desktop with an i7 - so there are still normal every-day code that Java isn't fast enough for. Here's an interesting blog that claims an expert has said that Java is 80% of C++ speed for some heavy CPU task; I am sceptical, I write image manipulation code and I see an order of magnitude between Java and native for loops over pixels. Maybe I'm missing some trick...? :D
It is necessary to define time/memory trade-offs before starting the performance optimization. This is how I do it for my memory/time critical application (repeating some answers above, to be complete):