Is Java's System.arraycopy() efficient for small arrays, or does the fact that it's a native method make it likely to be substantially less efficient than a s
Expanding a little on what Sid has written, it's very likely that System.arraycopy is just a JIT intrinsic, meaning that when code calls System.arraycopy, it will most probably be calling a JIT-specific implementation (once the JIT tags System.arraycopy as being "hot") that is not executed through the JNI interface, so it doesn't incur the normal overhead of native methods.
In general, executing native methods does have some overhead (going through the JNI interface; also, some internal JVM operations cannot happen while native methods are being executed). But a method being marked as "native" does not mean that you are actually executing it through JNI. The JIT can do some crazy things.
The easiest way to check is, as has been suggested, to write a small benchmark, being careful with the usual caveats of Java microbenchmarks (warm up the code first, avoid code with no side effects since the JIT may optimize it away as a no-op, etc.).
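To make those two caveats concrete, here is a minimal hand-rolled sketch, not a rigorous benchmark. The class name WarmupSketch, the array size, and the iteration counts are all made up for illustration. It runs the workload once untimed as a warm-up, and it folds every copy into a checksum that gets printed, which makes it harder for the JIT to discard the copies as dead code. A proper harness such as Caliper or JMH handles these issues more reliably.

```java
import java.util.Arrays;

final class WarmupSketch {
    // The workload being timed: copy src into a freshly allocated array.
    static int[] copy(int[] src) {
        int[] dst = new int[src.length];
        System.arraycopy(src, 0, dst, 0, src.length);
        return dst;
    }

    // Runs the workload `iterations` times and returns a checksum of the
    // results, so that the copies have an observable side effect.
    static long run(int[] src, int iterations) {
        long sink = 0;
        for (int i = 0; i < iterations; i++) {
            src[0] = i;                        // vary the input slightly
            int[] dst = copy(src);
            sink += dst[0] + dst[dst.length - 1];
        }
        return sink;
    }

    public static void main(String[] args) {
        int[] src = new int[8];                // illustrative size (assumption)
        Arrays.fill(src, 1);

        long sink = run(src, 1_000_000);       // warm-up pass, timing ignored

        long begin = System.nanoTime();
        sink += run(src, 1_000_000);           // measured pass
        long end = System.nanoTime();

        // Printing the checksum keeps the result "used".
        System.out.println("elapsed: " + (end - begin) / 1e6 + " ms (sink=" + sink + ")");
    }
}
```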
Instead of relying on speculation and possibly outdated information, I ran some benchmarks using caliper. In fact, Caliper comes with some examples, including a CopyArrayBenchmark that measures exactly this question! All you have to do is run
mvn exec:java -Dexec.mainClass=com.google.caliper.runner.CaliperMain -Dexec.args=examples.CopyArrayBenchmark
My results are based on Oracle's Java HotSpot(TM) 64-Bit Server VM, 1.8.0_31-b13, running on a mid-2010 MacBook Pro (macOS 10.11.6 with an Intel Arrandale i7, 8 GiB RAM). I don't believe that it's useful to post the raw timing data. Rather, I'll summarize the conclusions with the supporting visualizations.
In summary:
- Using a for loop to copy each element into a newly instantiated array is never advantageous, even for arrays as short as 5 elements.
- System.arraycopy() is about as fast as Arrays.copyOf(array, array.length) and array.clone(), but not quite consistently so. (See the case for 50000 ints.) Because of that, and the verbosity of the call, I would recommend System.arraycopy() if you need fine control over which elements get copied where.
Here are the timing plots:
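For readers who want to see the techniques compared above side by side, here is a small sketch. The class and method names are made up for illustration and are not taken from Caliper's CopyArrayBenchmark. All four methods return a new array with the same contents as the input.

```java
import java.util.Arrays;

final class CopyTechniques {
    // Manual element-by-element loop.
    static int[] copyWithLoop(int[] src) {
        int[] dst = new int[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i];
        }
        return dst;
    }

    // Arrays.copyOf allocates the new array and copies in one call.
    static int[] copyWithCopyOf(int[] src) {
        return Arrays.copyOf(src, src.length);
    }

    // clone() on an array produces a shallow copy of the same length.
    static int[] copyWithClone(int[] src) {
        return src.clone();
    }

    // System.arraycopy needs the destination allocated first, but lets
    // the caller choose source offset, destination offset and length.
    static int[] copyWithArraycopy(int[] src) {
        int[] dst = new int[src.length];
        System.arraycopy(src, 0, dst, 0, src.length);
        return dst;
    }

    public static void main(String[] args) {
        int[] data = {5, 4, 3, 2, 1};
        System.out.println(Arrays.toString(copyWithLoop(data)));
        System.out.println(Arrays.toString(copyWithCopyOf(data)));
        System.out.println(Arrays.toString(copyWithClone(data)));
        System.out.println(Arrays.toString(copyWithArraycopy(data)));
    }
}
```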
System.arraycopy uses a memmove operation for moving words, and assembly for moving other primitive types, in C behind the scenes. So it makes its best effort to copy as efficiently as it can.
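One visible consequence of memmove-like behaviour is that overlapping copies within a single array are safe: the Javadoc for System.arraycopy specifies that when source and destination are the same array, the copy behaves as if the elements were first copied to a temporary array. The sketch below (class and method names are made up for illustration) contrasts this with a naive forward loop, which overwrites elements before it has read them.

```java
import java.util.Arrays;

final class OverlapSketch {
    // Shifts elements one slot to the right in place using System.arraycopy.
    // The overlapping ranges are handled correctly.
    static void shiftRight(int[] a) {
        if (a.length > 1) {
            System.arraycopy(a, 0, a, 1, a.length - 1);
        }
    }

    // Naive forward loop attempting the same shift. Each write clobbers
    // the element that the next iteration is about to read.
    static void naiveShiftRight(int[] a) {
        for (int i = 0; i < a.length - 1; i++) {
            a[i + 1] = a[i];
        }
    }

    public static void main(String[] args) {
        int[] x = {1, 2, 3, 4};
        shiftRight(x);
        System.out.println(Arrays.toString(x)); // [1, 1, 2, 3]

        int[] y = {1, 2, 3, 4};
        naiveShiftRight(y);
        System.out.println(Arrays.toString(y)); // [1, 1, 1, 1]
    }
}
```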
Native functions should be faster than JVM functions, since there is no VM overhead. However, for a lot (>1000) of very small (length < 10) arrays, it might be slower.
Here is my benchmark code:
// Runs both variants testRep times so that later repetitions are measured
// after the JIT has warmed up.
public void test(int copySize, int copyCount, int testRep) {
    System.out.println("Copy size = " + copySize);
    System.out.println("Copy count = " + copyCount);
    System.out.println();
    for (int i = testRep; i > 0; --i) {
        copy(copySize, copyCount);
        loop(copySize, copyCount);
    }
    System.out.println();
}

// Times copyCount round trips (src -> dst -> src) using System.arraycopy().
public void copy(int copySize, int copyCount) {
    int[] src = newSrc(copySize + 1);
    int[] dst = new int[copySize + 1];
    long begin = System.nanoTime();
    for (int count = copyCount; count > 0; --count) {
        // Shift src left by one into dst, then modify the last element
        // so that the data changes on every round.
        System.arraycopy(src, 1, dst, 0, copySize);
        dst[copySize] = src[copySize] + 1;
        System.arraycopy(dst, 0, src, 0, copySize);
        src[copySize] = dst[copySize];
    }
    long end = System.nanoTime();
    System.out.println("Arraycopy: " + (end - begin) / 1e9 + " s");
}

// Times the same round trips using manual element-by-element loops.
public void loop(int copySize, int copyCount) {
    int[] src = newSrc(copySize + 1);
    int[] dst = new int[copySize + 1];
    long begin = System.nanoTime();
    for (int count = copyCount; count > 0; --count) {
        for (int i = copySize - 1; i >= 0; --i) {
            dst[i] = src[i + 1];
        }
        dst[copySize] = src[copySize] + 1;
        for (int i = copySize - 1; i >= 0; --i) {
            src[i] = dst[i];
        }
        src[copySize] = dst[copySize];
    }
    long end = System.nanoTime();
    System.out.println("Man. loop: " + (end - begin) / 1e9 + " s");
}

// Builds a source array filled with 0, 1, ..., arraySize - 1.
public int[] newSrc(int arraySize) {
    int[] src = new int[arraySize];
    for (int i = arraySize - 1; i >= 0; --i) {
        src[i] = i;
    }
    return src;
}
From my tests, calling test() with copyCount = 10000000 (1e7) or greater allows the warm-up to complete during the first copy/loop call, so testRep = 5 is enough. With copyCount = 1000000 (1e6), the warm-up needs at least 2 or 3 iterations, so testRep should be increased in order to obtain usable results.
With my configuration (CPU Intel Core 2 Duo E8500 @ 3.16GHz, Java SE 1.6.0_35-b10 and Eclipse 3.7.2) it appears from the benchmark that:
- for copySize = 24, System.arraycopy() and the manual loop take almost the same time (sometimes one is very slightly faster than the other, sometimes the reverse),
- for copySize < 24, the manual loop is faster than System.arraycopy() (slightly faster with copySize = 23, much faster with copySize < 5),
- for copySize > 24, System.arraycopy() is faster than the manual loop (slightly faster with copySize = 25, with the ratio loop-time/arraycopy-time increasing as copySize increases).
Note: I'm not a native English speaker, so please excuse any grammar or vocabulary errors.
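If a cut-over like this also shows up on your own hardware, one way to act on it is a small helper that picks the strategy by length. This is only a sketch: HybridCopy and THRESHOLD are hypothetical names, the value 24 is simply the number reported above for one machine and one JVM, and the manual branch assumes src and dst are different arrays (a forward loop is not safe for overlapping ranges). Measure before adopting anything like this.

```java
final class HybridCopy {
    // Hypothetical cut-over point taken from the measurements above;
    // it is specific to that CPU and JVM and should be re-measured.
    static final int THRESHOLD = 24;

    // Copies `length` ints from src[srcPos..] to dst[dstPos..].
    // Assumes src and dst are distinct arrays.
    static void copy(int[] src, int srcPos, int[] dst, int dstPos, int length) {
        if (length < THRESHOLD) {
            // Short copy: plain loop.
            for (int i = 0; i < length; i++) {
                dst[dstPos + i] = src[srcPos + i];
            }
        } else {
            // Longer copy: delegate to System.arraycopy.
            System.arraycopy(src, srcPos, dst, dstPos, length);
        }
    }

    public static void main(String[] args) {
        int[] src = new int[100];
        for (int i = 0; i < src.length; i++) {
            src[i] = i;
        }
        int[] small = new int[4];
        int[] large = new int[50];
        copy(src, 10, small, 0, small.length);   // takes the loop branch
        copy(src, 10, large, 0, large.length);   // takes the arraycopy branch
        System.out.println(small[3] + " " + large[49]); // 13 59
    }
}
```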
System.arraycopy() is executed natively anyway, so it's likely that its performance would be better than a loop's.
A loop has to execute bytecodes, which incurs an overhead, while an array copy should be a straight memcpy.