问题
I've been trying to measure the performance of the System.arrayCopy vs Arrays.copyOf in order to choose properly one of them. Just for the sake of benchmark I added manual copy as well and the result surprised me. Obviously I'm missing something really important, could you, please, tell me, what it is? The implementation is as follows (see first 4 methods).
public class ArrayCopy {
public static int[] createArray( int size ) {
int[] array = new int[size];
Random r = new Random();
for ( int i = 0; i < size; i++ ) {
array[i] = r.nextInt();
}
return array;
}
public static int[] copyByArraysCopyOf( int[] array, int size ) {
return Arrays.copyOf( array, array.length + size );
}
public static int[] copyByEnlarge( int[] array, int size ) {
return enlarge( array, size );
}
public static int[] copyManually( int[] array, int size ) {
int[] newArray = new int[array.length + size];
for ( int i = 0; i < array.length; i++ ) {
newArray[i] = array[i];
}
return newArray;
}
private static void copyArray( int[] source, int[] target ) {
System.arraycopy( source, 0, target, 0, Math.min( source.length, target.length ) );
}
private static int[] enlarge( int[] orig, int size ) {
int[] newArray = new int[orig.length + size];
copyArray( orig, newArray );
return newArray;
}
public static void main( String... args ) {
int[] array = createArray( 1000000 );
int runs = 1000;
int size = 1000000;
System.out.println( "****************** warm up #1 ******************" );
warmup( ArrayCopy::copyByArraysCopyOf, array, size, runs );
warmup( ArrayCopy::copyByEnlarge, array, size, runs );
warmup( ArrayCopy::copyManually, array, size, runs );
System.out.println( "****************** warm up #2 ******************" );
warmup( ArrayCopy::copyByArraysCopyOf, array, size, runs );
warmup( ArrayCopy::copyByEnlarge, array, size, runs );
warmup( ArrayCopy::copyManually, array, size, runs );
System.out.println( "********************* test *********************" );
System.out.print( "copyByArrayCopyOf" );
runTest( ArrayCopy::copyByArraysCopyOf, array, size, runs );
System.out.print( "copyByEnlarge" );
runTest( ArrayCopy::copyByEnlarge, array, size, runs );
System.out.print( "copyManually" );
runTest( ArrayCopy::copyManually, array, size, runs );
}
private static void warmup( BiConsumer<int[], Integer> consumer, int[] array, int size, int runs ) {
for ( int i = 0; i < runs; i++ ) {
consumer.accept( array, size );
}
}
private static void runTest( BiConsumer<int[], Integer> consumer, int[] array, int size, int runs ) {
ThreadMXBean threadMXBean = ManagementFactory.getThreadMXBean();
long currentCpuTime = threadMXBean.getCurrentThreadCpuTime();
long nanoTime = System.nanoTime();
for ( int i = 0; i < runs; i++ ) {
consumer.accept( array, size );
}
System.out.println( "-time = " + ( ( System.nanoTime() - nanoTime ) / 10E6 ) + " ms. CPU time = " + ( ( threadMXBean.getCurrentThreadCpuTime() - currentCpuTime ) / 10E6 ) + " ms" );
}
}
The result shows that manual copy performed around 30% better, as shown below:
****************** warm up #1 ******************
****************** warm up #2 ******************
********************* test *********************
copyByArrayCopyOf-time = 162.470107 ms. CPU time = 153.125 ms
copyByEnlarge-time = 168.6757949 ms. CPU time = 164.0625 ms
copyManually-time = 116.3975962 ms. CPU time = 110.9375 ms
I'm really confused, because I thought (and probably I still do) that System.arrayCopy
due to its nativity is the best possible way to copy an array, but I cannot explain this result.
回答1:
Actually, HotSpot compiler is smart enough to unroll and vectorize manual copy loop - that's why the result code appears to be well optimized.
Why is System.arraycopy
slower then? It is originally a native method, and you have to pay for a native call until the compiler optimizes it as JVM intrinsic.
However, in your test the compiler does not have a chance for such optimization, because enlarge
method is not called many enough times (i.e. it is not considered as hot).
I'll show you a funny trick to force the optimization. Rewrite enlarge
method as follows:
private static int[] enlarge(int[] array, int size) {
for (int i = 0; i < 10000; i++) { /* fool the JIT */ }
int[] newArray = new int[array.length + size];
System.arraycopy(array, 0, newArray, 0, array.length);
return newArray;
}
An empty loop triggers a backedge counter overflow, which in turn triggers the compilation of enlarge
method. The empty loop is then eliminated from the compiled code, so it is harmless. Now enlarge
method gets about 1.5x faster than the manual loop!
It is important that System.arraycopy
immediately follows new int[]
. In this case HotSpot can optimize away the redundant zeroing of newly allocated array. You know, all Java objects must be zeroed right after creation. But as far as compiler detects that the array is filled right after creation, it may eliminate zeroing, thus making the result code yet faster.
P.S. @assylias' benchmark is good, but it also suffers from the fact that System.arraycopy
is not intrinsified for the large arrays. In case of small arrays arrayCopy
benchmark is called many times per second, JIT considers it hot and optimizes well. But for large arrays each iteration is longer, so there is a lot less iterations per second, and JIT does not treat arrayCopy
as hot.
回答2:
Using jmh, I get the results shown in the table below (size is the size of the array, score is the time in microseconds, error shows the confidence interval at 99.9%):
Benchmark (size) Mode Cnt Score Error Units
ArrayCopy.arrayCopy 10 avgt 60 0.022 ± 0.001 us/op
ArrayCopy.arrayCopy 10000 avgt 60 4.959 ± 0.068 us/op
ArrayCopy.arrayCopy 10000000 avgt 60 11906.870 ± 220.850 us/op
ArrayCopy.clone_ 10 avgt 60 0.022 ± 0.001 us/op
ArrayCopy.clone_ 10000 avgt 60 4.956 ± 0.068 us/op
ArrayCopy.clone_ 10000000 avgt 60 10895.856 ± 208.369 us/op
ArrayCopy.copyOf 10 avgt 60 0.022 ± 0.001 us/op
ArrayCopy.copyOf 10000 avgt 60 4.958 ± 0.072 us/op
ArrayCopy.copyOf 10000000 avgt 60 11837.139 ± 220.452 us/op
ArrayCopy.loop 10 avgt 60 0.036 ± 0.001 us/op
ArrayCopy.loop 10000 avgt 60 5.872 ± 0.095 us/op
ArrayCopy.loop 10000000 avgt 60 11315.482 ± 217.348 us/op
In substance, loop seems to perform slightly better than arrayCopy for large arrays indeed - probably because the JIT is quite good at optimising such a simple loop. For smaller arrays, arrayCopy seems better (although the difference is quite small).
Note however that clone seems to be consistently as good as, or better than, the other options, depending on size. So I would go for clone, which also happens to be easier to use.
For reference, benchmark code, run with -wi 5 -w 1000ms -i 30 -r 1000ms -t 1 -f 2 -tu us
:
@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
public class ArrayCopy {
@Param({"10", "10000", "10000000"}) int size;
private int[] array;
@Setup(Level.Invocation) public void setup() {
array = new int[size];
for (int i = 0; i < size; i++) {
array[i] = i;
}
}
@Benchmark
public int[] clone_() {
int[] copy = array.clone();
return copy;
}
@Benchmark
public int[] arrayCopy() {
int[] copy = new int[array.length];
System.arraycopy(array, 0, copy, 0, array.length);
return copy;
}
@Benchmark
public int[] copyOf() {
int[] copy = Arrays.copyOf(array, array.length);
return copy;
}
@Benchmark
public int[] loop() {
int[] copy = new int[array.length];
for (int i = 0; i < array.length; i++) {
copy[i] = array[i];
}
return copy;
}
}
来源:https://stackoverflow.com/questions/40286540/system-arraycopy-is-slow