Warning: the question is a little long, but the part below the separation line is for curiosity only.
Oracle\'s JDK 7 implementation of AtomicInteger includes t
I'll give new supposition. If we look into byte code of AtomicInteger
we will see, that the main difference between them is that addAndGet
uses iload_
instruction, and incrementAndGet
uses iconst_
instruction:
public final int addAndGet(int);
...
4: istore_2
5: iload_2
6: iload_1
7: iadd
public final int incrementAndGet();
...
4: istore_1
5: iload_1
6: iconst_1
7: iadd
It seems, that iconst_
+iadd
translates as INC
instruction, due to iload_
...iadd
as ADD
instruction. This all relates to commonly known question about ADD 1
vs INC
and so on:
Relative performance of x86 inc vs. add instruction
Is ADD 1 really faster than INC ? x86
This could be the answer, why addAndGet
is slightly faster than incrementAndGet
To expand on @AlexeiKaigorodov's answer, if this is real Java code, it would be faster because it would eliminate an extra frame on the call stack. This makes it run quicker (why not?) and may have implications that multiple concurrent calls to the loop are less likely to fail, causing the loop to run repeatedly. (Though, I can't come up with any such reasons off the top of my head.)
Though, through your micro-benchmarks, it's possible that the code is not real and the incrementAndGet()
method is implemented in native code in the way you specified, or that both are just intrinsic instructions (delegated to lock:xadd
on x86 for example). However, it's generally rather hard to divine what the JVM is doing all the time, and there could be other things that are causing this.
Out of curiosity, here is the assembly code generated by the JIT. In summary, the main difference is:
incrementAndGet
mov r8d,eax
inc r8d ;*iadd
addAndGet
mov r9d,r8d
add r9d,eax ;*iadd
The rest of the code is essentially the same. That confirms that:
INC
vs ADD 1
I am not good enough at reading assembly to know why that makes a difference. And that does not really answer my initial question.
Full listing (incrementAndGet):
# {method} 'incrementAndGet' '()I' in 'java/util/concurrent/atomic/AtomicInteger'
# [sp+0x20] (sp of caller)
0x00000000026804c0: mov r10d,DWORD PTR [rdx+0x8]
0x00000000026804c4: shl r10,0x3
0x00000000026804c8: cmp rax,r10
0x00000000026804cb: jne 0x0000000002657b60 ; {runtime_call}
0x00000000026804d1: data32 xchg ax,ax
0x00000000026804d4: nop DWORD PTR [rax+rax*1+0x0]
0x00000000026804dc: data32 data32 xchg ax,ax
[Verified Entry Point]
0x00000000026804e0: sub rsp,0x18
0x00000000026804e7: mov QWORD PTR [rsp+0x10],rbp ;*synchronization entry
; - java.util.concurrent.atomic.AtomicInteger::incrementAndGet@-1 (line 204)
0x00000000026804ec: mov eax,DWORD PTR [rdx+0xc] ;*invokevirtual compareAndSwapInt
; - java.util.concurrent.atomic.AtomicInteger::compareAndSet@9 (line 135)
; - java.util.concurrent.atomic.AtomicInteger::incrementAndGet@12 (line 206)
0x00000000026804ef: mov r8d,eax
0x00000000026804f2: inc r8d ;*iadd
; - java.util.concurrent.atomic.AtomicInteger::incrementAndGet@7 (line 205)
0x00000000026804f5: lock cmpxchg DWORD PTR [rdx+0xc],r8d
0x00000000026804fb: sete r11b
0x00000000026804ff: movzx r11d,r11b ;*invokevirtual compareAndSwapInt
; - java.util.concurrent.atomic.AtomicInteger::compareAndSet@9 (line 135)
; - java.util.concurrent.atomic.AtomicInteger::incrementAndGet@12 (line 206)
0x0000000002680503: test r11d,r11d
0x0000000002680506: je 0x0000000002680520 ;*iload_2
; - java.util.concurrent.atomic.AtomicInteger::incrementAndGet@18 (line 207)
0x0000000002680508: mov eax,r8d
0x000000000268050b: add rsp,0x10
0x000000000268050f: pop rbp
0x0000000002680510: test DWORD PTR [rip+0xfffffffffdbafaea],eax # 0x0000000000230000
; {poll_return}
0x0000000002680516: ret
0x0000000002680517: nop WORD PTR [rax+rax*1+0x0] ; OopMap{rdx=Oop off=96}
;*goto
; - java.util.concurrent.atomic.AtomicInteger::incrementAndGet@20 (line 208)
0x0000000002680520: test DWORD PTR [rip+0xfffffffffdbafada],eax # 0x0000000000230000
;*goto
; - java.util.concurrent.atomic.AtomicInteger::incrementAndGet@20 (line 208)
; {poll}
0x0000000002680526: mov r11d,DWORD PTR [rdx+0xc] ;*invokevirtual compareAndSwapInt
; - java.util.concurrent.atomic.AtomicInteger::compareAndSet@9 (line 135)
; - java.util.concurrent.atomic.AtomicInteger::incrementAndGet@12 (line 206)
0x000000000268052a: mov r8d,r11d
0x000000000268052d: inc r8d ;*iadd
; - java.util.concurrent.atomic.AtomicInteger::incrementAndGet@7 (line 205)
0x0000000002680530: mov eax,r11d
0x0000000002680533: lock cmpxchg DWORD PTR [rdx+0xc],r8d
0x0000000002680539: sete r11b
0x000000000268053d: movzx r11d,r11b ;*invokevirtual compareAndSwapInt
; - java.util.concurrent.atomic.AtomicInteger::compareAndSet@9 (line 135)
; - java.util.concurrent.atomic.AtomicInteger::incrementAndGet@12 (line 206)
0x0000000002680541: test r11d,r11d
0x0000000002680544: je 0x0000000002680520 ;*ifeq
; - java.util.concurrent.atomic.AtomicInteger::incrementAndGet@15 (line 206)
0x0000000002680546: jmp 0x0000000002680508
Full listing (addAndGet):
# {method} 'addAndGet' '(I)I' in 'java/util/concurrent/atomic/AtomicInteger'
# this: rdx:rdx = 'java/util/concurrent/atomic/AtomicInteger'
# parm0: r8 = int
# [sp+0x20] (sp of caller)
0x0000000002680d00: mov r10d,DWORD PTR [rdx+0x8]
0x0000000002680d04: shl r10,0x3
0x0000000002680d08: cmp rax,r10
0x0000000002680d0b: jne 0x0000000002657b60 ; {runtime_call}
0x0000000002680d11: data32 xchg ax,ax
0x0000000002680d14: nop DWORD PTR [rax+rax*1+0x0]
0x0000000002680d1c: data32 data32 xchg ax,ax
[Verified Entry Point]
0x0000000002680d20: sub rsp,0x18
0x0000000002680d27: mov QWORD PTR [rsp+0x10],rbp ;*synchronization entry
; - java.util.concurrent.atomic.AtomicInteger::addAndGet@-1 (line 233)
0x0000000002680d2c: mov eax,DWORD PTR [rdx+0xc] ;*invokevirtual compareAndSwapInt
; - java.util.concurrent.atomic.AtomicInteger::compareAndSet@9 (line 135)
; - java.util.concurrent.atomic.AtomicInteger::addAndGet@12 (line 235)
0x0000000002680d2f: mov r9d,r8d
0x0000000002680d32: add r9d,eax ;*iadd
; - java.util.concurrent.atomic.AtomicInteger::addAndGet@7 (line 234)
0x0000000002680d35: lock cmpxchg DWORD PTR [rdx+0xc],r9d
0x0000000002680d3b: sete r11b
0x0000000002680d3f: movzx r11d,r11b ;*invokevirtual compareAndSwapInt
; - java.util.concurrent.atomic.AtomicInteger::compareAndSet@9 (line 135)
; - java.util.concurrent.atomic.AtomicInteger::addAndGet@12 (line 235)
0x0000000002680d43: test r11d,r11d
0x0000000002680d46: je 0x0000000002680d60 ;*iload_3
; - java.util.concurrent.atomic.AtomicInteger::addAndGet@18 (line 236)
0x0000000002680d48: mov eax,r9d
0x0000000002680d4b: add rsp,0x10
0x0000000002680d4f: pop rbp
0x0000000002680d50: test DWORD PTR [rip+0xfffffffffdbaf2aa],eax # 0x0000000000230000
; {poll_return}
0x0000000002680d56: ret
0x0000000002680d57: nop WORD PTR [rax+rax*1+0x0] ; OopMap{rdx=Oop off=96}
;*goto
; - java.util.concurrent.atomic.AtomicInteger::addAndGet@20 (line 237)
0x0000000002680d60: test DWORD PTR [rip+0xfffffffffdbaf29a],eax # 0x0000000000230000
;*goto
; - java.util.concurrent.atomic.AtomicInteger::addAndGet@20 (line 237)
; {poll}
0x0000000002680d66: mov r11d,DWORD PTR [rdx+0xc] ;*invokevirtual compareAndSwapInt
; - java.util.concurrent.atomic.AtomicInteger::compareAndSet@9 (line 135)
; - java.util.concurrent.atomic.AtomicInteger::addAndGet@12 (line 235)
0x0000000002680d6a: mov r9d,r11d
0x0000000002680d6d: add r9d,r8d ;*iadd
; - java.util.concurrent.atomic.AtomicInteger::addAndGet@7 (line 234)
0x0000000002680d70: mov eax,r11d
0x0000000002680d73: lock cmpxchg DWORD PTR [rdx+0xc],r9d
0x0000000002680d79: sete r11b
0x0000000002680d7d: movzx r11d,r11b ;*invokevirtual compareAndSwapInt
; - java.util.concurrent.atomic.AtomicInteger::compareAndSet@9 (line 135)
; - java.util.concurrent.atomic.AtomicInteger::addAndGet@12 (line 235)
0x0000000002680d81: test r11d,r11d
0x0000000002680d84: je 0x0000000002680d60 ;*ifeq
; - java.util.concurrent.atomic.AtomicInteger::addAndGet@15 (line 235)
0x0000000002680d86: jmp 0x0000000002680d48
The reason is that they preferred to make code faster, in expense of code size.
I am sure, the sources are real. If they were intrinsics, they would be marked as native.
Just to complete the discussion, the same question was also asked in the Concurrency-interest -- Discussion list for JSR-166 mailing list almost at the same time as here.
Here is the start of the thread - [concurrency-interest] AtomicInteger implementation discussing the AtomicInteger implementation.