Question
ConcurrentHashMap is often used in concurrent environments to aggregate events under a key, for example counting hits for string values. When the keys are not known in advance, we need a good way to initialize a key on demand, and it should be fast and safe under concurrency. What is the best pattern (in terms of efficiency) for this problem?
As a model I will use a map of <String, AtomicInteger>, declared like this:
ConcurrentHashMap<String, AtomicInteger> map = new ConcurrentHashMap<>();
But it could be a map with any key and value types, where we need to initialize a key-value pair if the key does not already exist in the map and then mutate the state of the value to record an event.
There are two popular approaches:
The first one uses ConcurrentHashMap.putIfAbsent:
AtomicInteger count = map.get(s);
if (count == null) {
    count = new AtomicInteger(0);
    AtomicInteger prevCount = map.putIfAbsent(s, count);
    if (prevCount != null) {
        count = prevCount;
    }
}
count.incrementAndGet();
The second one uses ConcurrentHashMap.computeIfAbsent:
AtomicInteger count = map.computeIfAbsent(s, (k) -> new AtomicInteger(0));
count.incrementAndGet();
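For reference, both approaches can be wrapped in a small runnable harness to check that they record the same number of hits under concurrent access. This is only a sketch: the class name HitCounterDemo, the helper methods, and the thread and hit counts are assumptions made for illustration and are not part of the original snippets.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical harness; names and parameters are illustrative only.
class HitCounterDemo {

    // Approach 1: plain get, then putIfAbsent on a miss.
    static void countWithPutIfAbsent(ConcurrentHashMap<String, AtomicInteger> map, String s) {
        AtomicInteger count = map.get(s);
        if (count == null) {
            count = new AtomicInteger(0);
            AtomicInteger prevCount = map.putIfAbsent(s, count);
            if (prevCount != null) {
                count = prevCount; // another thread inserted first, use its counter
            }
        }
        count.incrementAndGet();
    }

    // Approach 2: computeIfAbsent.
    static void countWithComputeIfAbsent(ConcurrentHashMap<String, AtomicInteger> map, String s) {
        map.computeIfAbsent(s, k -> new AtomicInteger(0)).incrementAndGet();
    }

    // Starts `threads` workers that each record `hitsPerThread` hits on key "A",
    // then returns the final count stored in the map.
    static int run(boolean usePutIfAbsent, int threads, int hitsPerThread) {
        ConcurrentHashMap<String, AtomicInteger> map = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < hitsPerThread; i++) {
                    if (usePutIfAbsent) {
                        countWithPutIfAbsent(map, "A");
                    } else {
                        countWithComputeIfAbsent(map, "A");
                    }
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        AtomicInteger count = map.get("A");
        return count == null ? 0 : count.get();
    }

    public static void main(String[] args) {
        System.out.println("putIfAbsent:     " + run(true, 4, 10_000));
        System.out.println("computeIfAbsent: " + run(false, 4, 10_000));
    }
}
```

Both variants should report threads * hitsPerThread hits, so the two approaches differ in cost rather than in the final count.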
Which one is better suited for this task? Are there other approaches?
Answer 1:
Unfortunately, up to jdk1.8.0_131, computeIfAbsent always goes into the synchronized block, regardless of whether the key is already there or not, which makes it much slower than putIfAbsent.
The benchmark below confirms this: depending on the contention level, putIfAbsent appears to be 2 to 50 times faster than computeIfAbsent.
import java.util.Random;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

public class ConcurrentHashMapTest {
    private final static int numberOfRuns = 1000000;
    private final static int numberOfThreads = Runtime.getRuntime().availableProcessors();
    private final static int keysSize = 10;
    private final static String[] strings = new String[keysSize];
    static {
        for (int n = 0; n < keysSize; n++) {
            strings[n] = "" + (char) ('A' + n);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        for (int n = 0; n < 20; n++) {
            testPutIfAbsent();
            testComputeIfAbsentLambda();
        }
    }

    private static void testPutIfAbsent() throws InterruptedException {
        final AtomicLong totalTime = new AtomicLong();
        final ConcurrentHashMap<String, AtomicInteger> map = new ConcurrentHashMap<String, AtomicInteger>();
        final Random random = new Random();
        ExecutorService executorService = Executors.newFixedThreadPool(numberOfThreads);
        for (int i = 0; i < numberOfThreads; i++) {
            executorService.execute(new Runnable() {
                @Override
                public void run() {
                    long start, end;
                    for (int n = 0; n < numberOfRuns; n++) {
                        String s = strings[random.nextInt(strings.length)];
                        start = System.nanoTime();
                        AtomicInteger count = map.get(s);
                        if (count == null) {
                            count = new AtomicInteger(0);
                            AtomicInteger prevCount = map.putIfAbsent(s, count);
                            if (prevCount != null) {
                                count = prevCount;
                            }
                        }
                        count.incrementAndGet();
                        end = System.nanoTime();
                        totalTime.addAndGet(end - start);
                    }
                }
            });
        }
        executorService.shutdown();
        executorService.awaitTermination(Long.MAX_VALUE, TimeUnit.DAYS);
        System.out.println("Test " + Thread.currentThread().getStackTrace()[1].getMethodName()
                + " average time per run: " + (double) totalTime.get() / numberOfThreads / numberOfRuns + " ns");
    }

    private static void testComputeIfAbsentLambda() throws InterruptedException {
        final AtomicLong totalTime = new AtomicLong();
        final ConcurrentHashMap<String, AtomicInteger> map = new ConcurrentHashMap<String, AtomicInteger>();
        final Random random = new Random();
        ExecutorService executorService = Executors.newFixedThreadPool(numberOfThreads);
        for (int i = 0; i < numberOfThreads; i++) {
            executorService.execute(new Runnable() {
                @Override
                public void run() {
                    long start, end;
                    for (int n = 0; n < numberOfRuns; n++) {
                        String s = strings[random.nextInt(strings.length)];
                        start = System.nanoTime();
                        AtomicInteger count = map.computeIfAbsent(s, (k) -> new AtomicInteger(0));
                        count.incrementAndGet();
                        end = System.nanoTime();
                        totalTime.addAndGet(end - start);
                    }
                }
            });
        }
        executorService.shutdown();
        executorService.awaitTermination(Long.MAX_VALUE, TimeUnit.DAYS);
        System.out.println("Test " + Thread.currentThread().getStackTrace()[1].getMethodName()
                + " average time per run: " + (double) totalTime.get() / numberOfThreads / numberOfRuns + " ns");
    }
}
The output:
Test testPutIfAbsent average time per run: 115.756501 ns
Test testComputeIfAbsentLambda average time per run: 276.9667055 ns
Test testPutIfAbsent average time per run: 134.2332435 ns
Test testComputeIfAbsentLambda average time per run: 223.222063625 ns
Test testPutIfAbsent average time per run: 119.968893625 ns
Test testComputeIfAbsentLambda average time per run: 216.707419875 ns
Test testPutIfAbsent average time per run: 116.173902375 ns
Test testComputeIfAbsentLambda average time per run: 215.632467375 ns
Test testPutIfAbsent average time per run: 112.21422775 ns
Test testComputeIfAbsentLambda average time per run: 210.29563725 ns
Test testPutIfAbsent average time per run: 120.50643475 ns
Test testComputeIfAbsentLambda average time per run: 200.79536475 ns
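A commonly used way to keep the hit path off the lock, which is not measured by the benchmark above, is to try a plain get first and fall back to computeIfAbsent only on a miss. Because the fallback still goes through computeIfAbsent, the mapping function is applied at most once per key. The sketch below assumes this pattern; the class name GetFirst and the method name getOrCompute are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical helper; not part of the JDK or of the benchmark above.
class GetFirst {

    // Lock-free read on a hit; computeIfAbsent is used only when the key is missing.
    static <K, V> V getOrCompute(ConcurrentHashMap<K, V> map, K key,
                                 Function<? super K, ? extends V> mappingFunction) {
        V value = map.get(key);
        if (value == null) {
            value = map.computeIfAbsent(key, mappingFunction);
        }
        return value;
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, AtomicInteger> map = new ConcurrentHashMap<>();
        // Record three hits for the same key.
        for (int i = 0; i < 3; i++) {
            getOrCompute(map, "A", k -> new AtomicInteger(0)).incrementAndGet();
        }
        System.out.println(map.get("A").get());
    }
}
```

Whether this closes the gap to putIfAbsent on a given JDK would have to be measured; it is shown here only as a variant that keeps the single-invocation guarantee.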
We can use the putIfAbsent approach to build a faster computeIfAbsent. The only difference is that this new computeIfAbsent may call the initialization function more than once when the same key is initialized concurrently. The benchmark results are the same as for putIfAbsent, which is no big surprise since it is the same code, but in case anyone would like to test this, here is the benchmark:
import java.util.Random;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Function;

public class CocnurrentHashMap2Benchmark {
    private final static int numberOfRuns = 1000000;
    private final static int numberOfThreads = Runtime.getRuntime().availableProcessors();
    private final static int keysSize = 10;
    private final static String[] strings = new String[keysSize];
    static {
        for (int n = 0; n < keysSize; n++) {
            strings[n] = "" + (char) ('A' + n);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        for (int n = 0; n < 20; n++) {
            testPutIfAbsent();
            testComputeIfAbsent2Lambda();
        }
    }

    private static void testPutIfAbsent() throws InterruptedException {
        final AtomicLong totalTime = new AtomicLong();
        final ConcurrentHashMap<String, AtomicInteger> map = new ConcurrentHashMap<String, AtomicInteger>();
        final Random random = new Random();
        ExecutorService executorService = Executors.newFixedThreadPool(numberOfThreads);
        for (int i = 0; i < numberOfThreads; i++) {
            executorService.execute(new Runnable() {
                @Override
                public void run() {
                    long start, end;
                    for (int n = 0; n < numberOfRuns; n++) {
                        String s = strings[random.nextInt(strings.length)];
                        start = System.nanoTime();
                        AtomicInteger count = map.get(s);
                        if (count == null) {
                            count = new AtomicInteger(0);
                            AtomicInteger prevCount = map.putIfAbsent(s, count);
                            if (prevCount != null) {
                                count = prevCount;
                            }
                        }
                        count.incrementAndGet();
                        end = System.nanoTime();
                        totalTime.addAndGet(end - start);
                    }
                }
            });
        }
        executorService.shutdown();
        executorService.awaitTermination(Long.MAX_VALUE, TimeUnit.DAYS);
        System.out.println("Test " + Thread.currentThread().getStackTrace()[1].getMethodName()
                + " average time per run: " + (double) totalTime.get() / numberOfThreads / numberOfRuns + " ns");
    }

    private static void testComputeIfAbsent2Lambda() throws InterruptedException {
        final AtomicLong totalTime = new AtomicLong();
        final ConcurrentHashMap2<String, AtomicInteger> map = new ConcurrentHashMap2<String, AtomicInteger>();
        final Random random = new Random();
        ExecutorService executorService = Executors.newFixedThreadPool(numberOfThreads);
        for (int i = 0; i < numberOfThreads; i++) {
            executorService.execute(new Runnable() {
                @Override
                public void run() {
                    long start, end;
                    for (int n = 0; n < numberOfRuns; n++) {
                        String s = strings[random.nextInt(strings.length)];
                        start = System.nanoTime();
                        AtomicInteger count = map.computeIfAbsent2(s, (k) -> new AtomicInteger(0));
                        count.incrementAndGet();
                        end = System.nanoTime();
                        totalTime.addAndGet(end - start);
                    }
                }
            });
        }
        executorService.shutdown();
        executorService.awaitTermination(Long.MAX_VALUE, TimeUnit.DAYS);
        System.out.println("Test " + Thread.currentThread().getStackTrace()[1].getMethodName()
                + " average time per run: " + (double) totalTime.get() / numberOfThreads / numberOfRuns + " ns");
    }

    public static class ConcurrentHashMap2<K,V> extends ConcurrentHashMap<K,V> {
        /**
         * If there is no mapping for the key then computes and puts the mapping,
         * otherwise it simply returns the value for that key.
         * In case of concurrent initialization of the same key the mappingFunction can be called more than once.
         * @param key - the key to be initialized or retrieved
         * @param mappingFunction - the function to be called for computation of initial value.
         * @return computed value if the key wasn't already in the map, otherwise the actual value for the provided key.
         */
        public V computeIfAbsent2(K key, Function<K,V> mappingFunction) {
            V value = get(key);
            if (value == null) {
                value = mappingFunction.apply(key);
                V prevValue = putIfAbsent(key, value);
                if (prevValue != null) {
                    value = prevValue;
                }
            }
            return value;
        }
    }
}
The results:
Test testComputeIfAbsent2Lambda average time per run: 138.1053415 ns
Test testPutIfAbsent average time per run: 129.45236425 ns
Test testComputeIfAbsent2Lambda average time per run: 128.48006825 ns
Test testPutIfAbsent average time per run: 118.733798375 ns
Test testComputeIfAbsent2Lambda average time per run: 134.038046625 ns
Test testPutIfAbsent average time per run: 119.7947695 ns
Test testComputeIfAbsent2Lambda average time per run: 134.183876375 ns
Test testPutIfAbsent average time per run: 137.969932625 ns
Test testComputeIfAbsent2Lambda average time per run: 137.97531275 ns
Test testPutIfAbsent average time per run: 136.904379125 ns
Test testComputeIfAbsent2Lambda average time per run: 148.899750125 ns
Test testPutIfAbsent average time per run: 129.788293125 ns
Test testComputeIfAbsent2Lambda average time per run: 141.50586625 ns
Test testPutIfAbsent average time per run: 129.081558875 ns
Test testComputeIfAbsent2Lambda average time per run: 122.36628625 ns
Test testPutIfAbsent average time per run: 127.1215535 ns
Test testComputeIfAbsent2Lambda average time per run: 108.129917625 ns
Test testPutIfAbsent average time per run: 133.630786875 ns
Test testComputeIfAbsent2Lambda average time per run: 134.978805625 ns
Test testPutIfAbsent average time per run: 132.7747585 ns
Test testComputeIfAbsent2Lambda average time per run: 132.4352885 ns
Test testPutIfAbsent average time per run: 133.753792875 ns
Test testComputeIfAbsent2Lambda average time per run: 134.09569175 ns
Test testPutIfAbsent average time per run: 145.610141125 ns
Test testComputeIfAbsent2Lambda average time per run: 139.437622125 ns
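The caveat mentioned earlier, that the get/putIfAbsent variant may create a value that is then thrown away, can be illustrated without threads by simulating a lost race: the mapping function itself inserts a competing value before returning. This is a sketch under that assumption; the class name LostRaceDemo and the method loserIsDiscarded are hypothetical, and computeIfAbsent2 is rewritten here as a static helper with the same logic as in the benchmark.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical demo class; only the body of computeIfAbsent2 mirrors the benchmark code.
class LostRaceDemo {

    // Same get / putIfAbsent logic as computeIfAbsent2, as a static helper.
    static <K, V> V computeIfAbsent2(ConcurrentHashMap<K, V> map, K key, Function<K, V> mappingFunction) {
        V value = map.get(key);
        if (value == null) {
            value = mappingFunction.apply(key);
            V prevValue = map.putIfAbsent(key, value);
            if (prevValue != null) {
                value = prevValue;
            }
        }
        return value;
    }

    // Returns true if the value built by the mapping function was discarded
    // in favour of the value that was inserted first.
    static boolean loserIsDiscarded() {
        ConcurrentHashMap<String, AtomicInteger> map = new ConcurrentHashMap<>();
        AtomicInteger winner = new AtomicInteger(100);
        AtomicInteger loser = new AtomicInteger(0);
        AtomicInteger result = computeIfAbsent2(map, "A", k -> {
            map.put(k, winner); // stands in for another thread inserting first
            return loser;
        });
        return result == winner && map.get("A") == winner;
    }

    public static void main(String[] args) {
        System.out.println("loser discarded: " + loserIsDiscarded());
    }
}
```

This is harmless for cheap values such as new AtomicInteger(0), but it suggests keeping the mapping function inexpensive and free of side effects when this variant is used.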
If we compare the speed of putting a mapping into the map when the key does not already exist, the new computeIfAbsent2 also appears to be faster in most runs. The benchmark:
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Function;

public class CocnurrentHashMap2PutBenchmark {
    private final static int numberOfRuns = 1000000;
    private final static int numberOfThreads = Runtime.getRuntime().availableProcessors();

    public static void main(String[] args) throws InterruptedException {
        for (int n = 0; n < 20; n++) {
            testComputeIfAbsent2();
            testComputeIfAbsent();
        }
    }

    private static void testComputeIfAbsent2() throws InterruptedException {
        final AtomicLong totalTime = new AtomicLong();
        final ConcurrentHashMap2<Integer, String> map = new ConcurrentHashMap2<Integer, String>();
        ExecutorService executorService = Executors.newFixedThreadPool(numberOfThreads);
        for (int i = 0; i < numberOfThreads; i++) {
            executorService.execute(new Runnable() {
                @Override
                public void run() {
                    long start, end;
                    for (int n = 0; n < numberOfRuns; n++) {
                        Integer key = Integer.valueOf(n);
                        start = System.nanoTime();
                        String value = map.computeIfAbsent2(key, (k) -> "value");
                        end = System.nanoTime();
                        totalTime.addAndGet(end - start);
                    }
                }
            });
        }
        executorService.shutdown();
        executorService.awaitTermination(Long.MAX_VALUE, TimeUnit.DAYS);
        System.out.println("Test " + Thread.currentThread().getStackTrace()[1].getMethodName()
                + " average time per run: " + (double) totalTime.get() / numberOfThreads / numberOfRuns + " ns");
    }

    private static void testComputeIfAbsent() throws InterruptedException {
        final AtomicLong totalTime = new AtomicLong();
        final ConcurrentHashMap<Integer, String> map = new ConcurrentHashMap<Integer, String>();
        ExecutorService executorService = Executors.newFixedThreadPool(numberOfThreads);
        for (int i = 0; i < numberOfThreads; i++) {
            executorService.execute(new Runnable() {
                @Override
                public void run() {
                    long start, end;
                    for (int n = 0; n < numberOfRuns; n++) {
                        Integer key = Integer.valueOf(n);
                        start = System.nanoTime();
                        String value = map.computeIfAbsent(key, (k) -> "value");
                        end = System.nanoTime();
                        totalTime.addAndGet(end - start);
                    }
                }
            });
        }
        executorService.shutdown();
        executorService.awaitTermination(Long.MAX_VALUE, TimeUnit.DAYS);
        System.out.println("Test " + Thread.currentThread().getStackTrace()[1].getMethodName()
                + " average time per run: " + (double) totalTime.get() / numberOfThreads / numberOfRuns + " ns");
    }

    public static class ConcurrentHashMap2<K, V> extends ConcurrentHashMap<K, V> {
        /**
         * If there is no mapping for the key then computes and puts the
         * mapping, otherwise it simply returns the value for that key. In case
         * of concurrent initialization of the same key the mappingFunction can
         * be called more than once.
         *
         * @param key
         *            - the key to be initialized or retrieved
         * @param mappingFunction
         *            - the function to be called for computation of initial
         *            value.
         * @return computed value if the key wasn't already in the map,
         *         otherwise the actual value for the provided key.
         */
        public V computeIfAbsent2(K key, Function<K, V> mappingFunction) {
            V value = get(key);
            if (value == null) {
                value = mappingFunction.apply(key);
                V prevValue = putIfAbsent(key, value);
                if (prevValue != null) {
                    value = prevValue;
                }
            }
            return value;
        }
    }
}
The results:
Test testComputeIfAbsent2 average time per run: 445.077932375 ns
Test testComputeIfAbsent average time per run: 784.786391 ns
Test testComputeIfAbsent2 average time per run: 294.10136375 ns
Test testComputeIfAbsent average time per run: 314.8724765 ns
Test testComputeIfAbsent2 average time per run: 236.56533275 ns
Test testComputeIfAbsent average time per run: 350.863664625 ns
Test testComputeIfAbsent2 average time per run: 346.19498275 ns
Test testComputeIfAbsent average time per run: 641.995172625 ns
Test testComputeIfAbsent2 average time per run: 255.441646125 ns
Test testComputeIfAbsent average time per run: 326.399150125 ns
Test testComputeIfAbsent2 average time per run: 275.626666125 ns
Test testComputeIfAbsent average time per run: 201.207314125 ns
Test testComputeIfAbsent2 average time per run: 289.19059725 ns
Test testComputeIfAbsent average time per run: 318.448059 ns
Test testComputeIfAbsent2 average time per run: 225.19701825 ns
Test testComputeIfAbsent average time per run: 306.461814125 ns
Test testComputeIfAbsent2 average time per run: 213.460366 ns
Test testComputeIfAbsent average time per run: 334.325044625 ns
Test testComputeIfAbsent2 average time per run: 256.4048955 ns
Test testComputeIfAbsent average time per run: 256.366700625 ns
Test testComputeIfAbsent2 average time per run: 231.88875575 ns
Test testComputeIfAbsent average time per run: 246.076624 ns
Test testComputeIfAbsent2 average time per run: 222.4649485 ns
Test testComputeIfAbsent average time per run: 266.505719625 ns
Test testComputeIfAbsent2 average time per run: 228.708391375 ns
Test testComputeIfAbsent average time per run: 261.866442625 ns
Test testComputeIfAbsent2 average time per run: 198.614718875 ns
Test testComputeIfAbsent average time per run: 225.43031925 ns
Test testComputeIfAbsent2 average time per run: 300.478359 ns
Test testComputeIfAbsent average time per run: 306.03640225 ns
Test testComputeIfAbsent2 average time per run: 195.0444215 ns
Test testComputeIfAbsent average time per run: 271.461982625 ns
Test testComputeIfAbsent2 average time per run: 224.306529875 ns
Test testComputeIfAbsent average time per run: 334.52790425 ns
Test testComputeIfAbsent2 average time per run: 212.217131625 ns
Test testComputeIfAbsent average time per run: 184.541579125 ns
Test testComputeIfAbsent2 average time per run: 265.417909625 ns
Test testComputeIfAbsent average time per run: 213.9811425 ns
Test testComputeIfAbsent2 average time per run: 298.76602575 ns
Test testComputeIfAbsent average time per run: 347.883728125 ns
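For the counting use case specifically, another option is the frequency-map idiom described in the ConcurrentHashMap Javadoc: LongAdder values created through computeIfAbsent. It is not covered by the benchmarks above, so how it compares on the JDK versions discussed here would have to be measured. The sketch below assumes that idiom; the class name LongAdderCounter and its methods are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical wrapper around the Javadoc idiom; not benchmarked in this answer.
class LongAdderCounter {
    private final ConcurrentHashMap<String, LongAdder> freqs = new ConcurrentHashMap<>();

    // Records one hit for the key, creating its LongAdder on first use.
    void hit(String key) {
        freqs.computeIfAbsent(key, k -> new LongAdder()).increment();
    }

    // Returns the current count for the key, or 0 if it was never hit.
    long count(String key) {
        LongAdder adder = freqs.get(key);
        return adder == null ? 0L : adder.sum();
    }

    public static void main(String[] args) {
        LongAdderCounter counter = new LongAdderCounter();
        counter.hit("A");
        counter.hit("A");
        counter.hit("B");
        System.out.println("A=" + counter.count("A") + ", B=" + counter.count("B"));
    }
}
```

LongAdder is designed for counters that are updated by many threads and read rarely, which matches the hit-counting scenario in the question.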
Source: https://stackoverflow.com/questions/44969543/fastest-way-to-initialize-a-value-for-concurrenthashmap