I need to gather some statistics in my software and i am trying to make it fast and correct, which is not easy (for me!)
first my code so far with two classes, a StatsS
A different approach to the problem is to exploit the (trivial) thread safety via thread confinement. Basically create a single background thread that takes care of both reading and writing. It has a pretty good characteristics in terms of scalability and simplicity.
The idea is that instead of all the threads trying to update the data directly, they produce an "update" task for the background thread to process. The same thread can also do the read task, assuming some lags in processing updates is tolerable.
This design is pretty nice because the threads will no longer have to compete for a lock to update data, and since the map is confined to a single thread you can simply use a plain HashMap to do get/put, etc. In terms of implementation, it would mean creating a single threaded executor, and submitting write tasks which may also perform the optional "collectAndSave" operation.
A sketch of code may look like the following:
public class StatsService {
private ExecutorService executor = Executors.newSingleThreadExecutor();
private final Map stats = new HashMap();
public void notify(final String key) {
Runnable r = new Runnable() {
public void run() {
Long value = stats.get(key);
if (value == null) {
value = 1L;
} else {
value++;
}
stats.put(key, value);
// do the optional collectAndSave periodically
if (timeToDoCollectAndSave()) {
collectAndSave();
}
}
};
executor.execute(r);
}
}
There is a BlockingQueue associated with an executor, and each thread that produces a task for the StatsService uses the BlockingQueue. The key point is this: the locking duration for this operation should be much shorter than the locking duration in the original code, so the contention should be much less. Overall it should result in a much better throughput and latency.
Another benefit is that since only one thread reads and writes to the map, plain HashMap and primitive long type can be used (no ConcurrentHashMap or atomic types involved). This also simplifies the code that actually processes it a great deal.
Hope it helps.