I use a large `HashMap` (millions of entries) to cache values needed by an algorithm; the key is a combination of two objects encoded as a `long`. Since it grows continuously (because keys in …
For a memory-aware cache, you may want to use Apache Commons Collections, in particular its `org.apache.commons.collections.map.ReferenceMap` class. The underlying Java mechanism is the soft reference. Java provides `WeakHashMap` for weak references, but weak references are not what you want for a cache: a weakly referenced value may be reclaimed at the very next GC cycle, regardless of memory pressure, whereas a softly referenced value is kept as long as memory allows. Java does not provide a `SoftHashMap`, but `ReferenceMap` from Apache Commons can be a workable substitute.
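For illustration, here is how such a map could be created (this assumes Commons Collections 3.x, where the reference strengths are `int` constants; the surrounding demo class is mine):

```java
import java.util.Map;

import org.apache.commons.collections.map.ReferenceMap;

public class ReferenceMapDemo {
    public static void main(String[] args) {
        // Keys are held by hard references, values by soft references:
        // the GC may clear entries when memory runs low, but not before.
        Map cache = new ReferenceMap(ReferenceMap.HARD, ReferenceMap.SOFT);
        cache.put(42L, "expensive result");
        System.out.println(cache.get(42L)); // the value, unless collected
    }
}
```

(Commons Collections 3.x predates generics, hence the raw `Map`.)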
The memory awareness of soft references is somewhat crude and inflexible. You can play with a few JVM options to tune it, especially the `-XX:SoftRefLRUPolicyMSPerMB` value, which expresses (in milliseconds per megabyte of free heap) how long softly referenced values are kept in memory once they cease to be strongly reachable. For instance, with this:

```
java -XX:SoftRefLRUPolicyMSPerMB=2500
```

the JVM will try to keep a cached value for 2.5 seconds per megabyte of free heap after its last use, whereas a `WeakHashMap` would let it go at the very next GC cycle.
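If you would rather not depend on Commons Collections, a soft-valued map is easy to sketch yourself. The class below is my own illustration, not a standard API; it cleans stale entries lazily, on `get()`, whereas a serious implementation would drain a `ReferenceQueue` instead (which is what `ReferenceMap` does internally):

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

public class SoftCache<K, V> {

    private final Map<K, SoftReference<V>> map =
        new HashMap<K, SoftReference<V>>();

    public void put(K key, V value) {
        // The value is only softly reachable through the cache,
        // so the GC is free to reclaim it under memory pressure.
        map.put(key, new SoftReference<V>(value));
    }

    public V get(K key) {
        SoftReference<V> ref = map.get(key);
        if (ref == null)
            return null;
        V value = ref.get();   // null if the GC already cleared it
        if (value == null)
            map.remove(key);   // drop the stale entry lazily
        return value;
    }
}
```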
If soft references do not provide what you are looking for, then you will have to implement your own cache strategy and, indeed, flush the map manually, which was your initial question. For flushing, you can either call the `clear()` method or simply create a new `HashMap` and let the old one be garbage collected. The difference should be slight, and you may even have trouble measuring it at all.
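Both options, sketched (the field name and types here are placeholders):

```java
import java.util.HashMap;
import java.util.Map;

public class FlushDemo {

    private Map<Long, Object> cache = new HashMap<Long, Object>();

    // Option 1: keep the same map instance and empty it.
    public void flushInPlace() {
        cache.clear();
    }

    // Option 2: replace the map; the old one and all its entries
    // become garbage collectable in one go.
    public void flushByReplacement() {
        cache = new HashMap<Long, Object>();
    }
}
```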
Alternating between a "full cache" and an "empty cache" may also be considered a bit crude, so you could maintain several maps instead. For instance, maintain ten maps. When you look for a cached value, you look in all ten, but when you add a value, you put it in the first map only. When you want to flush, you rotate the maps: the first map becomes the second, the second becomes the third, and so on, up to the tenth map, which is discarded; a fresh first map is created. This would look like this:
```java
import java.util.HashMap;
import java.util.Map;

public class Cache<K, V> {

    private static final int MAX_SIZE = 500000;

    // backend[0] is the freshest map; higher indices hold older entries.
    private final Map<K, V>[] backend;
    private int size = 0;

    @SuppressWarnings("unchecked")
    public Cache(int n)
    {
        backend = new Map[n];
        for (int i = 0; i < n; i++)
            backend[i] = new HashMap<K, V>();
    }

    public int size()
    {
        return size;
    }

    public V get(K key)
    {
        // Probe every map, newest first.
        for (Map<K, V> m : backend) {
            if (m.containsKey(key))
                return m.get(key);
        }
        return null;
    }

    public V put(K key, V value)
    {
        // Key already in the freshest map: plain replacement.
        if (backend[0].containsKey(key))
            return backend[0].put(key, value);
        int n = backend.length;
        // Key known in an older map: move it to the freshest map.
        for (int i = 1; i < n; i++) {
            Map<K, V> m = backend[i];
            if (m.containsKey(key)) {
                V old = m.remove(key);
                backend[0].put(key, value);
                return old;
            }
        }
        // Genuinely new key.
        backend[0].put(key, value);
        size++;
        // Over the limit: rotate, discarding the oldest map each time.
        while (size > MAX_SIZE) {
            size -= backend[n - 1].size();
            System.arraycopy(backend, 0, backend, 1, n - 1);
            backend[0] = new HashMap<K, V>();
        }
        return null;
    }
}
```
The code above is completely untested, but it illustrates the main ideas: all maps are probed when reading (`get()`), all new values go into the first map, the total size is maintained, and when that size exceeds a given limit, the maps are rotated. Note the special treatment when a new value is put for an already-known key. Also, in this version, nothing special is done when a cached value is found, but we could "rejuvenate" accessed values: upon `get()`, when a value is found in a map other than the first, it could be moved into the first map. That way, frequently accessed values would remain cached forever.
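A hypothetical `get()` with that rejuvenation behaviour, for the same class as above, could look like this:

```java
// Variant of get() that promotes hits found in older maps back into the
// first (freshest) map, so frequently accessed entries survive rotations.
public V get(K key)
{
    for (int i = 0; i < backend.length; i++) {
        Map<K, V> m = backend[i];
        if (m.containsKey(key)) {
            V value = m.get(key);
            if (i > 0) {
                m.remove(key);              // found in an older map:
                backend[0].put(key, value); // move it to the freshest one
            }
            return value;
        }
    }
    return null;
}
```

Since the entry is moved rather than copied, the total number of entries is unchanged and `size` needs no adjustment.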