I am looking for a simple in-memory (and in-process) cache for short-term caching of query data (but short-term meaning beyond request/response, i.e. session boundary). EhCache
It's possible to define a meaningful measure for the memory usage of a cache: the "retained size". Unfortunately, computing the retained size is roughly as costly as a full GC, so it's probably not an option. In certain JVM languages (Clojure?) you could theoretically make sure that no objects in the cache are referenced from outside objects, and then you could monitor the real size of the cache.
How about using a simple LinkedHashMap in access order (which gives you LRU behaviour) and wrapping every value in a SoftReference, such as cache.put(key, new SoftReference(value))?
This would limit your cache to the amount of available memory without killing the rest of your program, because the JVM clears soft references when there is memory demand: not all of them at once, and usually the oldest first. If you add a reference queue to your implementation, you can also remove the stale entries (key only, no value) from the map.
This would free you from calculating the size of the entries and keeping track of the sum.
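A minimal sketch of that idea, assuming a single-threaded caller. The class name SoftLruCache and the maxEntries bound are made up for illustration; the entry bound is only there to keep the map itself from growing without limit.

```java
import java.lang.ref.SoftReference;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical class name; not thread-safe as written.
final class SoftLruCache<K, V> {
    private final Map<K, SoftReference<V>> map;

    SoftLruCache(final int maxEntries) {
        // accessOrder = true: iteration order runs from least to most recently accessed.
        this.map = new LinkedHashMap<K, SoftReference<V>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, SoftReference<V>> eldest) {
                // Drop the least recently used entry once the bound is exceeded.
                return size() > maxEntries;
            }
        };
    }

    void put(K key, V value) {
        map.put(key, new SoftReference<>(value));
    }

    V get(K key) {
        SoftReference<V> ref = map.get(key);
        if (ref == null) {
            return null; // never cached, or already evicted
        }
        V value = ref.get();
        if (value == null) {
            map.remove(key); // the GC cleared the value; drop the stale entry
        }
        return value;
    }

    int entryCount() {
        return map.size();
    }
}
```

A caller has to treat a null from get as a cache miss and reload the data, since the garbage collector may clear any softly reachable value at its discretion.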
It's not just hard to measure - it's hard to define.
Suppose two cache entries refer to the same string - do they both count the size of that string, despite the fact that removing either of them from the cache wouldn't make the string eligible for garbage collection? Do neither of them count the size, despite the fact that if both of them are removed from the cache the string may then be eligible for collection? What about if another object not in the cache has a reference to that string?
If you can accurately describe the size you're interested in it may be possible to ascertain it programmatically - but I suspect you'll find it's hard even to decide exactly what you want.
As well as guessing the memory usage of the object, for a reasonable algorithm you would also need to guess the cost of recreating it. A reasonable guess would be the cost of recreation is roughly proportional to memory size. So the factors cancel each other out and you need neither. A simple algorithm is probably going to work out better.
If you cannot make any estimates, write a cache eviction policy that flushes based on JVM heap usage (polled from Runtime) or is triggered by a finalize() call from an orphaned object (on GC).
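The heap-polling variant could look roughly like the sketch below. HeapAwareCache and maxUsedFraction are hypothetical names, and the check is crude: the "used" figure includes garbage that has not been collected yet, so the cache may flush earlier than strictly necessary. The finalize()-triggered variant is not shown.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical class name; flushes the whole cache when heap usage looks high.
final class HeapAwareCache<K, V> {
    private final Map<K, V> map = new ConcurrentHashMap<>();
    private final double maxUsedFraction;

    HeapAwareCache(double maxUsedFraction) {
        this.maxUsedFraction = maxUsedFraction;
    }

    // Fraction of the maximum heap currently allocated to objects, live or not.
    static double usedHeapFraction() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        return (double) used / rt.maxMemory();
    }

    void put(K key, V value) {
        // Poll the heap on every write; flush everything if usage is above the threshold.
        if (usedHeapFraction() > maxUsedFraction) {
            map.clear();
        }
        map.put(key, value);
    }

    V get(K key) {
        return map.get(key);
    }

    int entryCount() {
        return map.size();
    }
}
```

Polling on every put keeps the sketch short; a real implementation might poll on a timer instead, or evict a portion of the entries rather than clearing the map.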
The thing that does this job is java.lang.ref.SoftReference. Typically, you extend the SoftReference class so that the subclass contains the key.
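As a rough illustration of why the subclass carries the key: when the garbage collector clears a reference and enqueues it, the key tells you which map entry to remove. SoftValueCache and KeyedRef are hypothetical names, and the sketch is not thread-safe.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical class name; not thread-safe as written.
final class SoftValueCache<K, V> {

    // SoftReference subclass that remembers the key it was stored under.
    private static final class KeyedRef<K, V> extends SoftReference<V> {
        final K key;

        KeyedRef(K key, V value, ReferenceQueue<V> queue) {
            super(value, queue);
            this.key = key;
        }
    }

    private final Map<K, KeyedRef<K, V>> map = new HashMap<>();
    private final ReferenceQueue<V> queue = new ReferenceQueue<>();

    void put(K key, V value) {
        purge();
        map.put(key, new KeyedRef<>(key, value, queue));
    }

    V get(K key) {
        purge();
        KeyedRef<K, V> ref = map.get(key);
        return ref == null ? null : ref.get();
    }

    int entryCount() {
        purge();
        return map.size();
    }

    // Remove map entries whose values the GC has already cleared.
    @SuppressWarnings("unchecked")
    private void purge() {
        Reference<? extends V> ref;
        while ((ref = queue.poll()) != null) {
            KeyedRef<K, V> keyed = (KeyedRef<K, V>) ref;
            // Two-argument remove: only drop the entry if it still holds this exact
            // reference, so a newer value stored under the same key is left alone.
            map.remove(keyed.key, keyed);
        }
    }
}
```

Purging on each call avoids a background thread; the cost is that stale keys linger until the cache is next touched.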