Never used a cache like this before. The problem is that I want to load 500,000 + records out of a database and do some selecting/filtering wicked fast.
I\'m thinkin
Choose a cache which complies to JSR 107 which will make your job easy when you want to migrate from one implementation to the other. To be specific on the question go for Ehcache which is more popular and widely used Java caching solution. We are using Ehcache extensively and it works for us.
They're both pretty solid projects. If you have pretty basic caching needs, either one of them will probably work as well as the other.
You may also wish to consider doing the filtering in a database query if it's feasible. Often, using a tuned query that returns a smaller result set will give you better performance than loading 500,000 rows into memory and then filtering them.
I've used JCS (http://jakarta.apache.org/jcs/) and it seems solid and easy to use programatically.
Either way, I recommend using them with Spring Modules. The cache can be transparent to the application, and cache implementations are trivially easy to swap. In addition to OSCache and EHCache, Spring Modules also support Gigaspaces and JBoss cache.
As to comparisons.... OSCache is easier to configure EHCache has more configuration options
They are both rock solid, both support mirroring cache, both work with Terracotta, both support in-memory and to-disk caching.
I have used oscache on several spring projects with spring-modules, using the aop based configuration.
Recently I looked to use oscache + spring modules on a Spring 3.x project, but found spring-modules annotation-based caching is not supported (even by the fork).
I recently found out about this project -
http://code.google.com/p/ehcache-spring-annotations/
Which supports spring 3.x with declarative annotation-based caching using ehcache.
Other answers discuss pros/cons for caches; but I am wondering whether you actually benefit from cache at all. It is not quite clear exactly what you plan on doing here, and why a cache would be beneficial: if you have the data set at your use, just access that. Cache only helps reuse things between otherwise independent tasks. If this is what you are doing, yes, caching can help. But if it is a big task that can carry along its data set, caching would add no value.