I need a way to do key-value lookups across (potentially) hundreds of GB of data. Ideally it would be based on a distributed hash table and work nicely with Java. It should be
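Well-known distributed hash table (DHT) designs include Tapestry, Chord, and Pastry. These are overlay-network protocols rather than finished key-value stores, so in practice you would look for a Java implementation of one of them. Any of the three should suit this kind of key-value lookup.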
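To give a feel for how a Chord-style DHT decides which node holds a key, here is a minimal single-process sketch in Java. It only illustrates the "hash onto a ring, walk clockwise to the next node" idea. It is not the API of any real Chord, Pastry, or Tapestry library, and it does no networking or routing. The class name `ChordRingSketch`, the node names, and the sample key are all made up for illustration. SHA-1 is used because the Chord paper uses it, but any well-distributed hash would serve.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration class; not part of any DHT library.
class ChordRingSketch {
    // Ring position -> node name, kept sorted by position.
    private final TreeMap<BigInteger, String> ring = new TreeMap<>();

    // Hash a string to a non-negative 160-bit position on the ring.
    static BigInteger position(String s) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            byte[] digest = sha1.digest(s.getBytes(StandardCharsets.UTF_8));
            return new BigInteger(1, digest); // signum 1 = treat bytes as unsigned
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    public void addNode(String name) {
        ring.put(position(name), name);
    }

    public void removeNode(String name) {
        ring.remove(position(name));
    }

    // The owner of a key is the first node at or after the key's position,
    // wrapping around to the lowest-positioned node at the end of the ring.
    public String lookup(String key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no nodes in the ring");
        }
        Map.Entry<BigInteger, String> successor = ring.ceilingEntry(position(key));
        return (successor != null ? successor : ring.firstEntry()).getValue();
    }

    public static void main(String[] args) {
        ChordRingSketch dht = new ChordRingSketch();
        dht.addNode("node-a"); // hypothetical node names
        dht.addNode("node-b");
        dht.addNode("node-c");
        System.out.println("user:42 is stored on " + dht.lookup("user:42"));
    }
}
```

The useful property for a data set of this size is that adding or removing a node only moves the keys that node owned; every other key stays where it was. A real DHT adds the parts this sketch leaves out: routing between machines, replication, and handling node failures.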