I need a way to do key-value lookups across (potentially) hundreds of GB of data. Ideally something based on a distributed hashtable, that works nicely with Java. It should be
Try distributed Map structure from Redisson, it based on Redis server. Using Redis cluster configuration you may split data across 1000 servers.
Usage example:
Redisson redisson = Redisson.create();
ConcurrentMap<String, SomeObject> map = redisson.getMap("anyMap");
map.put("123", new SomeObject());
map.putIfAbsent("323", new SomeObject());
map.remove("123");
...
redisson.shutdown();
Depending on the use case, Terracotta may be just what you need.
Open Chord is an implementation of the CHORD protocol in Java. It is a distributed hash table protocol that should fit your needs perfectly.
OpenChord sounds promising; but i'd also consider BDB, or any other non-SQL hashtable, making it distributed can be dead-easy (if the number of storage nodes is (almost) constant, at least), just hash the key on the client to get the appropriate server.