I need a way to do key-value lookups across (potentially) hundreds of GB of data. Ideally something based on a distributed hashtable, that works nicely with Java. It should be
Try distributed Map structure from Redisson, it based on Redis server. Using Redis cluster configuration you may split data across 1000 servers.
Usage example:
Redisson redisson = Redisson.create();
ConcurrentMap map = redisson.getMap("anyMap");
map.put("123", new SomeObject());
map.putIfAbsent("323", new SomeObject());
map.remove("123");
...
redisson.shutdown();