I actually think you should consider SQL Server 2008. Store the data in a table with a varbinary(max) column, and add a second column that holds a hash of that data. Index the hash column, as you suggested.
You'll then be able to use the product's distribution features, such as replication.
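To make the table layout concrete, here is a rough sketch of the "data column plus indexed hash column" idea. It is only an illustration under several assumptions: it uses Python's built-in sqlite3 module as a stand-in for SQL Server (a BLOB column plays the role of varbinary(max)), it picks SHA-256 as the hash, and the table, index, and function names (blobs, ix_blobs_hash, open_store, put, get) are made up for the example.

```python
import hashlib
import sqlite3


def open_store():
    """Create an in-memory database with a data column and an indexed hash column.

    Assumption: sqlite3 stands in for SQL Server here; BLOB plays the
    role of varbinary(max). All names are hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE blobs ("
        " id INTEGER PRIMARY KEY,"
        " content_hash BLOB NOT NULL,"  # hash of the content column
        " content BLOB NOT NULL)"       # the stored data itself
    )
    # Index the hash so lookups by hash do not scan the whole table.
    conn.execute("CREATE INDEX ix_blobs_hash ON blobs (content_hash)")
    return conn


def put(conn, data: bytes) -> bytes:
    """Store data alongside its SHA-256 digest and return the digest."""
    digest = hashlib.sha256(data).digest()
    conn.execute(
        "INSERT INTO blobs (content_hash, content) VALUES (?, ?)",
        (digest, data),
    )
    return digest


def get(conn, digest: bytes):
    """Return the first stored content matching the digest, or None."""
    row = conn.execute(
        "SELECT content FROM blobs WHERE content_hash = ?",
        (digest,),
    ).fetchone()
    return None if row is None else row[0]


# Usage: store a payload, then fetch it back by its hash.
conn = open_store()
key = put(conn, b"example payload")
assert get(conn, key) == b"example payload"
```

In SQL Server itself the equivalent would be a varbinary(max) column plus a fixed-width binary column for the hash with an ordinary index on it. The hash is computed in the application here, which keeps the sketch independent of any server-side hashing function.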