I am trying to look for a minhash open source implementation which I can leverage for my work.
The functionality I need is very simple, given a set as input, the impleme
I would suggest you this library, especially if you need persistence. Here, you can use redis to store/retrieve all your data.
You have the option to select a redis database, or to simply use built-in in-memory python dictionaries.
Performances using redis, at least if redis server is running on your local machine, are almost identical to those achieved via standard python dictionaries.
You only need to specify a config dictionary such as
config = {"redis": {"host": 'localhost', "port": '6739', "db": 0}}
and pass it as an argument to the LSHash
class constructor.