Question
I have a list of 9 million IPs and, with a set of hash tables, I can make a constant-time function that returns whether a particular IP is in that list. Can I do it in PHP? If so, how?
Answer 1:
The interesting thing about this question is the number of directions you can go.
I'm not sure if caching is your best option simply because of the large set of data and the relatively low number of queries on it. Here are a few ideas.
1) Build a RAM disk and store your MySQL database table on the RAM disk partition. I've never tried this, but it would be fun to try.
2) Linux generally has a very fast file system. Build a structured file system that breaks up the records into files, and just call file_get_contents() or file_exists(). Of course this solution would require you to build and maintain the file system, which would also be fun. rsync might be helpful to keep your live filesystem up to date.
Example:
/002/209/001/299.txt
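<?php
// build_file_from_ip() is a helper you would write yourself; it maps an
// IP to a path like the one above. Validate $_GET['ip'] inside it.
$file = $this->build_file_from_ip($_GET['ip']);
if (file_exists($file)) {
    // Execute your code.
}
?>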
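The answer leaves build_file_from_ip() undefined. A minimal sketch of what it might look like follows, written as a free function rather than a method for brevity. The root directory /var/ipset and the zero-padded three-digit layout are assumptions based on the example path above, not something the answer specifies. Validating the input first also keeps a crafted value such as "../etc/passwd" from turning into a path.

```php
<?php
// Hypothetical helper: maps an IPv4 address to a file path such as
// <root>/002/209/001/099.txt. The $root default is an assumption.
function build_file_from_ip(string $ip, string $root = '/var/ipset'): ?string
{
    // Reject anything that is not a valid dotted-quad IPv4 address.
    if (filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) === false) {
        return null;
    }

    // Split into four octets and zero-pad each one to three digits.
    $octets = array_map('intval', explode('.', $ip));

    return sprintf('%s/%03d/%03d/%03d/%03d.txt', $root, ...$octets);
}

// Usage: the IP is "in the list" if its marker file exists.
$file = build_file_from_ip('2.209.1.99');
if ($file !== null && file_exists($file)) {
    // Execute your code.
}
```

With this layout each lookup is a single file_exists() call, at the cost of one (possibly empty) file per IP on disk.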
Answer 2:
This sounds to me like an ideal application for a Bloom filter. Have a look at the links below, which might help you get it done quickly.
- http://github.com/mj/php-bloomfilter
- http://code.google.com/p/php-bloom-filter/
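To show the idea independently of either library, here is a minimal Bloom filter sketch in plain PHP. It is not the API of the linked projects; the class name, the method names, and the choice of crc32 plus md5 with double hashing are all assumptions made for illustration, and it assumes a 64-bit PHP build. A Bloom filter can report false positives but never false negatives, so a "maybe" answer should be confirmed against the real list if exactness matters. By the standard sizing formulas, 9 million entries at a 1% false-positive rate need roughly 86 million bits (about 11 MB) and 7 hash functions.

```php
<?php
// Illustrative Bloom filter; names and hash choices are assumptions.
final class BloomFilter
{
    private string $bits;   // bit array packed into a binary string
    private int $numBits;
    private int $numHashes;

    public function __construct(int $numBits, int $numHashes)
    {
        $this->numBits = $numBits;
        $this->numHashes = $numHashes;
        $this->bits = str_repeat("\0", intdiv($numBits + 7, 8));
    }

    // Double hashing: position_i = (h1 + i * h2) mod numBits.
    private function positions(string $item): array
    {
        $h1 = crc32($item);
        $h2 = hexdec(substr(md5($item), 0, 7)) | 1; // force an odd step
        $out = [];
        for ($i = 0; $i < $this->numHashes; $i++) {
            $out[] = ($h1 + $i * $h2) % $this->numBits;
        }
        return $out;
    }

    public function add(string $item): void
    {
        foreach ($this->positions($item) as $p) {
            $byte = $p >> 3;
            $this->bits[$byte] = chr(ord($this->bits[$byte]) | (1 << ($p & 7)));
        }
    }

    // false means "definitely absent"; true means "probably present".
    public function mightContain(string $item): bool
    {
        foreach ($this->positions($item) as $p) {
            if ((ord($this->bits[$p >> 3]) & (1 << ($p & 7))) === 0) {
                return false;
            }
        }
        return true;
    }
}

// Usage with a deliberately small filter.
$filter = new BloomFilter(8192, 7);
$filter->add('10.0.0.1');
var_dump($filter->mightContain('10.0.0.1'));
```

The filter still has to be rebuilt or loaded on each request unless it is kept somewhere persistent, such as a file or shared memory, which this sketch does not cover.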
Answer 3:
I think throwing it in memcache would probably be your best/fastest method.
Answer 4:
If loading the file into SQLite is an option, you could benefit from indexes, which would speed up lookups.
Otherwise memcached is an option, but I don't know how well checking for existence would perform with pure PHP lookups (rather slowly, I'd guess).
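A rough sketch of the SQLite idea, assuming the pdo_sqlite extension is available. The table name, column name, and function names are made up for illustration. Each IP is stored as an integer via ip2long(), and INTEGER PRIMARY KEY gives an indexed lookup, which is logarithmic rather than strictly constant time. The in-memory database is used here only so the example is self-contained; in practice you would probably point the DSN at a file so the data survives between requests.

```php
<?php
// In-memory database for demonstration; a real setup would likely use
// a file DSN such as 'sqlite:/path/to/ips.db' (hypothetical path).
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE ips (ip INTEGER PRIMARY KEY)');

// Bulk-load IPs inside one transaction, skipping invalid entries.
function load_ips(PDO $db, iterable $ips): void
{
    $db->beginTransaction();
    $stmt = $db->prepare('INSERT OR IGNORE INTO ips (ip) VALUES (?)');
    foreach ($ips as $ip) {
        $n = ip2long($ip);
        if ($n !== false) {
            $stmt->execute([$n]);
        }
    }
    $db->commit();
}

// Returns true if the IP is present in the table.
function ip_in_list(PDO $db, string $ip): bool
{
    $n = ip2long($ip);
    if ($n === false) {
        return false;
    }
    $stmt = $db->prepare('SELECT 1 FROM ips WHERE ip = ? LIMIT 1');
    $stmt->execute([$n]);
    return $stmt->fetchColumn() !== false;
}

load_ips($db, ['10.0.0.1', '192.168.1.5']);
var_dump(ip_in_list($db, '10.0.0.1'));
```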
Answer 5:
Have you tried a NoSQL solution like Redis? The entire data set is managed in memory.
Here are some benchmarks.
Source: https://stackoverflow.com/questions/1545826/is-there-a-way-to-maintain-a-200mb-immutable-data-structure-in-memory-and-access