I'll use a university's library system to explain my use case. Students register in the library system and provide their profile: gender, age, department, previously completed
Any time we are looking at large data sets (which is what this question is about: whether or not Drools is a good fit for a large data set), think outside the box (see below). Any time we are talking about "millions of objects" or similar log-N type problems, I don't think the tool in question is necessarily the problem. So yes, Drools (or JBoss Rules) can be used, BUT it would only make sense in a certain context...
When you have log-N of anything (cross-referencing large data sets against inputs), I would recommend more novel approaches such as database-backed Bloom filters. These can be implemented as Java objects and referenced by Drools for the fact lookup (this does require some custom coding, however).
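To make the idea concrete, here is a minimal sketch of a Bloom filter as a plain Java object. It is not my production implementation and not anything that ships with Drools: the class name `SimpleBloomFilter`, the `bitsPerItem` parameter, and the FNV-1a based hashing are all assumptions chosen for illustration. At about 10 bits per item the expected false-positive rate is roughly 1%.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

/** Illustrative Bloom filter: insert() and contains() only, no deletion. */
final class SimpleBloomFilter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    /**
     * @param expectedItems number of keys you plan to insert
     * @param bitsPerItem   about 10 bits per item gives roughly a 1% false-positive rate
     */
    SimpleBloomFilter(int expectedItems, int bitsPerItem) {
        this.numBits = Math.max(64, expectedItems * bitsPerItem);
        this.bits = new BitSet(numBits);
        // The optimal number of hash functions is (bits per item) * ln 2.
        this.numHashes = Math.max(1, (int) Math.round(bitsPerItem * Math.log(2)));
    }

    /** Sets one bit per hash function for the given key. */
    void insert(String key) {
        long h = hash64(key);
        int h1 = (int) h;
        int h2 = (int) (h >>> 32) | 1; // force odd so the step is never zero
        for (int i = 0; i < numHashes; i++) {
            bits.set(Math.floorMod(h1 + i * h2, numBits));
        }
    }

    /** false means "definitely absent"; true means "possibly present". */
    boolean contains(String key) {
        long h = hash64(key);
        int h1 = (int) h;
        int h2 = (int) (h >>> 32) | 1;
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(Math.floorMod(h1 + i * h2, numBits))) {
                return false;
            }
        }
        return true;
    }

    /** 64-bit FNV-1a followed by one mixing step; the two halves drive double hashing. */
    private static long hash64(String key) {
        long h = 0xcbf29ce484222325L;
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xff);
            h *= 0x100000001b3L;
        }
        h ^= h >>> 33;
        h *= 0xff51afd7ed558ccdL;
        h ^= h >>> 33;
        return h;
    }
}
```

An instance like this could be handed to the rules as an ordinary Java object (for example as a Drools global) so that a rule condition calls `contains(...)` instead of matching against a million inserted facts. For real use, a tested library implementation is probably a better choice than hand-rolled hashing.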
Since Bloom filters are tiny in-memory structures with only basic insert()/contains() functions, they do have a drawback: a false-positive rate of about 1% (they never give false negatives, so a "no" can always be trusted). So the filter serves as a primary cache. If you construct the Drools question so that the answer is generally "NO", a Bloom-filter-backed fact-table lookup will be lightning fast and have a tiny memory footprint (about 1.1 bytes per record in my implementation, so roughly 1 MB of RAM for this case). Then, in the "contains" case (which might be a false positive), use the database-backed fact table to confirm. Again, if the lookup is false 80% of the time, the Bloom filter will be a huge savings in memory and time. Otherwise, a pure lookup against 1M records (Drools facts, database, anything else) will be very expensive every time, in both memory and speed.
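The two-tier lookup described above can be sketched as follows. The class name `TieredFactLookup` and both predicates are hypothetical: `mightContain` stands in for a Bloom filter's `contains()` and `authoritative` stands in for a query against the database-backed fact table. The call counter is only there to show how often the expensive path is actually taken.

```java
import java.util.function.Predicate;

/** Illustrative two-tier membership check: cheap filter first, authoritative store second. */
final class TieredFactLookup {
    private final Predicate<String> mightContain;  // e.g. a Bloom filter's contains()
    private final Predicate<String> authoritative; // e.g. a lookup in the database fact table
    private long authoritativeCalls = 0;

    TieredFactLookup(Predicate<String> mightContain, Predicate<String> authoritative) {
        this.mightContain = mightContain;
        this.authoritative = authoritative;
    }

    /** Returns true only if the authoritative store confirms the key. */
    boolean exists(String key) {
        if (!mightContain.test(key)) {
            return false; // the filter has no false negatives, so this "no" is final
        }
        authoritativeCalls++; // "maybe": confirm against the expensive store
        return authoritative.test(key);
    }

    /** How many times the expensive store was consulted. */
    long authoritativeCalls() {
        return authoritativeCalls;
    }
}
```

With a workload where most answers are "no", nearly all calls return at the first `if`, and the database is consulted only for real hits plus the small share of false positives.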