I have a large number of sets of numbers. Each set contains 10 numbers, and I need to remove every set that shares 5 or more numbers (in any order) with any other set.
You don't say much about the range of numbers that might appear, but I have two ideas:
An inverted list that maps each number that appears in the lists to the lists that contain it. You then intersect those lists to find the ones that have enough numbers in common (five or more, in your case).
Divide the numbers, or group them, into ranges of "close" numbers, then narrow down to the lists that have numbers in the same ranges. You keep reducing the ranges until you have a manageable number of matching lists, which you can then compare exactly. This would be a "proximity" approach, I think.
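
The first idea, the inverted list, can be sketched roughly as follows. This is only a minimal illustration, not a tuned implementation: the function name `find_overlapping`, the `threshold` parameter, and the sample data are made up for the example, and it assumes that both members of a matching pair should be removed. Instead of intersecting the lists directly, it counts how many index entries each pair of sets appears in together, which gives the number of values they share.

```python
from collections import defaultdict
from itertools import combinations


def find_overlapping(sets, threshold=5):
    """Return indices of sets that share >= threshold numbers with another set."""
    # Inverted list: number -> indices of the sets that contain it.
    index = defaultdict(list)
    for i, s in enumerate(sets):
        for n in set(s):  # set() guards against repeated numbers in one input
            index[n].append(i)

    # Every pair of sets listed under the same number shares that number,
    # so counting co-occurrences per pair gives the size of their overlap.
    shared = defaultdict(int)
    for postings in index.values():
        for a, b in combinations(postings, 2):
            shared[(a, b)] += 1

    # Assumption: both sets of a pair that reaches the threshold are removed.
    flagged = set()
    for (a, b), count in shared.items():
        if count >= threshold:
            flagged.update((a, b))
    return flagged


# Hypothetical sample data: sets 0 and 1 share five numbers (1..5),
# set 3 shares only four numbers with each of them.
sets = [
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    [1, 2, 3, 4, 5, 11, 12, 13, 14, 15],
    [21, 22, 23, 24, 25, 26, 27, 28, 29, 30],
    [1, 2, 3, 4, 31, 32, 33, 34, 35, 36],
]
flagged = find_overlapping(sets)
kept = [s for i, s in enumerate(sets) if i not in flagged]
```

How well this works depends on the range of numbers: if a single number appears in a very large share of the sets, the pair counting for that number becomes quadratic, and the second idea (or some other pre-filtering) may be needed to cut the candidates down first.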