Let\'s say that you have a list of 10,000 email addresses, and you\'d like to find what some of the closest \"neighbors\" in this list are - defined as email addresses that
10,000 email addresses sound not too much. For similarity search in a larger space you can use shingling and min-hashing. This algorithm is a bit more complicated to implement, but is much more efficient on a large space.
I don't think you can do better than O(n^2) but you can do some smaller optimizations which could be just enough of a speedup to make your application usable:
EDIT: Actually you can do better than O(n^2), just look at Nick Johnsons answer below.