问题
I have a function which takes two strings and gives out the cosine similarity value which shows the relationship between both texts.
If I want to compare 75 texts with each other, I need to make 5,625 single comparisons to have all texts compared with each other.
Is there a way to reduce this number of comparisons? For example sparse matrices or k-means?
I don't want to talk about my function or about ways to compare texts. Just about reducing the number of comparisons.
回答1:
What Ben says it's true, to get better help you need to tell us what's the goal.
For example, one possible optimization if you want to find similar strings is storing the string vectors in a spatial data structure such as a quadtree, where you can outright discard the vectors that are too far away from each other, avoiding many comparisons.
回答2:
If your algorithm is pair-wise, then you probably can't reduce the number of comparisons, by definition.
You'll need to use a different algorithm, or at the very least pre-process your input if you want to reduce the number of comparisons.
Without the details of your function, it's difficult to give any concrete help.
来源:https://stackoverflow.com/questions/1456343/speed-up-text-comparisons-with-sparse-matrices