Speed up text comparisons (with sparse matrices)

Posted by ぃ、小莉子 on 2019-12-23 05:41:07

Question


I have a function that takes two strings and returns their cosine similarity, a value that indicates how closely the two texts are related.

If I want to compare 75 texts with each other, I need to make 5,625 single comparisons to have all texts compared with each other.

Is there a way to reduce this number of comparisons, for example with sparse matrices or k-means?

I don't want to talk about my function or about ways to compare texts. Just about reducing the number of comparisons.


Answer 1:


What Ben says is true: to get better help, you need to tell us what the goal is.

For example, one possible optimization if you want to find similar strings is storing the string vectors in a spatial data structure such as a quadtree, where you can outright discard the vectors that are too far away from each other, avoiding many comparisons.
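The idea of discarding pairs outright can be sketched as follows. This is not a quadtree: bag-of-words vectors have far too many dimensions for one, so the sketch swaps in an inverted index (word to texts) instead. It assumes the similarity is computed on non-negative word weights, in which case two texts with no word in common have cosine similarity 0 and never need to be compared. The function name `candidate_pairs` and the sample texts are made up for illustration.

```python
from collections import defaultdict
from itertools import combinations


def candidate_pairs(texts):
    """Return index pairs (i, j), i < j, of texts sharing at least one word."""
    # Inverted index: word -> set of indices of the texts that contain it.
    index = defaultdict(set)
    for i, text in enumerate(texts):
        for word in set(text.lower().split()):
            index[word].add(i)

    pairs = set()
    for ids in index.values():
        # Any two texts listed under the same word form a candidate pair.
        pairs.update(combinations(sorted(ids), 2))
    return pairs


# Hypothetical sample data: two unrelated topics.
texts = [
    "the cat sat on the mat",
    "a cat chased a mouse",
    "stock prices fell sharply",
    "prices of stock rose",
]
pairs = candidate_pairs(texts)
print(sorted(pairs))  # only these pairs need the real similarity function
```

Here only 2 of the 6 possible pairs survive. The saving depends heavily on the data: very common words put almost every text under the same index entry, so stop words would likely need to be removed first, and in the worst case every pair is still a candidate.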




Answer 2:


If your algorithm is pair-wise, then you probably can't reduce the number of comparisons, by definition.

You'll need to use a different algorithm, or at the very least pre-process your input if you want to reduce the number of comparisons.

Without the details of your function, it's difficult to give any concrete help.
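As a rough illustration of what pre-processing could look like, the sketch below assumes the function is a plain bag-of-words cosine similarity (the question does not say). Each text is turned into a unit-length word-count vector once, so every comparison becomes a sparse dot product. Because cosine similarity is symmetric and a text compared with itself scores 1, only the unordered pairs are visited: for 75 texts that is 75 × 74 / 2 = 2,775 comparisons rather than 5,625. The names `unit_vector` and `all_similarities` are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def unit_vector(text):
    """Turn a text into a sparse word-count vector scaled to length 1."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    if norm == 0:  # empty text: no words, so an empty vector
        return {}
    return {word: c / norm for word, c in counts.items()}


def all_similarities(texts):
    """Cosine similarity for every unordered pair (i, j) with i < j."""
    # Pre-processing: vectorize and normalize each text exactly once.
    vectors = [unit_vector(t) for t in texts]
    sims = {}
    for i, j in combinations(range(len(vectors)), 2):
        a, b = vectors[i], vectors[j]
        if len(a) > len(b):
            a, b = b, a  # iterate over the shorter vector
        # Unit vectors: the dot product is the cosine similarity.
        sims[(i, j)] = sum(w * b[word] for word, w in a.items() if word in b)
    return sims


sims = all_similarities(["red green", "green red", "blue"])
print(sims)
```

This does not change the quadratic growth; it halves the work and makes each comparison cheaper. Going below that needs a different approach, such as skipping pairs that cannot be similar.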



Source: https://stackoverflow.com/questions/1456343/speed-up-text-comparisons-with-sparse-matrices
