Fuzzy queries to database

时光总嘲笑我的痴心妄想, submitted on 2020-01-06 19:32:35

Question


I'm curious how a feature found on many social sites today works.

For example, you enter a list of movies you like, and the system suggests other movies you may like (based on the movies liked by other people who like the same movies as you). I think doing it the straight SQL way (join my list of movies with movies-users, join that with user-movies, group by movie title, and apply a count) would be practically impossible on large datasets because the query is so heavy.
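For concreteness, the straight SQL approach described above can be sketched with Python's built-in sqlite3 module. The `likes(user_id, movie)` table, the `recommend` function, and the sample rows are assumptions made up for illustration; they are not from any real system.

```python
import sqlite3

def recommend(conn, user_id, limit=5):
    """Rank movies the user has not liked by votes from users who share a liked movie."""
    query = """
        SELECT theirs.movie, COUNT(*) AS votes
        FROM likes AS mine
        JOIN likes AS peers
          ON peers.movie = mine.movie AND peers.user_id <> mine.user_id
        JOIN likes AS theirs
          ON theirs.user_id = peers.user_id
        WHERE mine.user_id = ?
          AND theirs.movie NOT IN (SELECT movie FROM likes WHERE user_id = ?)
        GROUP BY theirs.movie
        ORDER BY votes DESC, theirs.movie
        LIMIT ?
    """
    return conn.execute(query, (user_id, user_id, limit)).fetchall()

# Hypothetical sample data: three users and their liked movies.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE likes (user_id INTEGER, movie TEXT, PRIMARY KEY (user_id, movie))"
)
conn.executemany("INSERT INTO likes VALUES (?, ?)", [
    (1, "Alien"), (1, "Blade Runner"),
    (2, "Alien"), (2, "Blade Runner"), (2, "Solaris"),
    (3, "Alien"), (3, "Solaris"), (3, "Stalker"),
])

print(recommend(conn, 1))  # [('Solaris', 3), ('Stalker', 1)]
```

Each shared movie contributes one vote, so a peer who overlaps with you on two movies counts twice. The double self-join is the part that grows quickly with the number of likes, which is the heaviness the question worries about.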

At the same time, we don't need an exact solution; an approximate one would be enough. I wonder whether there is a way to implement something like a fuzzy query on a traditional RDBMS that is fast to execute but gives up some accuracy. Or, how are such features implemented in real systems?


Answer 1:


That's collaborative filtering, or recommendation.

Unless you need something really complex, the Slope One predictor is one of the simpler options; it takes about 50 lines of Python. See Bryan O'Sullivan's "Collaborative filtering made easy" and the paper by Daniel Lemire et al. that introduced it, "Slope One Predictors for Online Rating-Based Collaborative Filtering".
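To give a feel for the size of the algorithm, here is a minimal sketch of the weighted Slope One scheme. It is not the code from O'Sullivan's article; the function names `train` and `predict` and the three sample users are assumptions for illustration, and ratings are assumed to be held in plain dictionaries.

```python
from collections import defaultdict

def train(ratings):
    """Return average rating differences and co-rating counts per item pair."""
    diffs = defaultdict(lambda: defaultdict(float))
    freqs = defaultdict(lambda: defaultdict(int))
    for user_ratings in ratings.values():
        for i, ri in user_ratings.items():
            for j, rj in user_ratings.items():
                if i != j:
                    diffs[i][j] += ri - rj   # running sum of (rating_i - rating_j)
                    freqs[i][j] += 1         # number of users who rated both
    for i in diffs:
        for j in diffs[i]:
            diffs[i][j] /= freqs[i][j]       # turn sums into averages
    return diffs, freqs

def predict(user_ratings, diffs, freqs):
    """Predict ratings for items the user has not rated yet."""
    predictions = {}
    for item in diffs:
        if item in user_ratings:
            continue
        numerator, denominator = 0.0, 0
        for j, rj in user_ratings.items():
            n = freqs[item].get(j, 0)
            if n:
                # Shift the known rating by the average difference,
                # weighted by how many users support that difference.
                numerator += (rj + diffs[item][j]) * n
                denominator += n
        if denominator:
            predictions[item] = numerator / denominator
    return predictions

# Hypothetical ratings: user -> {item: rating}.
ratings = {
    "john": {"A": 5, "B": 3, "C": 2},
    "mark": {"A": 3, "B": 4},
    "lucy": {"B": 2, "C": 5},
}
diffs, freqs = train(ratings)
print(round(predict(ratings["lucy"], diffs, freqs)["A"], 2))  # 4.33
```

Training touches every pair of items each user has rated, so the cost depends on how many items a typical user rates, not on the number of users squared.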

It can also be updated incrementally, one user at a time, when that user's ratings change, whereas some other methods need to reprocess the whole database for any update.

I used that Python code to predict the word counts of words that do not occur in documents, but I ran into memory issues, so I think I might write an out-of-core version, maybe using SQLite.

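Also, the matrix used in that one is effectively triangular: the two sides of the diagonal mirror each other (the co-rating counts are identical and the average differences differ only in sign), so only one half of the matrix needs to be stored.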
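The per-user update and the half-matrix storage can be combined in one small sketch. It assumes item identifiers are sortable (strings here) and keeps running sums instead of averages so that one user's ratings can be added or withdrawn without touching anyone else's. The class name `HalfMatrixSlopeOne` and its methods are hypothetical, not part of any library.

```python
from collections import defaultdict
from itertools import combinations

class HalfMatrixSlopeOne:
    """Slope One statistics stored only for item pairs (i, j) with i < j."""

    def __init__(self):
        self.diff_sum = defaultdict(float)  # (i, j) -> sum of (rating_i - rating_j)
        self.count = defaultdict(int)       # (i, j) -> users who rated both items

    def _apply(self, user_ratings, sign):
        # combinations() over the sorted items yields each pair once, with i < j.
        for i, j in combinations(sorted(user_ratings), 2):
            self.diff_sum[i, j] += sign * (user_ratings[i] - user_ratings[j])
            self.count[i, j] += sign

    def add_user(self, user_ratings):
        self._apply(user_ratings, +1)

    def remove_user(self, user_ratings):
        self._apply(user_ratings, -1)

    def deviation(self, i, j):
        """Return (average of rating_i - rating_j, number of co-raters)."""
        key, sign = ((i, j), 1) if i < j else ((j, i), -1)
        n = self.count.get(key, 0)
        if n == 0:
            return None, 0
        return sign * self.diff_sum[key] / n, n

# Hypothetical ratings for three users.
model = HalfMatrixSlopeOne()
old_mark = {"A": 3, "B": 4}
for user_ratings in ({"A": 5, "B": 3, "C": 2}, old_mark, {"B": 2, "C": 5}):
    model.add_user(user_ratings)
print(model.deviation("A", "B"))  # (0.5, 2)
print(model.deviation("B", "A"))  # (-0.5, 2)

# One user changes a rating: withdraw the old contribution, add the new one.
model.remove_user(old_mark)
model.add_user({"A": 4, "B": 4})
print(model.deviation("A", "B"))  # (1.0, 2)
```

The same two tables (sum and count per item pair) map naturally onto a two-column-keyed table in SQLite if the data no longer fits in memory.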




Answer 2:


The term you are looking for is "collaborative filtering".

Read Programming Collective Intelligence by Toby Segaran (O'Reilly).




Answer 3:


The simplest methods use Bayesian networks. There are libraries that can take care of most of the math for you.



Source: https://stackoverflow.com/questions/4017950/fuzzy-queries-to-database
