MySQL fulltext search relevance across multiple tables

Posted by 半世苍凉 on 2019-11-28 17:06:39

Yes, you can unify them very well using a search engine such as Apache Lucene and Solr.

http://lucene.apache.org/solr/

If you need to do it only in MySQL, you can do this with a UNION. You'll probably want to suppress any zero-relevance results.

You'll need to decide how you want to affect the relevance depending on which table matches.

For example, suppose you want articles to be the most important, events to be of medium importance, and pages to be the least important. You can use multipliers like this:

set @articles_multiplier=3;
set @events_multiplier=2;
set @pages_multiplier=1;

Here's a working example you can try that demonstrates some of these techniques:

Create sample data:

create database d;
use d;

create table articles (id int primary key, content text) ENGINE = MYISAM;
create table events (id int primary key, content text) ENGINE = MYISAM;
create table pages (id int primary key, content text) ENGINE = MYISAM;

insert into articles values 
(1, "Lorem ipsum dolor sit amet"),
(2, "consectetur adipisicing elit"),
(3, "sed do eiusmod tempor incididunt");

insert into events values 
(1, "Ut enim ad minim veniam"),
(2, "quis nostrud exercitation ullamco"),
(3, "laboris nisi ut aliquip");

insert into pages values 
(1, "Duis aute irure dolor in reprehenderit"),
(2, "in voluptate velit esse cillum"),
(3, "dolore eu fugiat nulla pariatur.");

Make it searchable:

ALTER TABLE articles ADD FULLTEXT(content);
ALTER TABLE events ADD FULLTEXT(content);
ALTER TABLE pages ADD FULLTEXT(content);

Use a UNION to search all these tables:

set @target='dolor';

SELECT * from (
  SELECT 
    'articles' as 'table_name', id, 
    @articles_multiplier * (MATCH(content) AGAINST (@target)) as relevance
    from articles
  UNION
  SELECT 
    'events' as 'table_name', 
    id,
    @events_multiplier * (MATCH(content) AGAINST (@target)) as relevance
    from events
  UNION
  SELECT 
    'pages' as 'table_name', 
    id, 
    @pages_multiplier * (MATCH(content) AGAINST (@target)) as relevance
    from pages
)
as sitewide WHERE relevance > 0;

The result:

+------------+----+------------------+
| table_name | id | relevance        |
+------------+----+------------------+
| articles   |  1 | 1.98799377679825 |
| pages      |  3 | 0.65545331108093 |
+------------+----+------------------+
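Two small variations on the query above may be worth trying. This is only a sketch, not part of the tested example: UNION ALL skips the duplicate-elimination pass that a plain UNION performs (rows from different tables already differ in table_name), and an ORDER BY on the outer query puts the strongest matches first. It assumes @target and the three multiplier variables are still set in the same session.

```sql
-- Sketch: same search as above, ranked best-first.
-- Assumes @target, @articles_multiplier, @events_multiplier and
-- @pages_multiplier were set earlier in this session.
SELECT table_name, id, relevance FROM (
  SELECT 'articles' AS table_name, id,
    @articles_multiplier * (MATCH(content) AGAINST (@target)) AS relevance
    FROM articles
  UNION ALL
  SELECT 'events' AS table_name, id,
    @events_multiplier * (MATCH(content) AGAINST (@target)) AS relevance
    FROM events
  UNION ALL
  SELECT 'pages' AS table_name, id,
    @pages_multiplier * (MATCH(content) AGAINST (@target)) AS relevance
    FROM pages
) AS sitewide
WHERE relevance > 0          -- drop rows that did not match at all
ORDER BY relevance DESC;     -- weighted score, highest first
```

Without the ORDER BY, the order of the combined rows is not guaranteed, so any ranking you see is incidental.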

(Sorry, I wanted to leave this as a comment on the answer above, but I don't have enough reputation to comment.)

Be aware that a UNION inside a subquery is very poorly optimized. A frequent case is pagination: if you apply "LIMIT @page * 10, 10" in the parent query, MySQL must still fetch all the results from the subqueries before it can evaluate the parent query.
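One common way to soften this, sketched here under assumptions rather than measured, is to cap each branch of the UNION so that no table returns more rows than the requested page could possibly need. The page number (3) and page size (10) below are hypothetical, and the numbers are written as literals because LIMIT only accepts integer constants outside prepared statements and stored programs. The query reuses @target and the multiplier variables from the answer above.

```sql
-- Sketch: page 3 at 10 rows per page, i.e. rows 21-30 of the combined ranking.
-- No single table can supply more than 30 of the first 30 rows overall,
-- so each branch is capped at offset + page size = 20 + 10 = 30.
(SELECT 'articles' AS table_name, id,
        @articles_multiplier * (MATCH(content) AGAINST (@target)) AS relevance
   FROM articles
  WHERE MATCH(content) AGAINST (@target)   -- matching rows only
  ORDER BY relevance DESC, id
  LIMIT 30)
UNION ALL
(SELECT 'events' AS table_name, id,
        @events_multiplier * (MATCH(content) AGAINST (@target)) AS relevance
   FROM events
  WHERE MATCH(content) AGAINST (@target)
  ORDER BY relevance DESC, id
  LIMIT 30)
UNION ALL
(SELECT 'pages' AS table_name, id,
        @pages_multiplier * (MATCH(content) AGAINST (@target)) AS relevance
   FROM pages
  WHERE MATCH(content) AGAINST (@target)
  ORDER BY relevance DESC, id
  LIMIT 30)
ORDER BY relevance DESC, table_name, id    -- tie-breakers keep pages stable
LIMIT 20, 10;
```

The per-branch cap grows with the page number, so deep pages still get more expensive; the gain is that the combined result never holds more than three times offset + page size rows instead of every match in every table.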
