How to match rows with one or more words in query, but without any words not in query?

Question

I have a table in a MySQL database that has a list of comma separated tags in it.

I want users to be able to enter a list of comma separated tags and then use Sphinx or MySQL to select rows that have at least one of the tags in the query but not any tags the query doesn't have.

The query can have additional tags that are not in the rows, but the rows should not be matched if they have tags not in the query.

I want to use either Sphinx or MySQL to do the searching.

Here's an example:

creatures:
----------------------------
| name |  tags             |
----------------------------
| cat  | wily,hairy        |
| dog  | cute,hairy        |
| fly  | ugly              |
| bear | grumpy,hungry     |
----------------------------

Example searches:

wily,hairy         <-- should match cat
cute,hairy,happy   <-- should match dog
happy,cute         <-- no match (dog has hairy)
ugly,yuck,gross    <-- should match fly
hairy              <-- no match (dog has cute, cat has wily)
grumpy             <-- no match (bear has hungry)
grumpy,hungry      <-- should match bear
wily,grumpy,hungry <-- should match bear

Is it possible to do this with Sphinx or MySQL?

To reiterate, the query will be a list of comma separated tags and rows that have at least one of the entered tags but not any tags the query doesn't have should be selected.

Answer 1:

The Sphinx expression ranker should be able to do this.

sphinxQL> SELECT *, WEIGHT() AS w FROM index 
   WHERE MATCH('@tags "cute hairy happy"/1') AND w > 0 
   OPTION ranker=expr('IF(word_count>=tags_len,1,0)');

Basically, you want the number of matched tags never to be less than the number of tags on the row.
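
To make the counting rule concrete, here is a minimal Python sketch of the same check done outside Sphinx, run against the example rows from the question. The names `row_matches` and `search` are hypothetical, and the comparison is only roughly analogous to `word_count >= tags_len`; it is not how Sphinx evaluates the ranker internally.

```python
def row_matches(row_tags: str, query_tags: str) -> bool:
    """Return True if the row has at least one tag and all of its tags are in the query."""
    row = {t.strip() for t in row_tags.split(",") if t.strip()}
    query = {t.strip() for t in query_tags.split(",") if t.strip()}
    matched = len(row & query)  # roughly analogous to word_count
    total = len(row)            # roughly analogous to tags_len
    return total > 0 and matched >= total


# Example table from the question.
creatures = {
    "cat": "wily,hairy",
    "dog": "cute,hairy",
    "fly": "ugly",
    "bear": "grumpy,hungry",
}


def search(query_tags: str) -> list:
    """Return the names of all creatures whose tags are fully covered by the query."""
    return [name for name, tags in creatures.items() if row_matches(tags, query_tags)]


if __name__ == "__main__":
    for q in ["wily,hairy", "cute,hairy,happy", "happy,cute", "hairy"]:
        print(q, "->", search(q))
```

Extra tags in the query never hurt a row, because only the row's own tag count is compared against the number of matches.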

Note that this just gives every matching document a weight of 1; if you want more elaborate ranking (e.g. to also match other keywords), it gets more complicated.

You need index_field_lengths enabled on the index to get the tags_len attribute.

(The same concept is obviously possible in MySQL, probably using FIND_IN_SET to do the matching, and either a second column to store the number of tags or computing that number on the fly, say with the REPLACE function.)
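
As a rough illustration of the MySQL variant, the following Python sketch builds a parameterized query for a user-entered tag list. It is only a sketch under stated assumptions: the table and column names (`creatures`, `name`, `tags`) come from the question's example, the helper `build_tag_query` is hypothetical, the `%s` placeholders assume a MySQL driver that uses that parameter style, and the stored tags are assumed to contain no spaces, no empty values and no duplicates within a row.

```python
def build_tag_query(query_tags: str):
    """Build (sql, params) selecting rows whose tags are all present in the query."""
    # De-duplicate the query tags while keeping their order; a repeated tag
    # would otherwise be counted twice and could produce a false match.
    tags = list(dict.fromkeys(t.strip() for t in query_tags.split(",") if t.strip()))
    if not tags:
        raise ValueError("at least one tag is required")

    # Each comparison evaluates to 1 or 0 in MySQL, so the sum is the number
    # of query tags found in the row.
    matched = " + ".join(["(FIND_IN_SET(%s, tags) > 0)"] * len(tags))

    # Number of tags on the row = number of commas + 1.
    tag_count = "CHAR_LENGTH(tags) - CHAR_LENGTH(REPLACE(tags, ',', '')) + 1"

    sql = (
        "SELECT name, tags FROM creatures "
        f"WHERE ({matched}) = ({tag_count})"
    )
    return sql, tags


if __name__ == "__main__":
    sql, params = build_tag_query("cute,hairy,happy")
    print(sql)
    print(params)
```

Because every row is assumed to have at least one tag, requiring the match count to equal the row's tag count also guarantees that at least one query tag matched. A query like this cannot use an index on the tags column, so it scans the whole table.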

Edit to add, details about multiple fields...

sphinxQL> SELECT *, WEIGHT() AS w FROM index 
   WHERE MATCH('@tags "cute hairy happy"/1 @tags2 "one two three"/1') AND w = 2 
   OPTION ranker=expr('SUM(IF(word_count>=IF(user_weight=2,tags2_len,tags_len),1,0))'), 
    field_weights=(tags=1,tags2=2);

The SUM function is run for each field in turn, so you need to use the user_weight system to distinguish which field is currently being enumerated.

Source: https://stackoverflow.com/questions/28767577/how-to-match-rows-with-one-or-more-words-in-query-but-without-any-words-not-in

Tags

sphinx