Compound FULLTEXT index in MySQL

失恋的感觉 2021-01-20 15:02

I would like to build a system that allows searching messages by a specific user. Assume the following table:

create table messages(
  user_id int,
  mes         


        
2 answers
  • 2021-01-20 15:21

    @Alden Quimby's answer is correct as far as it goes, but there is more to the story, because MySQL will only try to choose the optimal index, and its ability to make that determination is limited because of the way fulltext indexes interact with the optimizer.

    What actually happens is this:

    If the specified user_id exists in either 0 or 1 matching rows in the table, the optimizer will realize this and will choose user_id as the index for that query. Fast execution.

    Otherwise, the optimizer will choose the fulltext index, filtering every row matched by the fulltext index to eliminate rows not containing a user_id that matches the WHERE clause. Not quite as fast.

    So it's not truly the "optimum" path. It's more like fulltext, with a nice optimization to avoid the fulltext search under the one condition that we know we have almost nothing of interest in the table.

    The reason this breaks down is that a fulltext index doesn't give any meaningful statistics back to the optimizer. It just says "yeah, I think that query should probably only require me to check 1 row" ... which, of course, pleases the optimizer greatly, so the fulltext index wins the bid for lowest cost, unless the index with the integer value also comes in comparably low or lower.
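
    One way to check which index the optimizer actually picked is to run the query under EXPLAIN. This is a minimal sketch, assuming the table is `messages` with the `user_id` and `message` columns used in the queries on this page, and that both the regular index and the fulltext index already exist. The search term and user id are placeholders.

    ```sql
    -- Placeholder values: substitute a real search term and user id.
    EXPLAIN
    SELECT *
    FROM messages
    WHERE MATCH(message) AGAINST('kittens')
      AND user_id = 500;
    ```

    In the output, the `key` column names the index that was chosen, and a `type` of `fulltext` indicates that the fulltext index is driving the query.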

    Still, that doesn't mean I wouldn't try it this way first.

    There's another option, which works best with fulltext queries IN BOOLEAN MODE: create another column populated with something like CONCAT('user_id_',user_id), and then declare a 2-column fulltext index.

    filter_string VARCHAR(48) # populated with CONCAT('user_id_',user_id);
    ....
    FULLTEXT KEY (message,filter_string)
    

    Then specify everything in the query.

    SELECT ...
     WHERE user_id = 500 AND
     MATCH (message,filter_string) AGAINST ('+kittens +puppies +user_id_500' IN BOOLEAN MODE);
    

    Now the fulltext index is responsible for matching only those rows where kittens, puppies, and "user_id_500" all appear in the combined fulltext index of the two columns. You'd still want the integer filter there too, so the final results stay constrained even if "user_id_500" happens to appear in some other user's message.
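
    The setup for this approach might look like the following sketch. It assumes an existing `messages` table with `user_id` and `message` columns; the index name `ft_message_filter` is illustrative, and keeping `filter_string` in sync for new rows (in the application or with a trigger) is left out.

    ```sql
    -- Hypothetical setup: add the helper column described above.
    ALTER TABLE messages ADD COLUMN filter_string VARCHAR(48);

    -- Backfill existing rows; new rows need the same value set on insert.
    UPDATE messages SET filter_string = CONCAT('user_id_', user_id);

    -- ft_message_filter is an illustrative index name.
    ALTER TABLE messages ADD FULLTEXT KEY ft_message_filter (message, filter_string);
    ```

    The column list in MATCH() should be the same as the column list of the fulltext index. With the built-in parser, underscores count as word characters, so a value such as user_id_500 is indexed as a single token.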

  • 2021-01-20 15:25

    You should add a fulltext index on message and a regular index on user_id, and use the query:

    SELECT *
    FROM messages
    WHERE MATCH(message) AGAINST(@search_query)
    AND user_id = @user_id;
    

    You're right that you can't do option 3. But rather than trying to pick between 1 and 2, let MySQL do the work for you. MySQL will only use one of the two indexes, and will do a linear scan to complete the second filter, but it will estimate the effectiveness of each index and choose the optimal one.

    Note: only do this if you can afford the overhead of two indexes (slower insert/update/delete). Also, if you know that each user will only have a few messages, then yes it might make sense to use a simple index and do a regex in the application layer or something like that.
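
    As a rough sketch, the two indexes and the placeholders used in the query above could be set up like this. The index names `ft_message` and `idx_user_id` and the sample values are assumptions for illustration only.

    ```sql
    -- Index names are illustrative.
    ALTER TABLE messages ADD FULLTEXT KEY ft_message (message);
    ALTER TABLE messages ADD KEY idx_user_id (user_id);

    -- User-defined session variables matching the placeholders in the query.
    SET @search_query = 'kittens';
    SET @user_id = 500;
    ```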
