MySQL sort by number of occurrences

遥遥无期 2021-02-09 15:28

I am doing a search in two text fields called Subject and Text for a specific keyword. To do this I use the LIKE statement. I have encount

3 Answers
  • 2021-02-09 15:45
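    // Lower-case the keyword so it matches the LOWER()ed columns below.
    // Note: nothing here escapes it; escape $keyword (or bind it as a parameter) before real use.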
    $keyword = strtolower('Keyword');
    // now build the query
    $query = <<<SQL
        SELECT *,
        ((LENGTH(`Subject`) - LENGTH(REPLACE(LOWER(`Subject`), '{$keyword}', ''))) / LENGTH('{$keyword}')) AS `CountInSubject`,
        ((LENGTH(`Text`) - LENGTH(REPLACE(LOWER(`Text`), '{$keyword}', ''))) / LENGTH('{$keyword}')) AS `CountInText`
        FROM `News`
        WHERE (`Text` LIKE '%{$keyword}%' OR `Subject` LIKE '%{$keyword}%')
        ORDER BY (`CountInSubject` + `CountInText`) DESC;
    SQL;
    

    This returns the number of occurrences in each field and sorts by their sum.
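
    To see why the length arithmetic counts occurrences, here is a tiny check on a literal string (the sample text is made up). Removing every 'foo' shortens the string by 6 bytes, and 6 divided by the keyword length of 3 gives 2:

    ```sql
    -- 'foo bar foo' is 11 bytes long; with 'foo' removed it becomes ' bar ' (5 bytes).
    -- (11 - 5) / LENGTH('foo') = 6 / 3, i.e. two occurrences.
    SELECT (LENGTH('foo bar foo') - LENGTH(REPLACE('foo bar foo', 'foo', '')))
           / LENGTH('foo') AS occurrences;
    ```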

    The keyword needs to be lower-cased for this to work, because REPLACE() always matches case-sensitively. (LIKE itself is case-insensitive under MySQL's default case-insensitive collations.) I don't think it's really fast, performance-wise, as it needs to lower-case both fields on every row.
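
    A quick way to see how LIKE and REPLACE() treat case differently. This is only a sketch: it assumes the connection uses one of MySQL's default case-insensitive collations, and the sample string is made up.

    ```sql
    SELECT
        -- 1 under a case-insensitive (_ci) collation
        'Some KEYWORD here' LIKE '%keyword%' AS like_matches,
        -- string comes back unchanged: REPLACE() matches case-sensitively
        REPLACE('Some KEYWORD here', 'keyword', '') AS replace_as_is,
        -- 'some  here': lower-casing first lets REPLACE() find the keyword
        REPLACE(LOWER('Some KEYWORD here'), 'keyword', '') AS replace_lowered;
    ```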

    Alternatively, you could index each news item (subject and text) by word, store the words in another table together with news_id and an occurrence count, and then match against that.
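
    A minimal sketch of that word-index idea. The table `NewsWords`, its columns, and the assumption that `News` has an `id` primary key are all hypothetical. Filling the table (splitting each item into words and counting them) would happen in application code and is not shown.

    ```sql
    -- Hypothetical lookup table: one row per (word, news item) pair.
    CREATE TABLE `NewsWords` (
        `news_id`     INT UNSIGNED NOT NULL,  -- assumed to reference News.id
        `word`        VARCHAR(64)  NOT NULL,  -- stored lower-cased
        `occurrences` INT UNSIGNED NOT NULL,  -- count across Subject and Text
        PRIMARY KEY (`word`, `news_id`)
    );

    -- Rank news items by the stored count for one keyword.
    SELECT n.*, w.`occurrences`
    FROM `NewsWords` AS w
    JOIN `News` AS n ON n.`id` = w.`news_id`
    WHERE w.`word` = 'keyword'
    ORDER BY w.`occurrences` DESC;
    ```

    Note that this matches whole words only, unlike LIKE '%keyword%', which also matches substrings.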

  • 2021-02-09 15:53

    The query below gives you the number of occurrences of the string in both columns (Text and Subject) and sorts the results by that count. It will not perform well, though; it may be better to sort the results in your application code.

    SELECT *,
        (LENGTH(`Text`) - LENGTH(REPLACE(`Text`, 'Keyword', ''))) / LENGTH('Keyword')
        +
        (LENGTH(`Subject`) - LENGTH(REPLACE(`Subject`, 'Keyword', ''))) / LENGTH('Keyword') AS `occurrences`
    FROM `Table`
    WHERE (`Text` LIKE '%Keyword%' OR `Subject` LIKE '%Keyword%')
    ORDER BY `occurrences` DESC
    


    @lserni suggested a cleaner way to calculate the occurrences:

    SELECT *,
        (LENGTH(`Text`) - LENGTH(REPLACE(`Text`, 'test', ''))) / LENGTH('test') AS `appears_in_text`,
        (LENGTH(`Subject`) - LENGTH(REPLACE(`Subject`, 'test', ''))) / LENGTH('test') AS `appears_in_subject`,
        (LENGTH(CONCAT(`Text`, ' ', `Subject`)) - LENGTH(REPLACE(CONCAT(`Text`, ' ', `Subject`), 'test', ''))) / LENGTH('test') AS `occurrences`
    FROM `Table1`
    WHERE (`Text` LIKE '%test%' OR `Subject` LIKE '%test%')
    ORDER BY `occurrences` DESC
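
    One caveat with the CONCAT() version: CONCAT() returns NULL if any argument is NULL, so a row with a NULL Subject would get a NULL count even when Text matches. A possible variant, assuming the columns are nullable, wraps each column in COALESCE():

    ```sql
    SELECT *,
        (LENGTH(CONCAT(COALESCE(`Text`, ''), ' ', COALESCE(`Subject`, '')))
         - LENGTH(REPLACE(CONCAT(COALESCE(`Text`, ''), ' ', COALESCE(`Subject`, '')), 'test', '')))
        / LENGTH('test') AS `occurrences`
    FROM `Table1`
    WHERE (`Text` LIKE '%test%' OR `Subject` LIKE '%test%')
    ORDER BY `occurrences` DESC;
    ```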
    

  • 2021-02-09 16:09

    You want SUM() instead. COUNT() counts how many records have non-NULL values, which means matches and non-matches alike are counted.

    SELECT *, SUM(Text LIKE '%Keyword%') AS total_matches
    ...
    ORDER BY total_matches
    

    SUM() adds up the boolean results that LIKE produces, which are typecast to integers (1 for true, 0 for false), so you get a result like 1+1+1+0+1 = 4, instead of the count of 5 non-NULL values.
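
    The difference can be checked on a few made-up rows. The derived table below is purely illustrative and stands in for the real table; four of its five rows contain the keyword.

    ```sql
    SELECT
        COUNT(t.`Text` LIKE '%Keyword%') AS counted,  -- 5: every non-NULL result, true or false
        SUM(t.`Text` LIKE '%Keyword%')   AS summed    -- 4: only the true (1) results add up
    FROM (
        SELECT 'Keyword a' AS `Text`
        UNION ALL SELECT 'b Keyword'
        UNION ALL SELECT 'Keyword c'
        UNION ALL SELECT 'nothing here'
        UNION ALL SELECT 'd Keyword d'
    ) AS t;
    ```

    Keep in mind that this counts matching rows, not how many times the keyword appears inside each row.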
