SQL Server Full-Text Search for exact match with fallback

滥情空心 2021-01-11 12:29

First off, there seems to be no way to get an exact match using a full-text search. This seems to be a highly discussed issue when using the full-text search method and there

4 Answers
  • 2021-01-11 12:36

    Use the full-text CONTAINSTABLE function to fetch the top 100 (possibly 200) candidate results, then order those candidates by your own criteria.

    It sounds like you'd like to ORDER BY

    1. exact match of the phrase (=)
    2. the fully matched phrase (LIKE)
    3. higher value for the Popularity column
    4. the Rank from the CONTAINSTABLE

    But you can toy around with the exact order you prefer.

    In SQL that looks something like:

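    -- CONTAINSTABLE needs the phrase wrapped in double quotes
    DECLARE @title varchar(255)
    SET @title = '"Toy Story"'
    -- remove the quotes from the parameter for the = and LIKE comparisons
    DECLARE @title2 varchar(255)
    SET @title2 = REPLACE(@title, '"', '')

    SELECT
        m.ID,
        m.title,
        m.Popularity,
        k.[Rank]
    FROM Movies m
    -- 100 = top_n_by_rank: keep only the 100 highest-ranked candidates
    INNER JOIN CONTAINSTABLE(Movies, title, @title, 100) AS [k]
        ON m.ID = k.[Key]
    ORDER BY
      CASE WHEN m.title = @title2 THEN 0 ELSE 1 END,                -- 1. exact title
      CASE WHEN m.title LIKE '%' + @title2 + '%' THEN 0 ELSE 1 END, -- 2. phrase anywhere in the title
      m.Popularity DESC,                                            -- 3. most popular first
      k.[Rank] DESC                                                 -- 4. best full-text rank first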
    

    See SQLFiddle

  • 2021-01-11 12:42

    This will give you the movies that contain the exact phrase "Toy Story", ordered by popularity (most popular first).

    SELECT
        m.[ID],
        m.[Popularity],
        k.[Rank]
    FROM [dbo].[Movies] m
    INNER JOIN CONTAINSTABLE([dbo].[Movies], [Title], N'"Toy Story"') as [k]
        ON m.[ID] = k.[Key]
    ORDER BY m.[Popularity] DESC
    

    Note that the query above would also return "The Goonies Return" if you searched for "The Goonies".
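
    If only the exact title should come back, one option is to keep the full-text phrase search as the index-backed filter and add an equality check on the column. This is a minimal sketch, assuming the same [dbo].[Movies] table, a case-insensitive collation on [Title], and titles that contain no double quotes; the variable names are made up for illustration.

    ```sql
    DECLARE @title nvarchar(255) = N'The Goonies';
    -- Wrap the title in double quotes to build the full-text phrase term
    DECLARE @phrase nvarchar(257) = N'"' + @title + N'"';

    SELECT
        m.[ID],
        m.[Title],
        m.[Popularity]
    FROM [dbo].[Movies] m
    INNER JOIN CONTAINSTABLE([dbo].[Movies], [Title], @phrase) AS [k]
        ON m.[ID] = k.[Key]
    WHERE m.[Title] = @title   -- drops longer titles such as "The Goonies Return"
    ORDER BY m.[Popularity] DESC;
    ```

    Dropping the WHERE clause gives the looser phrase match again, so the same query can serve as the fallback when the exact lookup returns no rows.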

  • 2021-01-11 12:42

    In Oracle I've used UTL_MATCH for similar purposes. (http://docs.oracle.com/cd/E11882_01/appdev.112/e25788/u_match.htm)

    Comparing the title column of table 1 against table 2 with the Jaro-Winkler algorithm can take a while, but you can improve performance by joining the two tables only partially. In some cases I have compared person names in table 1 with table 2 using Jaro-Winkler and limited the results not just to pairs above a certain Jaro-Winkler threshold, but also to names whose first letter is the same in both tables. For instance, I would compare Albert with Aden, Alfonzo, and Alberto, but not Albert with Frank, which limits the number of cases where the algorithm has to run.
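
    In Oracle, the first-letter restriction could look roughly like the sketch below. The table and column names (table1, table2, name) and the threshold of 90 are placeholders, not taken from a real schema; UTL_MATCH.JARO_WINKLER_SIMILARITY returns an integer score from 0 to 100.

    ```sql
    SELECT
        a.name AS name1,
        b.name AS name2,
        UTL_MATCH.JARO_WINKLER_SIMILARITY(a.name, b.name) AS score
    FROM table1 a
    INNER JOIN table2 b
        -- cheap pre-filter: only pair names that share the same first letter
        -- (case-sensitive as written; wrap both sides in UPPER() if needed)
        ON SUBSTR(a.name, 1, 1) = SUBSTR(b.name, 1, 1)
    -- the expensive comparison is meant to run only on the reduced set of pairs
    WHERE UTL_MATCH.JARO_WINKLER_SIMILARITY(a.name, b.name) >= 90
    ORDER BY score DESC;
    ```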

    Jaro-Winkler may well be suitable for movie titles too. Since you are on SQL Server you can't use the UTL_MATCH package, but there appears to be a free library called "SimMetrics" that includes the Jaro-Winkler algorithm among other string comparison metrics. Details and instructions are here: http://anastasiosyal.com/POST/2009/01/11/18.ASPX?#simmetrics

  • 2021-01-11 12:51

    I've got the feeling you don't really like the fuzzy part of full-text search, but you do like the performance part.

    Maybe this is a path: if you insist on getting the EXACT match before a weighted match, you could hash the value. For example, take 'Toy Story', bring it to lowercase ('toy story'), hash it into something like 4de2gs5sa (with whatever hash you like), and search on the hash.
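
    One way this could be sketched in T-SQL is a persisted computed column holding the hash of the lowercased title, with an index on it. The column name TitleHash and the index name are hypothetical, Title is assumed to be nvarchar(255), and SHA2_256 is just one possible choice (available in HASHBYTES from SQL Server 2012).

    ```sql
    -- One-time schema setup (GO is the batch separator used by SSMS/sqlcmd)
    ALTER TABLE dbo.Movies
        ADD TitleHash AS CAST(HASHBYTES('SHA2_256', LOWER(Title)) AS binary(32)) PERSISTED;
    GO
    CREATE INDEX IX_Movies_TitleHash ON dbo.Movies (TitleHash);
    GO

    -- The parameter must have the same string type as Title: HASHBYTES
    -- gives different results for varchar and nvarchar input
    DECLARE @title nvarchar(255) = N'Toy Story';

    SELECT m.ID, m.Title, m.Popularity
    FROM dbo.Movies m
    WHERE m.TitleHash = CAST(HASHBYTES('SHA2_256', LOWER(@title)) AS binary(32))
      AND LOWER(m.Title) = LOWER(@title);  -- guards against hash collisions
    ```

    If this lookup returns no rows, the weighted full-text query can run as the fallback.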
