Efficient server-side autocomplete

Backend · Unresolved · 3 answers · 434 views
醉梦人生 2021-02-01 10:10

First of all, I know:

Premature optimization is the root of all evil

But I think a badly implemented autocomplete can really blow up your site.

3 Answers
  • 2021-02-01 10:43

    Using SQL versus Solr's terms component is really not a comparison. At their core they solve the problem the same way by making an index and then making simple calls to it.

    What I would want to know is: what are you trying to auto-complete?

    Ultimately, the easiest and most surefire way to scale a system is to make a simple solution and then just scale the system by replicating data. Trying to cache calls or predict results just makes things complicated, and doesn't get to the root of the problem (i.e., you can only take them so far; for example, if every request misses the cache).

    Perhaps a little more info about how your data is structured and how you want to see it extracted would be helpful.

  • 2021-02-01 10:50

    Optimising for Auto-complete

    Unfortunately, the resolution of this issue will depend heavily on the data you are hoping to query.

    LIKE queries will not put too much strain on your database, as long as you spend time using 'EXPLAIN' or the profiler to show you how the query optimiser plans to perform your query.
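
    As a hedged illustration of checking the plan, here is a minimal sketch using SQLite's EXPLAIN QUERY PLAN; the users(name) table is an assumption, and on SQL Server you would read the execution plan or use Profiler instead:

        import sqlite3

        conn = sqlite3.connect(":memory:")
        # COLLATE NOCASE lets SQLite's case-insensitive LIKE use the index.
        conn.execute("CREATE TABLE users (name TEXT COLLATE NOCASE)")
        conn.execute("CREATE INDEX idx_users_name ON users (name)")

        # Ask the optimiser how it intends to run the autocomplete query.
        plan = conn.execute(
            "EXPLAIN QUERY PLAN SELECT name FROM users WHERE name LIKE 'al%'"
        ).fetchall()
        print(plan)  # expect a SEARCH using the index, not a SCAN of the table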

    Some basics to keep in mind:

    • Indexes: Ensure that you have indexes set up. (Yes, in many cases LIKE does use an index; there is an excellent article on the topic at myitforum: SQL Performance - Indexes and the LIKE clause.) A short sketch follows this list.

    • Joins: Ensure your JOINs are in place and are optimised by the query planner. SQL Server Profiler can help with this. Look out for full index or full table scans.
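
    To make the index point concrete (same assumed users table and SQLite syntax as above; details differ on SQL Server): a pattern anchored at the start of the column can use the index, while a leading wildcard cannot.

        import sqlite3

        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE users (name TEXT COLLATE NOCASE)")
        conn.execute("CREATE INDEX idx_users_name ON users (name)")

        # Literal patterns so the planner can see them when preparing the query.
        for pattern in ("al%", "%al%"):
            sql = ("EXPLAIN QUERY PLAN "
                   f"SELECT name FROM users WHERE name LIKE '{pattern}'")
            print(pattern, conn.execute(sql).fetchall())
        # Expect: 'al%'  -> SEARCH on idx_users_name (index range scan)
        #         '%al%' -> SCAN (every row has to be examined)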

    Auto-complete sub-sets

    Auto-complete queries are a special case, in that they usually work as ever-decreasing subsets.

    • 'name' LIKE 'a%' (may return 10000 records)
    • 'name' LIKE 'al%' (may return 500 records)
    • 'name' LIKE 'ala%' (may return 75 records)
    • 'name' LIKE 'alan%' (may return 20 records)

    If you return the entire result set for query 1, then there is no need to hit the database again for the following result sets, as they are a subset of your original query.
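
    A minimal sketch of that idea, with a made-up fetch_from_db helper and sample data standing in for the real query (it assumes the first query really did return the entire result set):

        NAMES = ["alan", "alana", "albert", "alice", "bob"]   # stand-in for the users table

        def fetch_from_db(prefix):
            # Stand-in for: SELECT name FROM users WHERE name LIKE prefix || '%'
            return [n for n in NAMES if n.startswith(prefix)]

        class AutocompleteCache:
            """Fetch once for the first prefix, then refine later keystrokes locally."""
            def __init__(self):
                self.prefix = None   # prefix whose full result set we hold
                self.rows = []

            def suggest(self, term):
                term = term.lower()
                if self.prefix is not None and term.startswith(self.prefix):
                    # 'ala' extends 'a', so its matches are a subset of the cached rows.
                    return [r for r in self.rows if r.startswith(term)]
                # New prefix (or first call): hit the database once and remember it.
                self.prefix, self.rows = term, fetch_from_db(term)
                return self.rows

        cache = AutocompleteCache()
        print(cache.suggest("a"))    # queries the "database"
        print(cache.suggest("al"))   # refined from the cached rows, no second query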

    Depending on your data, this may open a further opportunity for optimisation.

  • 2021-02-01 10:53

    I won't be able to give you the numbers you are asking for; obviously the scale you can reach will depend on hardware, the size of the DB, the architecture of the app, and several other factors. You must test it yourself.

    But I will tell you the method I've used with success:

    • Use a simple SQL query, for example: SELECT name FROM users WHERE name LIKE 'al%', but use TOP 100 to limit the number of results.
    • Cache the results and maintain a list of the terms that are cached.
    • When a new request comes in, first check the list to see whether you already have the term (or a prefix of the term) cached.
    • Keep in mind that your cached results are limited, so sometimes you will still need to run a SQL query: if the last entry of the cached result list still matches the new term, the list may have been cut off before all matches for that term were returned, so go back to the database. (A minimal sketch of this flow follows the list.)
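
    Here is a minimal, runnable sketch of that flow, under a few assumptions the answer does not spell out: SQLite's LIMIT stands in for SQL Server's TOP, the query is ORDER BY name so the "last cached result still matches" check is meaningful, and the sample data is made up.

        import sqlite3

        LIMIT = 100
        cache = {}   # term -> rows returned for that term

        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE users (name TEXT)")
        conn.executemany("INSERT INTO users VALUES (?)",
                         [("alan",), ("alana",), ("albert",), ("alice",), ("bob",)])

        def suggest(term):
            # 1. Is some already-cached term a prefix of the new term?
            for cached_term, rows in cache.items():
                if term.startswith(cached_term):
                    # Safe to refine locally unless the cached list was cut off by the
                    # limit while its last (ordered) entry still matches the new term.
                    truncated = len(rows) == LIMIT and rows[-1].startswith(term)
                    if not truncated:
                        return [r for r in rows if r.startswith(term)]
            # 2. Otherwise run the limited query and cache it under this term.
            rows = [r[0] for r in conn.execute(
                "SELECT name FROM users WHERE name LIKE ? || '%' "
                "ORDER BY name LIMIT ?", (term, LIMIT))]
            cache[term] = rows
            return rows

        print(suggest("al"))    # hits the database
        print(suggest("ala"))   # served and filtered from the cache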

    Hope it helps.
