问题
I am testing performance for PostgreSQL full text search (using pg_search gem) and solr (sunspot_solr gem).
For 4 million records I am getting 13456 ms for Tsearch and 800 ms with SOLR (that is SOLR query + DB retrival). It is obvious that I need index but I am not sure how to create one for full text search. I investigated and found that for full text search I should use GIN index.
execute "CREATE INDEX products_gin_title ON products USING GIN(to_tsvector('english', title))"
But I am searching via two more columns and I need multi-value index and I am not sure how to implement it? I am not very familiar with DB part. My search code looks like:
@results = Product.search_title(params[:search_term]).where("platform_id=? AND product_type=?", params[:platform_id], params[:type_id]).limit(10).all
How do I create proper query for this type of situations?
Here is SQL output from rails for search term car.
Product Load (12494.0ms)
SELECT
"products".*,
( ts_rank((to_tsvector('simple', coalesce("products"."title"::text, ''))), (to_ tsquery('simple', ''' ' || 'car' || ' ''')), 2) ) AS pg_search_rank
FROM "products"
WHERE (((to_tsvector('simple', coalesce("products"."tit le"::text, ''))) @@ (to_tsquery('simple', ''' ' || 'car' || ' '''))))
AND (platform_id='26' AND product_type='2')
ORDER BY pg_search_rank DESC, "products"."id" ASC
LIMIT 10
EDIT:
I am using PostgreSQL 8.4.11, EXPLAIN ANALYZE
output is following.
Limit (cost=108126.34..108126.36 rows=10 width=3824) (actual time=12228.736..12228.738 rows=10 loops=1)
-> Sort (cost=108126.34..108163.84 rows=14999 width=3824) (actual time=12228.733..12228.734 rows=10 loops=1)
Sort Key: (ts_rank(to_tsvector('simple'::regconfig, COALESCE((title)::text, ''::text)), '''car'''::tsquery, 2)), id
Sort Method: top-N heapsort Memory: 18kB
-> Seq Scan on products (cost=0.00..107802.22 rows=14999 width=3824) (actual time=7.532..12224.585 rows=977 loops=1)
Filter: ((platform_id = 26) AND (product_type = 2) AND (to_tsvector('simple'::regconfig, COALESCE((title)::text, ''::text)) @@ '''car'''::tsquery))
Total runtime: 12228.813 ms
回答1:
This expression:
to_tsvector('simple', (COALESCE(title::TEXT), ''))
is not sargable against your index.
You should declare the index on the exactly that expression which is used in the query:
CREATE INDEX products_gin_title
ON products
USING GIN(to_tsvector('simple', COALESCE(title::TEXT,'')))
(or make ruby generate the expression which is used in the index).
If you want multiple columns to be indexed, just concatenate them:
CREATE INDEX products_gin_title
ON products
USING GIN(to_tsvector('simple', title || ' ' || product_type || ' ' || platform_id))
but again, Ruby should be filtering on exactly same expression for the index to be of use.
来源:https://stackoverflow.com/questions/10880327/proper-full-text-index-rails-postgresql-pg-search