Question
This question has been asked before:
Postgresql full text search in postgresql - japanese, chinese, arabic
but there are no answers for Chinese as far as I can see. I took a look at the OpenOffice wiki, and it doesn't have a dictionary for Chinese.
Edit: As we are already successfully using PG's internal FTS engine for English documents, we don't want to move to an external indexing engine. Basically, what I'm looking for is a Chinese FTS configuration, including parser and dictionaries for Simplified Chinese (Mandarin).
Answer 1:
I know it's an old question, but there's a Postgres extension for Chinese: https://github.com/amutu/zhparser/
Answer 2:
I've just implemented a Chinese FTS solution in PostgreSQL. I did it by creating n-gram tokens from the Chinese input and building the necessary tsvectors with an embedded function (in my case I used plpythonu). It works very well (massively preferable to moving to SQL Server!!!).
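The answer doesn't include its code, so the following is only a minimal sketch of the n-gram idea, not the author's actual function. It assumes bigrams (n = 2), assumes only the basic CJK Unified Ideographs range (U+4E00 to U+9FFF) counts as Chinese, and uses hypothetical helper names (is_cjk, ngram_tokens, ngram_document). It is written for Python 3; a plpythonu function would run Python 2 and need adapting.

```python
from typing import List


def is_cjk(ch: str) -> bool:
    """Return True for characters in the basic CJK Unified Ideographs block.

    Assumption: this single range is enough for the text being indexed.
    """
    return "\u4e00" <= ch <= "\u9fff"


def ngram_tokens(text: str, n: int = 2) -> List[str]:
    """Split each run of CJK characters into overlapping n-grams.

    Non-CJK characters only act as separators here; they are not tokenized.
    A run shorter than n is kept as a single token.
    """
    tokens: List[str] = []
    run: List[str] = []
    for ch in text + " ":  # the trailing space flushes the final run
        if is_cjk(ch):
            run.append(ch)
            continue
        if run:
            if len(run) < n:
                tokens.append("".join(run))
            else:
                tokens.extend(
                    "".join(run[i:i + n]) for i in range(len(run) - n + 1)
                )
            run = []
    return tokens


def ngram_document(text: str, n: int = 2) -> str:
    """Join the n-grams with spaces so they form whitespace-delimited words."""
    return " ".join(ngram_tokens(text, n))


if __name__ == "__main__":
    print(ngram_document("全文搜索"))  # prints: 全文 文搜 搜索
```

One way to use the result, again as an assumption rather than what the answer describes, is to pass the space-separated string to to_tsvector with the 'simple' configuration, and to run search terms through the same function before building the tsquery so that both sides are split identically.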
Answer 3:
Index your data with Solr, an open source enterprise search server built on top of Lucene.
You can find more info on Solr here:
http://lucene.apache.org/solr/
A good how-to book (with immediate PDF download) is here:
https://www.packtpub.com/solr-1-4-enterprise-search-server/book
And be sure to use a Chinese tokenizer, such as solr.ChineseTokenizerFactory, because Chinese is not whitespace-delimited.
Source: https://stackoverflow.com/questions/3994504/how-do-i-implement-full-text-search-in-chinese-on-postgresql