How do I implement tag searching? With Lucene?

独厮守ぢ 2020-12-08 01:18

I haven't used Lucene. The last time I asked (many months ago, maybe a year), people suggested Lucene. If I shouldn't use Lucene, what should I use? As an example, say there are items

2 Answers
  • 2020-12-08 01:54

    Edit: You can use Lucene. Here's an explanation of how to do this in Lucene.net. Some Lucene basics are:

    • Document - is the storage unit in Lucene. It is somewhat analogous to a database record.
    • Field - the search unit in Lucene. Analogous to a database column. Lucene searches for text by taking a query and matching it against fields. A field should be indexed in order to enable search.
    • Token - the search atom in Lucene. Usually a word, sometimes a phrase, letter or digit.
    • Analyzer - the part of Lucene that transforms a field into tokens.
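
    To make the four terms above concrete, here is a toy model of how they relate. It uses only standard .NET collections and is not the Lucene.net API: the class name TagIndexSketch, its methods, and the whitespace/comma splitting rule are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy model of Document / Field / Token / Analyzer. NOT the Lucene.net API.
public static class TagIndexSketch
{
    // Plays the role of an Analyzer: turns one field value into lowercase tokens.
    // The separator set is an assumption; real analyzers are configurable.
    public static List<string> Analyze(string fieldValue)
    {
        return fieldValue
            .Split(new[] { ' ', ',', ';' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(token => token.ToLowerInvariant())
            .ToList();
    }

    // Builds an inverted index for a single field:
    // token -> ids of the documents whose field value contains that token.
    public static Dictionary<string, SortedSet<int>> BuildIndex(
        IDictionary<int, string> fieldValueByDocId)
    {
        var index = new Dictionary<string, SortedSet<int>>();
        foreach (var entry in fieldValueByDocId)
        {
            foreach (var token in Analyze(entry.Value))
            {
                if (!index.TryGetValue(token, out var ids))
                {
                    ids = new SortedSet<int>();
                    index[token] = ids;
                }
                ids.Add(entry.Key);
            }
        }
        return index;
    }
}
```

    With two documents whose field values are "apples carrots" (id 1) and "Carrots" (id 2), BuildIndex maps "carrots" to ids 1 and 2 and "apples" to id 1. Searching a field then amounts to looking up tokens in such a map.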

    Please read this blog post about creating and using a Lucene.net index.

    I assume you are tagging blog posts. If I am totally wrong, please say so. In order to search for tags, you need to represent them as Lucene entities, namely as tokens inside a "tags" field.

    One way of doing so is to assign one Lucene document to each blog post. The document will have at least the following fields:

    • id: unique id of the blog post.
    • content: the text of the blog post.
    • tags: list of tags.

    Indexing: Whenever you add a tag to a post, remove a tag, or edit one, you will need to re-index the post. The Analyzer will transform the fields into their token representation.

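    // One document per blog post; `writer` is an existing IndexWriter.
    Document doc = new Document();
    // id: stored so it can be read back from a hit, but not searchable.
    doc.Add(new Field("id", i.ToString(), Field.Store.YES, Field.Index.NO));
    // content and tags: stored and passed through the Analyzer, so both are searchable.
    // `tags` is a single string holding all tags of the post.
    doc.Add(new Field("content", text, Field.Store.YES, Field.Index.TOKENIZED));
    doc.Add(new Field("tags", tags, Field.Store.YES, Field.Index.TOKENIZED));
    writer.AddDocument(doc);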
    

    The remaining part is retrieval. For this, you need to create a QueryParser and pass it a query string, like this:

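    // `analyzer` and `searcher` are assumed to exist already.
    // "content" is the default field (a choice made here); a prefix such as tags: in s overrides it.
    QueryParser qp = new QueryParser("content", analyzer);
    Query q = qp.Parse(s);
    Hits hits = searcher.Search(q);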
    

    The syntax you need for s will be:

    tags:apples tags:carrots
    

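    This searches for posts tagged apples or carrots (terms are combined with OR by default). To search for posts tagged carrots but not apples:

    tags:carrots NOT tags:apples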
    

    See the Lucene Query Parser Syntax for details on constructing s.
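
    As a rough picture of what the two queries above select, here is a small sketch that evaluates them as set operations over a tag-to-posts map. It is a toy model under stated assumptions, not how Lucene.net is implemented: TagQuerySketch, AnyOf, and WithButNot are hypothetical names, and the map stands in for the indexed "tags" field.

```csharp
using System.Collections.Generic;

// Toy evaluation of tag queries as set operations. NOT the Lucene.net API.
public static class TagQuerySketch
{
    // Models "tags:a tags:b": posts that carry at least one of the given tags.
    public static SortedSet<int> AnyOf(
        IDictionary<string, SortedSet<int>> postsByTag, params string[] tags)
    {
        var result = new SortedSet<int>();
        foreach (var tag in tags)
        {
            if (postsByTag.TryGetValue(tag, out var ids))
            {
                result.UnionWith(ids);
            }
        }
        return result;
    }

    // Models "tags:wanted NOT tags:unwanted":
    // posts carrying `wanted`, minus every post that also carries `unwanted`.
    public static SortedSet<int> WithButNot(
        IDictionary<string, SortedSet<int>> postsByTag, string wanted, string unwanted)
    {
        var result = AnyOf(postsByTag, wanted);
        result.ExceptWith(AnyOf(postsByTag, unwanted));
        return result;
    }
}
```

    If apples maps to posts 1 and 2 and carrots maps to posts 2 and 3, AnyOf(map, "apples", "carrots") yields posts 1, 2, and 3, while WithButNot(map, "carrots", "apples") yields only post 3.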

  • 2020-12-08 02:02

    Lucene.NET seems to be mature, so there is no need to use Java or Solr.

    The standard Lucene query language allows equally ranked search terms and negation.

    So if your Lucene index had a field "tag", your query would be:

    tag:apple* OR tag:carrot*
    

    This gives equal ranking to each term and ranks documents that have both tags higher.
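
    The ranking behaviour described above can be imitated with a deliberately simplified score: count how many of the query prefixes a post matches, then sort by that count. Lucene's real scoring formula is more involved, so treat this only as an illustration of "both tags beat one tag". TagRankingSketch, Score, and Rank are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy stand-in for relevance ranking. NOT Lucene's actual scoring.
public static class TagRankingSketch
{
    // Number of query prefixes matched by at least one tag of the post.
    public static int Score(IEnumerable<string> postTags, IEnumerable<string> prefixes)
    {
        return prefixes.Count(prefix =>
            postTags.Any(tag => tag.StartsWith(prefix, StringComparison.OrdinalIgnoreCase)));
    }

    // Ids of posts matching at least one prefix, highest score first.
    // Ties are broken by id so the order is deterministic.
    public static List<int> Rank(
        IDictionary<int, string[]> tagsByPostId, params string[] prefixes)
    {
        return tagsByPostId
            .Select(entry => new { Id = entry.Key, Score = Score(entry.Value, prefixes) })
            .Where(scored => scored.Score > 0)
            .OrderByDescending(scored => scored.Score)
            .ThenBy(scored => scored.Id)
            .Select(scored => scored.Id)
            .ToList();
    }
}
```

    Calling Rank(posts, "apple", "carrot") plays the part of the prefix query tag:apple* OR tag:carrot*: a post tagged with both apples and carrots comes before posts that match only one of the two prefixes, and posts matching neither are left out.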

    To negate a tag, use this:

    tag:carrot* NOT tag:apple*
    

    A simple example showing indexing and querying with Lucene is here.
