full-text-indexing

Forgot to close the Lucene IndexWriter after adding Documents to the index

可紊 posted on 2019-12-10 09:27:13
Question: I had a program running for 2 days to build a Lucene index for around 160 million text files. After the program ended, I tried searching the index and found that it was not built correctly: indexReader.numDocs() returned 0. I checked the index directory and it looked good; all the index data seemed to be there, and the directory is 1.5 gigabytes in size. I checked my code and found that I forgot to call indexWriter.optimize() and indexWriter.close(). I want to know if it is possible to re

How should I do full-text searching on App Engine?

淺唱寂寞╮ posted on 2019-12-09 10:53:28
Question: What should I do for fast, full-text searching on App Engine with as little work as possible (and as little Java as possible, since I'm using Python)? Answer 1: I have used Whoosh with App Engine in one of my recent projects and it seems to work fine. Have a look at https://github.com/tallstreet/Whoosh-AppEngine Answer 2: GAE has announced plans to offer full-text searching natively in the Datastore soon. Source: https://stackoverflow.com/questions/4130813/how-should-i-do-full-text-searching-on-app-engine

Problems creating a full text index on a view

|▌冷眼眸甩不掉的悲伤 posted on 2019-12-08 15:55:55
Question: I have a view which has been created like this: CREATE VIEW [dbo].[vwData] WITH SCHEMABINDING AS SELECT [DataField1] , [DataField2] , [DataField3] FROM dbo.tblData When I try to create a full-text index on it, like this: CREATE FULLTEXT INDEX ON [dbo].[vwData]( [DataField] LANGUAGE [English]) KEY INDEX [idx_DataField] ON ([ft_cat_Server], FILEGROUP [PRIMARY]) WITH (CHANGE_TRACKING = AUTO, STOPLIST = SYSTEM) I get this error: View 'dbo.vwData' is not an indexed view. Full-text index is not

Solr “real time” get - How to include 'text' field?

你。 posted on 2019-12-08 12:25:28
Question: Is it possible to retrieve the "text" field when performing a "real time" get? When I perform a /get request, the returned JSON does not contain the content of the 'text' field. When I perform a search (/select request), the returned JSON does contain the content of the 'text' field. Here is an example where the id is 123: The search request http://localhost:8984/solr/real/select?q=id:123 returns: { "responseHeader":{ "zkConnected":true, "status":0, "QTime":4, "params":{ "q":"id:123"}},

Mixed queries against full-text index

為{幸葍}努か posted on 2019-12-07 18:23:18
Question: I am using SQL Server 2012. A table has a text column and a date column. The text column has a full-text index. The query issues CONTAINS against the full-text column, but it also needs to include a greater-than condition on the date column. I am concerned about the performance of SQL Server merging results from the b-tree and full-text indexes. In Oracle, the performance aspect of this scenario is addressed by including "normal" columns (that are not subject to full-text search) into a full-text index

Should I use Lucene.Net for full text search with SQL Compact Edition 4, or is there a better option?

女生的网名这么多〃 posted on 2019-12-07 07:47:46
Question: I'm trying to create a full-text search facility for a small blog which is running against a SQL Compact Edition 4 database. There seems to be almost no information out there about this (though I'd be happy if someone can prove me wrong), but as far as I can gather, SQL CE doesn't support the normal SQL Server full-text indexing. I have briefly looked into using Lucene.Net, but it seems quite complex at first glance; would this be my best option here, or is there a simpler solution which I'm

How can I do indexing .html files in SOLR

|▌冷眼眸甩不掉的悲伤 posted on 2019-12-07 07:37:04
Question: The files I want to index are stored on the server (I don't need to crawl) under /path/to/files/. A sample HTML file is: <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> <meta name="product_id" content="11"/> <meta name="assetid" content="10001"/> <meta name="title" content="title of the article"/> <meta name="type" content="0xyzb"/> <meta name="category" content="article category"/> <meta name="first" content="details of the article"/> <h4>title of the article</h4> <p class

MongoDB full text search - matching words and exact phrases

与世无争的帅哥 posted on 2019-12-06 14:29:38
Question: I'm currently having some issues with the full-text search functionality in MongoDB, specifically when trying to match exact phrases. I'm testing out the functionality in the mongo shell, but ultimately I'll be using Spring Data MongoDB with Java. So I first tried running this command to search for the words "delay", "late" and the phrase "on time": db.mycollection.find( { $text: { $search: "delay late \"on time\"" } }).explain(true); And the resulting explain query told me: "parsedTextQuery"
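To make the structure of that search string concrete, here is a minimal Python sketch that separates the double-quoted phrase from the individual words. It is not MongoDB's parser: it does no stemming, stop-word removal, or negation handling, and the helper name `split_search_string` and the dictionary keys are hypothetical, chosen only for illustration.

```python
import re

def split_search_string(search):
    """Separate double-quoted phrases from the remaining individual words.

    Hypothetical helper for illustration only; it does not reproduce
    MongoDB's own text-query parsing.
    """
    # Capture the text inside each pair of double quotes.
    phrases = re.findall(r'"([^"]+)"', search)
    # Blank out the quoted parts, then split what is left on whitespace.
    remainder = re.sub(r'"[^"]*"', " ", search)
    words = remainder.split()
    return {"words": words, "phrases": phrases}

# The shell string "delay late \"on time\"" is this Python literal:
print(split_search_string('delay late "on time"'))
# {'words': ['delay', 'late'], 'phrases': ['on time']}
```

The escaped quotes in the shell command exist only so that the phrase's quotation marks survive inside the outer string literal.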

How can I set up Solr to tokenize on whitespace and punctuation?

三世轮回 posted on 2019-12-06 09:16:53
I have been trying to get my Solr schema (using Solr 1.3.0) to create terms that are tokenized by whitespace and punctuation. Here are some examples of what I would like to see happen: terms given -> terms tokenized foo-bar -> foo,bar one2three4 -> one2three4 multiple words/and some-punctuation -> multiple,words,and,some,punctuation I thought that this combination would work: <fieldType name="text" class="solr.TextField" positionIncrementGap="100"> <analyzer type="index"> <tokenizer class="solr.WhitespaceTokenizerFactory"/> <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"/
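The target behaviour in the three examples can be pinned down outside Solr with a short Python sketch. This is not Solr's analyzer chain, and it says nothing about how WordDelimiterFilterFactory should be configured; the function name `tokenize` and the ASCII-only regular expression are assumptions made for illustration.

```python
import re

def tokenize(text):
    """Split on whitespace and punctuation, keeping letter/digit runs intact.

    Illustrative only: assumes ASCII input and is not a Solr component.
    """
    # [A-Za-z0-9]+ matches maximal runs of ASCII letters and digits, so any
    # whitespace or punctuation character acts as a separator, while a
    # letter-to-digit transition (as in "one2three4") does not.
    return re.findall(r"[A-Za-z0-9]+", text)

examples = ["foo-bar", "one2three4", "multiple words/and some-punctuation"]
for text in examples:
    print(text, "->", ",".join(tokenize(text)))
```

A reference function like this can serve as a checklist when inspecting what a given Solr field type actually emits.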

MySql Fulltext search using 2 character word

眉间皱痕 posted on 2019-12-06 04:39:25
Question: I've set ft_min_word_len = 1, and running show variables like 'ft%'; shows the same value. I have also already rebuilt the full-text indexes by dropping and re-creating them. But when I run SELECT OriginalProductName FROM products WHERE MATCH (ProductName) AGAINST ('+samsung +tv' IN BOOLEAN MODE); against a row that has "Samsung Hg55nc890xf 3d 1080p Led lcd Tv Hdtv" as its value, it returns 0 results. It works as expected when I execute SELECT OriginalProductName FROM products WHERE MATCH