full-text-indexing

Lucene.Net Best Practices

筅森魡賤 submitted on 2019-12-02 14:30:20
What are the best practices for using Lucene.Net? Or where can I find a good Lucene.Net usage sample? Razzie: If you're going to work with Lucene, I'd buy a good book that covers it from A to Z. Lucene has a very steep learning curve (in my opinion). It's not only knowing how to search your data that's important; it's also about indexing it. Doing a basic search is easy, but creating an index that consists of millions of records and still being able to do a lightning-fast search over it is possible but pretty hard. There's no tutorial that teaches you that. I'd recommend Lucene in Action,

Fulltext Indexing on MyISAM, single column vs multiple column indexing

旧巷老猫 submitted on 2019-12-02 12:04:05
I have an extremely large table (4M+ rows) that takes more than 40 GB of disk space (14 GB data and 28 GB index). I need fulltext search on multiple fields, both combined and separately; that is, it must be possible to run a fulltext search on a single column and on several columns together, like below.
For combined search: SELECT `column_a`, `column_b` FROM `table_1` WHERE MATCH (`column_a`, `column_c`, `column_x`) AGAINST ('+search_query*' IN BOOLEAN MODE);
For separate search: SELECT `column_a`, `column_b` FROM `table_1` WHERE MATCH (`column_a`) AGAINST ('+search_query*' IN BOOLEAN MODE);

Compound FULLTEXT index in MySQL

↘锁芯ラ submitted on 2019-12-02 04:17:26
Question: I would like to build a system which allows searching user messages by a specific user. Assume the following table: create table messages(user_id int, message nvarchar(500)); So what kind of index should I use here if I want to search for all messages from user 1 containing the word 'foo'? A simple, non-unique index on user_id: it will filter only that user's messages and then do a full scan for the specific word. A FULLTEXT index on message: this will find all matching messages from all users and then filter by ID,

MySQL MATCH AGAINST when searching e-mail addresses

╄→гoц情女王★ submitted on 2019-12-01 00:38:15
I am writing a newsletter script and I need to implement searching in the addresses. I indexed the table with FULLTEXT, but when I run a query such as: SELECT * FROM addresses WHERE MATCH(email) AGAINST("name@example.com" IN BOOLEAN MODE) I get strange results. It displays all emails on "example.com" and all emails with user "name". For example I get: john@example.com, name@mail.net, steven@example.com. I rewrote the query to use LIKE "%name@example.com%", but for a big table it takes a ridiculous amount of time to complete. Is there a solution for this? I want when searching to show only full
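One plausible explanation for the results described above is that the full-text parser splits the address into separate words at `@` and `.`, so the query behaves like a search for any of those words. The Python sketch below is a simplified model of that behaviour, not MySQL's actual parser; the helpers `tokenize`, `matches_any` and `matches_exact`, the sample rows, and the minimum word length of 4 are all assumptions made for illustration.

```python
import re

MIN_WORD_LEN = 4  # assumed minimum indexed word length (illustrative only)

def tokenize(text):
    """Split on anything that is not a letter, digit, underscore or apostrophe,
    then drop words shorter than MIN_WORD_LEN."""
    words = re.split(r"[^A-Za-z0-9_']+", text.lower())
    return {w for w in words if len(w) >= MIN_WORD_LEN}

def matches_any(query, rows):
    """Rows that share at least one indexed word with the query."""
    query_words = tokenize(query)
    return [row for row in rows if tokenize(row) & query_words]

def matches_exact(query, rows):
    """Rows equal to the whole query string, ignoring case."""
    return [row for row in rows if row.lower() == query.lower()]

# Hypothetical sample data
addresses = [
    "john@example.com",
    "name@mail.net",
    "steven@example.com",
    "name@example.com",
    "bob@other.org",
]

print(matches_any("name@example.com", addresses))
print(matches_exact("name@example.com", addresses))
```

In this model the word-level match returns every address that contains either "name" or "example", which mirrors the symptom in the question. In boolean mode, a phrase search (the address wrapped in double quotes inside the search string) is the usual way to require the words to appear together; whether it fits this table is something to verify against the MySQL documentation.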

Neo4j auto-index, legacy index and label schema: differences for a relative-to-a-node full-text search

蹲街弑〆低调 submitted on 2019-11-30 18:38:28
Question: This question is partially answered in neo4j-legacy-indexes-and-auto-index-vs-new-label-bases-schema-indexes and the-difference-between-legacy-indexing-auto-indexing-and-the-new-indexing-approach. I can't comment on them yet, so I am starting a new thread here. In my db, I have a legacy index 'topic' and a label 'Topic'. I know that: a. the pattern MATCH (n:Label) will scan the nodes; b. the pattern START (n:Index) will search on the legacy index; c. auto-index is a sort of legacy index and should give me the same results

Create fulltext index within Entity Framework Coded Migrations

依然范特西╮ submitted on 2019-11-30 17:37:41
TL;DR: How do you add a full-text index using Entity Framework 5 coded migrations? I'm having issues adding a full-text index to a database using Entity Framework migrations. It needs to be there from the start, so I'm attempting to modify the InitialCreate migration that was automatically generated to add it. As there isn't a way to do it via the DbMigrations API, I've resorted to running inline SQL at the end of the 'Up' code. Sql("create fulltext catalog AppNameCatalog;"); Sql("create fulltext index on Document (Data type column Extension) key index [PK_dbo.Document] on AppNameCatalog;"); When

Warning: A long semaphore wait

房东的猫 submitted on 2019-11-30 08:32:52
For the past 4 days I have had massive problems with my nightly updates, except for 1 night in between where it all went fine. During these updates I update a couple of fulltext indexes. I do it in this manner:
1. Drop the fulltext index
2. Update the fulltext table
3. Add the fulltext index
This has been working perfectly for over 2 years. The usual update time was around 3-4 hours, which was normal for the amount of data that is updated each night. But since Friday the update times have been between 9 and 12 hours! Last night the server was intentionally crashed by the engine, this was in the

How to get frequently occurring phrases with Lucene

纵然是瞬间 submitted on 2019-11-30 07:17:09
Question: I would like to get some frequently occurring phrases with Lucene. I am getting some information from TXT files, and I am losing a lot of context because I have no information about phrases, e.g. "information retrieval" is indexed as two separate words. What is the way to get phrases like this? I cannot find anything useful on the internet; all advice, links, hints and especially examples are appreciated! EDIT: I store my documents just by title and content: Document doc = new Document(); doc.add
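The idea of keeping "information retrieval" together can be illustrated by counting word n-grams (sometimes called shingles) instead of single words. The Python sketch below only illustrates that idea and does not use Lucene's API; the function names `word_ngrams` and `frequent_phrases`, the `min_count` threshold, and the sample documents are assumptions. Lucene does ship a shingle filter for producing such tokens at analysis time, but how to wire it into the asker's setup is not covered here.

```python
import re
from collections import Counter

def word_ngrams(text, n=2):
    """Return all runs of n consecutive words in text, lowercased."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

def frequent_phrases(docs, n=2, min_count=2):
    """Count n-word phrases across docs and keep those seen at least min_count times."""
    counts = Counter()
    for doc in docs:
        counts.update(word_ngrams(doc, n))
    return [(phrase, c) for phrase, c in counts.most_common() if c >= min_count]

# Hypothetical sample documents
docs = [
    "Information retrieval with Lucene",
    "Lucene makes information retrieval fast",
    "An introduction to information retrieval",
]

print(frequent_phrases(docs))
```

With these three sample documents, the only two-word phrase that occurs more than once is "information retrieval". A real corpus would also need stop-word handling so that phrases like "of the" do not dominate the counts.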

Build an index for substring search?

守給你的承諾、 submitted on 2019-11-30 04:58:01
Question: I want to do general substring search among billions of strings. The requirement is a little different from general fulltext search because I want the query "ubst" to also hit "substr". Is Lucene or Sphinx capable of doing this? If not, what do you think is the best way to do this? Answer 1: The best index structure for this case is a suffix tree. Lucene does not implement this type of index, so its substring search is slow. But Lucene has a prefix tree index, which means you can do a fast search if you search
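To make the suffix-based idea in the answer concrete, here is a minimal Python sketch. It swaps in a sorted list of suffixes (a naive suffix array) for the suffix tree the answer names, because it is shorter to write; the lookup principle is the same: every substring of a string is a prefix of one of its suffixes. The names `build_suffix_index` and `search` and the sample strings are assumptions, and this naive version stores every suffix in full, so it would not scale to billions of strings as written.

```python
import bisect

def build_suffix_index(strings):
    """Sorted list of (suffix, string_id) pairs for every suffix of every string."""
    entries = [(s[i:], sid) for sid, s in enumerate(strings) for i in range(len(s))]
    entries.sort()
    return entries

def search(index, strings, query):
    """Return the strings that contain query as a substring."""
    hits = set()
    # All suffixes that start with query form one contiguous run in sorted order;
    # the 1-tuple (query,) sorts just before any (suffix, id) with suffix >= query.
    i = bisect.bisect_left(index, (query,))
    while i < len(index) and index[i][0].startswith(query):
        hits.add(index[i][1])
        i += 1
    return [strings[sid] for sid in sorted(hits)]

# Hypothetical sample data
strings = ["substr", "lucene", "sphinx", "substring search"]
index = build_suffix_index(strings)

print(search(index, strings, "ubst"))
```

The query "ubst" finds both "substr" and "substring search" because each has a suffix beginning with "ubst". A production design would store suffix offsets rather than suffix copies, or use an n-gram index, to keep memory bounded.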

FullText search with CONTAINS on multiple columns and predicate - AND

邮差的信 submitted on 2019-11-30 04:51:35
I have a search table with, say, 4 columns of text data to search. I do something like this: SELECT * FROM dbo.SearchTable WHERE CONTAINS((col1, col2, col3, col4), 'term1 AND term2') It looks like CONTAINS only returns true if term1 and term2 are in the same column. Is there any way to specify that all columns should be included with an AND? If not, my idea is to serialize all search columns as JSON and store them in one column. That way I can full-text search them but still easily extract the individual columns in .NET. I'm presuming that the indexer won't have a problem with this and will dispense with the
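The difference between the behaviour the asker observes and the behaviour they want can be shown with a toy model. The Python sketch below is not SQL Server's implementation; it only contrasts "all terms in one column" with "all terms anywhere in the row", which is what a single combined column would give. The function names and the sample row are assumptions for illustration.

```python
def terms_in(text, terms):
    """True if every term occurs as a whole word in text (case-insensitive)."""
    words = set(text.lower().split())
    return all(term.lower() in words for term in terms)

def per_column_match(row, terms):
    """Models the observed behaviour: all terms must sit in the same column."""
    return any(terms_in(value, terms) for value in row.values())

def combined_match(row, terms):
    """Models the proposed workaround: search one column holding all the text."""
    return terms_in(" ".join(row.values()), terms)

# Hypothetical sample row
row = {"col1": "red bicycle", "col2": "mountain trail", "col3": "", "col4": ""}

print(per_column_match(row, ["red", "trail"]))  # terms are in different columns
print(combined_match(row, ["red", "trail"]))    # terms are both in the row
```

Here "red" and "trail" live in different columns, so the per-column check fails while the combined check succeeds. Whether the combined text is stored as JSON or as plain concatenated text is a separate design choice that affects how the indexer tokenizes it.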