fuzzy-search

SSIS fuzzy lookup with multiple outputs per lookup error

心已入冬 posted on 2019-12-11 05:48:35
Question: I have a pretty simple SSIS package with three components: OLE DB Source, Fuzzy Lookup, and OLE DB Destination. In the Fuzzy Lookup component's Advanced tab, I changed "Maximum number of matches to output per lookup" from 1 to 2. When I run the package after the change, I get this error message: [OLE DB Destination [57]] Error: SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x80004005. An OLE DB record is available. Source: "Microsoft SQL Native Client" Hresult:

How to find an analyzed term with a fuzzy (approximate) search in Lucene-3x?

泪湿孤枕 posted on 2019-12-11 01:58:43
Question: The query 'laser~' doesn't find 'laser'. I'm using Lucene's GermanAnalyzer to store documents in the index. I save two documents with "title" fields "laser" and "labor" respectively. Afterwards I perform the fuzzy query laser~. Lucene only finds the document that contains "labor". What is the Lucene-3x way to implement such searches? From a look at the Lucene source code, I guess that fuzzy searches are not designed to work with "analyzed" content, but I'm not sure whether this is the

Elasticsearch - Fuzzy, phrase, completion suggester and dashes

戏子无情 posted on 2019-12-10 20:37:25
Question: I have been asking separate questions to get the search functionality I want, but I am still falling short, so I thought I would just ask what people suggest as the optimal Elasticsearch settings, mappings, indexing and query structure for what I am looking for. I need a search-as-you-type solution that queries categories. If I type "mex", I want to get back results like "Mexican Restaurant", "Mexican Grocery Store", "Tex-Mex Restaurant" and "Medical

Find a series of data using non-exact measurements (fuzzy logic)

前提是你 posted on 2019-12-10 04:13:28
Question: This is a more complex follow-up question to: Efficient way to look up sequential values. Each Product can have many Segment rows (thousands). Each segment has a position column that starts at 1 for each product (1, 2, 3, 4, 5, etc.) and a value column that can contain any values, such as (323.113, 5423.231, 873.42, 422.64, 763.1, etc.). The data is read-only. It may help to think of the product as a song and the segments as a set of musical notes in the song. Given a subset of contiguous

How can I do fuzzy substring matching in Ruby?

社会主义新天地 posted on 2019-12-09 14:16:46
Question: I found lots of links about fuzzy matching, comparing one string to another and seeing which gets the highest similarity score. I have one very long string, which is a document, and a substring. The substring came from the original document, but has been converted several times, so weird artifacts might have been introduced, such as a space here, a dash there. The substring will match a section of the text in the original document 99% or more. I am not matching to see from which document this

Fuzzy Text Matching C#

半世苍凉 posted on 2019-12-09 04:35:41
Question: I'm writing a desktop UI (.NET WinForms) to help a photographer clean up his image metadata. There is a list of 66k+ phrases. Can anyone suggest a good open source/free .NET component that uses some sort of algorithm to identify potential candidates for consolidation? For example, there may be two or more entries which are actually the same word or phrase and differ only by whitespace, punctuation or even a slight misspelling. The application will ultimately rely on the user

Fuzzy matching multiple words in string

▼魔方 西西 posted on 2019-12-08 07:33:17
Question: I'm trying to use the Levenshtein distance to find fuzzy keywords (static text) on an OCR page. To do this, I want to give a percentage of errors that are allowed (say, 15%). string Keyword = "past due electric service"; Since the keyword is 25 characters long, I want to allow for 4 errors (25 * .15 rounded up). I need to be able to compare it to... string Entire_OCR_Page = "previous bill amount payment received on 12/26/13 thank you! current electric service total balances
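The idea in this question (allow a fixed share of errors and look for the keyword anywhere in a longer text) can be sketched with the approximate-substring variant of the Levenshtein dynamic program, in which a match may start at any position in the text at no cost. This is a minimal Python sketch and not the asker's C# code; the function names best_fuzzy_match and contains_fuzzy and the sample strings in the usage note are assumptions made for illustration.

```python
import math


def best_fuzzy_match(keyword, text):
    """Return (distance, end) for the substring of text closest to keyword.

    distance is the Levenshtein distance to the best substring and end is
    the number of characters of text consumed when that substring ends.
    """
    m = len(keyword)
    # col[i] = cost of matching keyword[:i] against a substring ending here.
    col = list(range(m + 1))
    best_dist, best_end = col[m], 0
    for j, ch in enumerate(text, start=1):
        new = [0] * (m + 1)  # new[0] = 0: a match may start anywhere for free
        for i in range(1, m + 1):
            cost = 0 if keyword[i - 1] == ch else 1
            new[i] = min(
                col[i - 1] + cost,  # match or substitution
                col[i] + 1,         # extra character in the text
                new[i - 1] + 1,     # keyword character missing from the text
            )
        if new[m] < best_dist:
            best_dist, best_end = new[m], j
        col = new
    return best_dist, best_end


def contains_fuzzy(keyword, text, error_rate=0.15):
    """True if some substring of text is within the allowed number of errors."""
    allowed = math.ceil(len(keyword) * error_rate)  # 25 * 0.15 rounds up to 4
    distance, _ = best_fuzzy_match(keyword, text)
    return distance <= allowed
```

For example, contains_fuzzy("past due electric service", "amount past due electrlc servicc total") is true, because the closest substring differs from the keyword by two substitutions. The running time grows with len(keyword) * len(text), which should be acceptable for a single OCR page.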

Fuzzy string matching using Levenshtein algorithm in Elasticsearch

时光毁灭记忆、已成空白 posted on 2019-12-08 06:44:50
Question: I have just started exploring Elasticsearch. I created a document as follows: curl -XPUT "http://localhost:9200/cities/city/1" -d' { "name": "Saint Louis" }' I then tried to do a fuzzy search on the name field with a Levenshtein distance of 5 as follows: curl -XGET "http://localhost:9200/_search" -d' { "query": { "fuzzy": { "name" : { "value" : "St. Louis", "fuzziness" : 5 } } } }' But it's not returning any match. I expect the Saint Louis record to be returned. How can I fix my query? Thanks.
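One way to see why this query is a hard case is to measure the edit distance directly. The sketch below is plain Python, not Elasticsearch code, and the function name levenshtein is my own. As far as I know, the fuzzy query accepts a maximum edit distance of 2 and compares the search value against individual indexed terms rather than the whole field, so treat the whole-string comparison here as an illustration of the distances involved, not as a description of what Elasticsearch computes internally.

```python
def levenshtein(a, b):
    """Classic Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j - 1] + cost,  # match or substitution
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
            ))
        prev = curr
    return prev[-1]


# The strings from the question are 4 edits apart, which is above 2.
distance = levenshtein("St. Louis", "Saint Louis")
```

Even the abbreviation alone, "st" versus "saint", is 3 edits away, so an abbreviation like this is probably better handled by something other than edit distance, for example a synonym mapping.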

fuzzy search with active record query interface

三世轮回 posted on 2019-12-08 02:32:23
Question: I have a fuzzy search in my Rails app. The SQL I want is this: select * from `user` where name like '%abc%' I tried to do it like this: name = 'abc' User.where("name like '%?%'", name) It failed; the console logged: select * from `user` where name like '%'abc'%' Finally I tried this: name = 'abc' User.where("name like ?", '%' + name + '%') It worked. But I don't think it is the Rails way. Is there a better way to do that? Answer 1: User.where("name REGEXP ?", 'regex_str') and regex
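The form that worked in the question binds the complete pattern, wildcards included, as a single parameter, so the driver quotes it once. The same idea is sketched below in Python with the standard sqlite3 module instead of Rails; the in-memory table, its rows, and the helper names escape_like and find_users are assumptions made for illustration. The sketch also escapes % and _ in the search term, which is a precaution the question does not mention.

```python
import sqlite3


def escape_like(term, escape="\\"):
    """Escape LIKE wildcards so the search term is matched literally."""
    return (
        term.replace(escape, escape + escape)
        .replace("%", escape + "%")
        .replace("_", escape + "_")
    )


def find_users(conn, term):
    """Return the names containing term, using one bound LIKE parameter."""
    pattern = "%" + escape_like(term) + "%"
    rows = conn.execute(
        "SELECT name FROM user WHERE name LIKE ? ESCAPE '\\' ORDER BY name",
        (pattern,),
    )
    return [name for (name,) in rows]


# Hypothetical sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (name TEXT)")
conn.executemany(
    "INSERT INTO user (name) VALUES (?)",
    [("xabcx",), ("abc",), ("100%",), ("def",)],
)
```

With this data, find_users(conn, "abc") returns the rows "abc" and "xabcx", and find_users(conn, "%") returns only "100%" because the wildcard in the input is escaped.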

Fuzzy Search on a Concatenated Full Name using NHibernate

亡梦爱人 posted on 2019-12-07 03:22:41
Question: I am trying to convert the following SQL into NHibernate: SELECT * FROM dbo.Customer WHERE FirstName + ' ' + LastName LIKE '%' + 'bob smith' + '%' I was trying to do something like this, but it is not working: name = "%" + name + "%"; var customers = _session.QueryOver<Customer>() .Where(NHibernate.Criterion.Restrictions.On<Customer>(c => c.FirstName + ' ' + c.LastName).IsLike(name)) .List(); What I'm basically trying to do is be able to search for a customer's name in a text box with the