shingles

Elasticsearch word proximity

佐手、 submitted on 2019-12-12 04:18:46
Question: In Elasticsearch, is there a way to increase the score of documents in which the query words are close to each other? This is not only about words that are directly adjacent, which could be solved by using shingles, but about words in proximity where there might be another, unimportant word in between. Example:

document 1: close words in documents detection
document 2: close words in detection documents
query: close documents

So I'd like to get a higher score for the first document and a
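One common way to approach this kind of proximity boosting is to combine a plain `match` clause (which decides what matches) with an optional `match_phrase` clause that carries a `slop` value (which rewards documents whose terms sit close together). The sketch below only builds the request body as a Python dictionary; the helper name `proximity_boost_query`, the field name `body`, and the `slop` and `boost` values are illustrative assumptions, not taken from the question.

```python
import json


def proximity_boost_query(text, field="body", slop=5, boost=2.0):
    """Build a search body that matches on terms and boosts by proximity.

    Hypothetical helper: the field name, slop, and boost are placeholders
    to be tuned for the actual index.
    """
    return {
        "query": {
            "bool": {
                # Required clause: the document must match the query terms.
                "must": {"match": {field: {"query": text}}},
                # Optional clause: adds score when the terms occur within
                # `slop` position moves of each other, more when closer.
                "should": {
                    "match_phrase": {
                        field: {"query": text, "slop": slop, "boost": boost}
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(proximity_boost_query("close documents"), indent=2))
```

With a large enough `slop`, both example documents would satisfy the `should` clause, and the one with fewer words between "close" and "documents" would be expected to receive the larger contribution from it. How much that changes the final ranking depends on the chosen `boost` and on the rest of the scoring.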

ClassNotFoundException in Hadoop

时光毁灭记忆、已成空白 submitted on 2019-12-12 03:37:03
Question: Using Hadoop MapReduce, I am writing code to get substrings of different lengths. For example, given the string "ZYXCBA" and length 3, my code has to return all possible substrings of length 3 ("ZYX", "YXC", "XCB", "CBA"), of length 4 ("ZYXC", "YXCB", "XCBA"), and finally of length 5 ("ZYXCB", "YXCBA"). In the map phase I did the following: key = length of the substrings I want, value = "ZYXCBA". So the mapper output is:

3, "ZYXCBA"
4, "ZYXCBA"
5, "ZYXCBA"

In the reduce phase I take the string ("ZYXCBA") and the key 3 to get all substrings of length 3. Same
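The map and reduce steps described above can be sketched in plain Python, outside Hadoop, to make the intended output concrete. This is only an illustration of the sliding-window logic; the function names `map_phase` and `reduce_phase` are hypothetical and do not correspond to the asker's Java classes.

```python
def map_phase(text, lengths):
    """Emit one (length, text) pair per requested substring length."""
    return [(k, text) for k in lengths]


def reduce_phase(k, text):
    """Return every contiguous substring of `text` that has length `k`."""
    # A window of size k can start at indices 0 .. len(text) - k.
    return [text[i:i + k] for i in range(len(text) - k + 1)]


if __name__ == "__main__":
    for k, value in map_phase("ZYXCBA", [3, 4, 5]):
        print(k, reduce_phase(k, value))
    # 3 ['ZYX', 'YXC', 'XCB', 'CBA']
    # 4 ['ZYXC', 'YXCB', 'XCBA']
    # 5 ['ZYXCB', 'YXCBA']
```

If `k` is larger than the string, the range is empty and the reducer returns an empty list, which seems like a reasonable default for this sketch.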