shingles

elasticsearch word proximity

Question: In Elasticsearch, is there a way to increase the score of documents where the query words are close to each other in the document? It's not only about words that are directly adjacent, as this could be solved by using shingles, but about words that are in proximity, where there might be another unimportant word in between. Example:

document 1: close words in documents detection
document 2: close words in detection documents
query: close documents

So I'd like to get a higher score for the first document and a
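One commonly used approach to this kind of proximity boosting is a `bool` query whose `must` clause matches the words and whose `should` clause is a `match_phrase` query with `slop`, so that documents where the words sit closer together pick up extra score. The sketch below only builds the request body as a Python dictionary. The field name `body`, the helper name `proximity_boost_query`, and the default `slop` and `boost` values are assumptions made for illustration and do not come from the question.

```python
import json


def proximity_boost_query(field, text, slop=3, boost=2.0):
    """Build an Elasticsearch request body (hypothetical helper).

    - `must`: every query word has to occur in the field.
    - `should`: a sloppy phrase match adds score when the words occur
      within `slop` position moves of each other.
    """
    return {
        "query": {
            "bool": {
                "must": {
                    "match": {field: {"query": text, "operator": "and"}}
                },
                "should": {
                    "match_phrase": {
                        field: {"query": text, "slop": slop, "boost": boost}
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    # "body" is an assumed field name; replace it with the real mapping.
    print(json.dumps(proximity_boost_query("body", "close documents"), indent=2))
```

The resulting JSON could be sent to a `_search` endpoint. Whether the chosen `slop` and `boost` values give the desired ranking depends on the index and analyzer, so they would need tuning against real data.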

ClassNotFoundException in Hadoop

Question: Using Hadoop MapReduce, I am writing code to get substrings of different lengths. For example, given the string "ZYXCBA" and length 3, my code has to return all possible substrings of length 3 ("ZYX", "YXC", "XCB", "CBA"), then length 4 ("ZYXC", "YXCB", "XCBA"), and finally length 5 ("ZYXCB", "YXCBA"). In the map phase I did the following: key = length of the substrings I want, value = "ZYXCBA". So the mapper output is:

3,"ZYXCBA"
4,"ZYXCBA"
5,"ZYXCBA"

In the reduce phase I take the string ("ZYXCBA") and the key 3 to get all substrings of length 3. Same
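The map and reduce steps described in the question can be sketched in plain Python, without any Hadoop API, to make the intended data flow concrete. The function names `map_phase`, `reduce_phase`, and `substrings_of_length` are hypothetical. This illustrates only the substring logic and does not address the `ClassNotFoundException` named in the title.

```python
def substrings_of_length(text, length):
    """Return every contiguous substring of `text` with exactly `length` chars."""
    return [text[i:i + length] for i in range(len(text) - length + 1)]


def map_phase(text, lengths):
    """Emit one (length, text) pair per requested substring length."""
    return [(length, text) for length in lengths]


def reduce_phase(length, values):
    """For one key (a length), collect the substrings of every input string."""
    result = []
    for text in values:
        result.extend(substrings_of_length(text, length))
    return result


if __name__ == "__main__":
    for key, value in map_phase("ZYXCBA", [3, 4, 5]):
        print(key, reduce_phase(key, [value]))
```

In a real job the framework would group the mapper output by key before calling the reducer. Here each pair is passed to `reduce_phase` directly because every key occurs only once.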
