term

How to create a document term matrix using native R

匆匆过客 posted on 2019-12-11 09:09:26
Question: I want to create a document-term matrix using native R (without additional packages such as tm). The data is structured as follows:
  Doc1: the test was to test the test
  Doc2: we did prepare the exam to test the exam
  Doc3: was the test the exam
  Doc4: the exam we did prepare was to test the test
  Doc5: we were successful so we all passed the exam
What I want to obtain is the following:
     Term    Doc1  Doc2  Doc3  Doc4  Doc5  DF
  1  all        0     0     0     0     1   1
  2  did        0     1     0     1     0   2
  3  exam       0     2     1     1     1   4
  4  passed     0     0     0     0     1   1
Answer 1:
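
The answer from the thread is cut off in this listing. As an illustration of the counting logic only, here is a minimal sketch in Python rather than base R. The function name document_term_matrix and the dictionary layout are assumptions made for this example; the documents are the five from the question.

```python
from collections import Counter

# Documents copied from the question.
docs = {
    "Doc1": "the test was to test the test",
    "Doc2": "we did prepare the exam to test the exam",
    "Doc3": "was the test the exam",
    "Doc4": "the exam we did prepare was to test the test",
    "Doc5": "we were successful so we all passed the exam",
}


def document_term_matrix(docs):
    """Map each term to its per-document counts and its document frequency (DF)."""
    # One Counter per document; missing terms count as 0.
    per_doc = {name: Counter(text.split()) for name, text in docs.items()}
    # Sorted vocabulary across all documents.
    vocabulary = sorted(set().union(*per_doc.values()))
    matrix = {}
    for term in vocabulary:
        counts = [per_doc[name][term] for name in docs]
        # DF = number of documents in which the term occurs at least once.
        matrix[term] = {"counts": counts, "df": sum(1 for c in counts if c > 0)}
    return matrix


dtm = document_term_matrix(docs)
for term, row in dtm.items():
    print(term, *row["counts"], row["df"])
```

The same three steps (split each document into words, tabulate counts per document, count the non-zero entries per term) carry over to base R, where splitting and tabulating would be done with R's own functions.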

ElasticSearch - issue with sub term aggregation with array fields

半腔热情 posted on 2019-12-10 18:18:21
Question: I have the following two documents:
  { "title": "The Avengers", "year": 2012, "casting": [ { "name": "Robert Downey Jr.", "category": "Actor" }, { "name": "Chris Evans", "category": "Actor" } ] }
and:
  { "title": "The Judge", "year": 2014, "casting": [ { "name": "Robert Downey Jr.", "category": "Producer" }, { "name": "Robert Duvall", "category": "Actor" } ] }
I would like to perform aggregations based on two fields: casting.name and casting.category. I tried with a TermsAggregation based on casting

elasticsearch boost importance of exact phrase match

倾然丶 夕夏残阳落幕 posted on 2019-12-08 23:05:44
Question: Is there a way in Elasticsearch to boost the importance of the exact phrase appearing in the document? For example, if I search for the phrase "web developer", documents in which the words "web developer" appear together would be boosted by 5 compared to documents in which "web" and "developer" appear separately. That way, any document that contains "web developer" as a phrase would appear first in the results.
Answer 1: You can combine different queries together using a bool query, and
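
The answer is truncated in this listing, so the following is only one plausible reading of "combine different queries using a bool query": a must clause that matches the individual words and a should clause that adds a boosted match_phrase. It is a minimal sketch in Python that just builds the request body. The field name "content" and the helper name phrase_boosted_query are hypothetical, and the boost of 5 is taken from the question.

```python
import json


def phrase_boosted_query(field, text, phrase_boost=5):
    """Build a bool query body: match on the words, plus a boosted exact-phrase clause."""
    return {
        "query": {
            "bool": {
                # Documents must contain at least one of the words.
                "must": [{"match": {field: text}}],
                # Documents containing the exact phrase get an extra, boosted score.
                "should": [
                    {"match_phrase": {field: {"query": text, "boost": phrase_boost}}}
                ],
            }
        }
    }


# "content" is a hypothetical field name used only for illustration.
body = phrase_boosted_query("content", "web developer")
print(json.dumps(body, indent=2))
```

Because the should clause only adds to the score, documents without the exact phrase still match; they simply rank lower than documents that contain it.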

elasticsearch disable term frequency scoring

Deadly posted on 2019-12-08 17:36:26
Question: I want to change the scoring in Elasticsearch so that multiple appearances of a term are not counted. For example, I want "texas texas texas" and "texas" to receive the same score. I found this mapping, which Elasticsearch said would disable term frequency counting, but my searches still do not produce the same score:
  "mappings":{ "business": { "properties" : { "name" : { "type" : "string", "index_options" : "docs", "norms" : { "enabled": false}} } } } }
Any help will be appreciated

Accessing words around a positional match in Lucene

爷,独闯天下 posted on 2019-12-08 09:18:11
Question: Given a term match in a document, what’s the best way to access the words around that match? I have read this article, http://searchhub.org//2009/05/26/accessing-words-around-a-positional-match-in-lucene/, but the Lucene API has changed completely since that post (2009). Could someone point me to how to do this in a newer version of Lucene, such as Lucene 4.6.1? EDIT: I have figured this out now (The postings APIs (TermEnum, TermDocsEnum, TermPositionsEnum) have been removed in favor of

Wordpress: hierarchical list of taxonomy terms

别等时光非礼了梦想. posted on 2019-12-08 03:54:34
Question: I am hitting a wall here, although it sounds pretty simple: I want to return a hierarchical list of custom post type taxonomy terms. What I get is the first level of terms and nested uls, but the sub-terms are not showing. Any ideas? Here's the code:
  function return_terms_index() {
    $taxonomies = array( 'taxonomy_name', );
    $args = array(
      'orderby' => 'name',
      'order' => 'ASC',
      'hide_empty' => false,
      'fields' => 'all',
      'parent' => 0,
      'hierarchical' => true,
      'child_of' => 0,
      'pad_counts' => false

Is it possible in WordPress to test for an empty term or category?

混江龙づ霸主 posted on 2019-12-08 03:25:57
Question: I have a project that requires me to list the available terms for each custom post type and visually indicate, via CSS/JavaScript, which of the terms/categories are empty. Is there a way to return a list of terms/categories and, say, add a class to the empty ones? Thanks for any and all assistance.
Answer 1: Yes, there is. First, get your terms using get_terms() (I'm assuming your CPT has a taxonomy associated with it):
  <?php $custom_terms = get_terms('my_taxonomy'); if (is_array($custom_terms) &&

Term, nested documents and must_not query incompatible in ElasticSearch?

不打扰是莪最后的温柔 posted on 2019-12-06 23:55:21
Question: I have trouble combining term and must_not queries on nested documents. A Sense example can be found here: http://sense.qbox.io/gist/be436a1ffa01e4630a964f48b2d5b3a1ef5fa176 Here is my mapping:
  { "mappings": { "docs" : { "properties": { "tags" : { "type": "nested", "properties" : { "type": { "type": "string", "index": "not_analyzed" } } }, "label" : { "type": "string" } } } } }
with two documents in this index:
  { "tags" : [ {"type" : "POST"}, {"type" : "DELETE"} ], "label" : "item 1" }, { "tags" :

How to find most used phrases in elasticsearch?

我是研究僧i posted on 2019-12-06 00:56:04
Question: I know that you can find the most used terms in an index using facets. For example, for the following inputs:
  "A B C" "AA BB CC" "A AA B BB" "AA B"
the term facet returns this:
  B:3 AA:3 A:2 BB:2 CC:1 C:1
But I'm wondering whether it is possible to list the following:
  AA B:2 A B:1 BB CC:1 ....etc...
Is there such a feature in Elasticsearch?
Answer 1: As mentioned in ramseykhalaf's comment, a shingle filter would produce tokens of length "n" words.
  "settings" : { "analysis" : { "filter" : { "shingle":{ "type":
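
To make the idea of shingles concrete, here is a minimal sketch in Python that emulates two-word shingles on the four inputs from the question and counts them. It does not call Elasticsearch and does not reproduce the truncated settings above; the helper name word_shingles and the whitespace tokenization are assumptions made for this example.

```python
from collections import Counter


def word_shingles(text, size=2):
    """Return all runs of `size` consecutive whitespace-separated words."""
    tokens = text.split()
    return [" ".join(tokens[i:i + size]) for i in range(len(tokens) - size + 1)]


# Inputs copied from the question.
inputs = ["A B C", "AA BB CC", "A AA B BB", "AA B"]

# Count every two-word shingle across all inputs.
phrase_counts = Counter(s for text in inputs for s in word_shingles(text))
for phrase, count in phrase_counts.most_common():
    print(f"{phrase}:{count}")
```

Counting these shingle tokens is what a terms facet over a shingled field would do on the index side, which is why the phrase "AA B" comes out with a count of 2 here.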

Term, nested documents and must_not query incompatible in ElasticSearch?

一笑奈何 posted on 2019-12-05 04:09:29
I have trouble combining term and must_not queries on nested documents. A Sense example can be found here: http://sense.qbox.io/gist/be436a1ffa01e4630a964f48b2d5b3a1ef5fa176 Here is my mapping:
  { "mappings": { "docs" : { "properties": { "tags" : { "type": "nested", "properties" : { "type": { "type": "string", "index": "not_analyzed" } } }, "label" : { "type": "string" } } } } }
with two documents in this index:
  { "tags" : [ {"type" : "POST"}, {"type" : "DELETE"} ], "label" : "item 1" }, { "tags" : [ {"type" : "POST"} ], "label" : "item 2" }
When I query this index like this:
  { "query": { "nested":