Amazon CloudSearch: Filter if exists

走远了吗. Submitted on 2019-12-19 05:58:52

Question


I have an Amazon CloudSearch domain. The aim is to filter on the field 'language' only if it exists. Not all objects have a language: I want the ones that do have a language to be filtered, but the ones that have no language should also be returned.

I want to filter with ( or language:'en' language:null )

However null cannot be passed within a string.

Is this possible? If so, how would it be done?


Answer 1:


I looked elsewhere as well, and it seems:

The simplest way to do that is to set a default value for the field, and then use that value as your null.

For example, set the default to the string "null", then you can easily test for that.

I believe you can add a default value, and re-index, and that should reapply the default.
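
As a rough sketch of how the sentinel approach could look from the client side, the helper below builds a structured query that matches either the requested language or the sentinel default. The function name and the sentinel string "null" are illustrative assumptions, and the sketch only works if the language field really has that default configured and the domain has been re-indexed.

```python
def language_or_default(language: str, sentinel: str = "null") -> str:
    """Build a structured query matching `language` or the sentinel default.

    Assumes the `language` field was given `sentinel` as its default value
    (an assumption of this sketch, not something CloudSearch does for you).
    """
    def quote(value: str) -> str:
        # Escape backslashes and single quotes for the structured syntax.
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"

    return f"(or language:{quote(language)} language:{quote(sentinel)})"


print(language_or_default("en"))
```

The resulting string would be passed as the filter (or query) with the structured query parser.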




Answer 2:


If you are willing to use the Lucene query parser you can express your query like this:

(*:* OR -language:*) OR language:en

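Note: The funky (*:* OR ...) construct is necessary because a purely negative clause matches nothing on its own in Lucene: *:* matches every document, and the negated clause then removes the ones that have a language.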

In general, you can filter by existence / non-existence of a field with the Lucene query parser:

All documents containing field: field:[* TO *]

All documents not containing field: -field:[* TO *]

Note: If field is textual (text or literal datatypes) you don't need range queries and you can shorten the above to:

field:* and -field:*
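
A minimal sketch of how the Lucene form above might be assembled into request parameters, assuming the search is sent with q.parser=lucene. The helper names are hypothetical, and the value is assumed to be a simple token that needs no Lucene escaping.

```python
from urllib.parse import urlencode


def lucene_missing_or_equals(field: str, value: str) -> str:
    """Match documents where `field` is absent or equals `value`.

    `value` is assumed to be a plain token (no spaces or Lucene
    special characters); escaping is out of scope for this sketch.
    """
    return f"(*:* OR -{field}:*) OR {field}:{value}"


def search_params(field: str, value: str) -> str:
    # Encode the query and select the Lucene parser for this request.
    return urlencode({
        "q": lucene_missing_or_equals(field, value),
        "q.parser": "lucene",
    })


print(lucene_missing_or_equals("language", "en"))
print(search_params("language", "en"))
```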




Answer 3:


There is no way to cleanly do exactly what you want, but here are two options:

  1. Index a new field called something like has_language, setting its value to whether language is non-null at document submission time.
  2. Use a range query on the literal field: (range field=language [0,}). This is more of a hack because range should only be used with integers, but I have used it successfully on literal fields.
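
The first option could be sketched roughly as follows, computing has_language while building the document batch. The helper name, the sample documents, and the choice of an integer 1/0 flag are assumptions made for illustration; the has_language field would also need to be defined in the domain's indexing options.

```python
import json


def to_add_operation(doc_id: str, fields: dict) -> dict:
    """Build one 'add' operation for a document batch.

    Adds a hypothetical integer field `has_language` (1 or 0) so that
    the absence of `language` can be filtered on later.
    """
    # Drop None values so missing fields are simply omitted from the batch.
    cleaned = {k: v for k, v in fields.items() if v is not None}
    cleaned["has_language"] = 1 if "language" in cleaned else 0
    return {"type": "add", "id": doc_id, "fields": cleaned}


batch = [
    to_add_operation("doc1", {"title": "Hello", "language": "en"}),
    to_add_operation("doc2", {"title": "Hola", "language": None}),
]
print(json.dumps(batch, indent=2))
```

With such a field in place, a structured filter along the lines of (or language:'en' has_language:0) should return English documents plus those with no language.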



Answer 4:


You can search for existence by using the prefix or range operators depending on your field type. If the type is a term or a string then you can use prefix like so:

(prefix field=example '')

This will yield only results that are not null for the field example.

For dates you can use an inclusive date range:

(range field=updated ['0000-01-01T00:00:00.000Z',})

This will only include items with an updated date on or after the given time; items with a null updated date will not be included. You can do similar searches for other field types.

Similarly you can use the not operator to get the set of items with null fields.

For example, all items with a null example field:

(not (prefix field=example ''))
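
Putting these pieces together for the original question, one possible sketch is a small set of string builders that combine the prefix and not operators into "equals this value, or field is missing". The function names are hypothetical, and whether an empty prefix behaves as described for a particular field type is worth verifying on your own domain.

```python
def exists(field: str) -> str:
    """Structured query matching documents where `field` has any value."""
    return f"(prefix field={field} '')"


def missing(field: str) -> str:
    """Structured query matching documents where `field` is absent."""
    return f"(not {exists(field)})"


def equals_or_missing(field: str, value: str) -> str:
    """Match documents whose `field` equals `value` or is absent."""
    # Escape backslashes and single quotes inside the quoted value.
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"(or {field}:'{escaped}' {missing(field)})"


print(equals_or_missing("language", "en"))
```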


Source: https://stackoverflow.com/questions/26591773/amazon-cloudsearch-filter-if-exists
