I have an Amazon CloudSearch domain. The aim is to filter on the field 'language' only if it exists. Not all objects have a language: the ones that do should be filtered by it, but the ones that have no language at all should also be returned.
I want to filter with ( or language:'en' language:null )
However, null cannot be passed within a string.
Is this possible? If so, how would it be done?
I looked elsewhere as well, and it seems:
The simplest way to do that is to set a default value for the field, and then use that value in place of null.
For example, set the default to the string "null"; then you can easily test for that.
I believe you can add a default value and re-index, and that should reapply the default.
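A minimal sketch of the default-value approach in Python, assuming boto3 is used to configure the domain. The helpers below only build the request and the filter string. The domain name my-domain and both function names are hypothetical, and the actual AWS calls are left as comments because they need credentials.

```python
def language_field_definition(domain_name, default="null"):
    """Build kwargs for boto3's cloudsearch define_index_field call (not sent here)."""
    return {
        "DomainName": domain_name,
        "IndexField": {
            "IndexFieldName": "language",
            "IndexFieldType": "literal",
            # Documents submitted without a language get this placeholder.
            "LiteralOptions": {"DefaultValue": default},
        },
    }


def language_or_default_filter(language, default="null"):
    """Structured filter: the requested language, or the placeholder default.

    Assumes neither value contains a single quote.
    """
    return "(or language:'{}' language:'{}')".format(language, default)


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   cs = boto3.client("cloudsearch")
#   cs.define_index_field(**language_field_definition("my-domain"))
#   cs.index_documents(DomainName="my-domain")  # re-index, as suggested above
print(language_or_default_filter("en"))
```

The resulting string has the same shape as the filter in the question, with the quoted placeholder 'null' standing in for a real null.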
If you are willing to use the Lucene query parser you can express your query like this:
(*:* OR -language:*) OR language:en
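Note: The funky (*:* OR ...) construct is necessary because of the way Lucene treats negated OR clauses: a purely negative clause matches nothing on its own, so it needs a positive set (*:*, all documents) to subtract from.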
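As a rough illustration, the Lucene query above could be sent through boto3's cloudsearchdomain client by setting queryParser to lucene. The helper names and the endpoint placeholder are assumptions; only the query string comes from the answer.

```python
def lucene_language_or_missing(language, field="language"):
    """Match documents whose field equals `language` or that lack the field."""
    return "(*:* OR -{f}:*) OR {f}:{v}".format(f=field, v=language)


def lucene_search_kwargs(language):
    """Keyword arguments for the cloudsearchdomain client's search() call."""
    return {
        "query": lucene_language_or_missing(language),
        "queryParser": "lucene",
    }


# Hypothetical usage (requires boto3, credentials, and your own endpoint):
#   import boto3
#   client = boto3.client("cloudsearchdomain",
#                         endpoint_url="https://<your-search-endpoint>")
#   results = client.search(**lucene_search_kwargs("en"))
print(lucene_search_kwargs("en")["query"])
```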
In general, you can filter by existence / non-existence of a field with the Lucene query parser:
All documents containing field: field:[* TO *]
All documents not containing field: -field:[* TO *]
Note: If field is textual (text or literal datatypes), you don't need range queries and you can shorten the above to field:* and -field:*
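The existence and non-existence clauses can be generated by a pair of small helpers. This is only a sketch; the function names and the textual flag are made up for illustration.

```python
def lucene_has_field(field, textual=False):
    """Lucene clause matching documents that contain `field`."""
    # Textual fields (text/literal) can use the shorter wildcard form.
    return "{}:*".format(field) if textual else "{}:[* TO *]".format(field)


def lucene_missing_field(field, textual=False):
    """Lucene clause matching documents that do not contain `field`."""
    return "-" + lucene_has_field(field, textual)


print(lucene_has_field("updated"))
print(lucene_missing_field("language", textual=True))
```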
There is no way to cleanly do exactly what you want, but here are two options:
- Index a new field called something like has_language, setting its value to language!=null at doc submission time.
- This is more of a hack because range should only be used with integers, but I have used it successfully on literal fields: (range field=language [0,})
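The first option could look roughly like this at submission time. It is a sketch under assumptions: has_language is taken to be an int field holding 1 or 0 (the answer does not say which type to use), and the helper names and sample documents are invented.

```python
import json


def with_has_language(doc_id, fields):
    """Return a document-batch 'add' operation with a derived has_language flag."""
    enriched = dict(fields)
    # Assumption: has_language is indexed as an int field (1 = present, 0 = absent).
    enriched["has_language"] = 1 if fields.get("language") else 0
    return {"type": "add", "id": doc_id, "fields": enriched}


def language_or_none_filter(language):
    """Structured filter: the given language, or documents flagged as having none."""
    return "(or language:'{}' has_language:0)".format(language)


batch = [
    with_has_language("doc1", {"title": "Hello", "language": "en"}),
    with_has_language("doc2", {"title": "Untitled"}),
]
print(json.dumps(batch, indent=2))
print(language_or_none_filter("en"))
```

The batch would then be uploaded as usual, and the filter replaces the language:null test from the question with a test on the flag.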
You can search for existence by using the prefix or range operators, depending on your field type. If the type is a term or a string, then you can use prefix like so:
(prefix field=example '')
This will yield only results that are not null for the field example.
For dates you can use an inclusive date range:
(range field=updated ['0000-01-01T00:00:00.000Z',})
This will only include items with an updated date on or after the given time; items with a null updated date will not be included. You can do other similar searches for other field types.
Similarly, you can use the not operator to get the set of items with null fields.
For example, all items with a null example field:
(not (prefix field=example ''))
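Putting these operators together for the original question might look like the sketch below. The combined or-expression is my own assumption built from the pieces above and has not been verified against a live domain; the function names are hypothetical.

```python
def prefix_exists(field):
    """Structured clause: documents with any value in a term/string field."""
    return "(prefix field={} '')".format(field)


def prefix_missing(field):
    """Structured clause: documents with no value in the field."""
    return "(not {})".format(prefix_exists(field))


def language_or_missing(language, field="language"):
    """Documents in the given language, plus documents with no language at all.

    Assumes `language` contains no single quotes.
    """
    return "(or {f}:'{v}' {m})".format(f=field, v=language, m=prefix_missing(field))


# Intended for the structured parser, e.g. with boto3's cloudsearchdomain client:
#   client.search(query="matchall", queryParser="structured",
#                 filterQuery=language_or_missing("en"))
print(language_or_missing("en"))
```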
Source: https://stackoverflow.com/questions/26591773/amazon-cloudsearch-filter-if-exists