How can I filter documents that have a field which is an array and has more than N elements?
How can I filter documents that have a field which is an empty array?
Easiest way to do this is to "denormalize" your data so that you have a property that contains the count and a boolean if it exists or not. Then you can just search on those properties.
For example:
{
"id": 31939,
"hasAttachments": true,
"attachmentCount": 2,
"attachments": [
{
"type": "Attachment",
"name": "txt.txt",
"mimeType": "text/plain"
},
{
"type": "Inline",
"name": "jpg.jpg",
"mimeType": "image/jpeg"
}
]
}
Still posting to here for who stuck same situation with me. Let's say your data look like this:
{
"_source": {
"fieldName" : [
{
"f1": "value 11",
"f2": "value 21"
},
{
"f1": "value 12",
"f2": "value 22"
}
]
}
}
Then to filter fieldName
with length > 1 for example:
"query": {
"bool" : {
"must" : {
"script" : {
"script" : {
"inline": "doc['fieldName.f1'].values.length > 1",
"lang": "painless"
}
}
}
}
}
The script syntax is as ES 5.4 documentation https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-script-query.html.
Imho the correct way of filtering arrays by size using scripting is :
"filter" : {
"script" : {
"script" : "_source.fieldName.size() > 1"
}
}
If I do that as @javanna suggests it throws exception groovy.lang.MissingPropertyException: No such property: length for class: java.lang.String
When you need to find documents which contains some field which size/length should be larger then zero @javanna gave correct answer. I only wanted to add if your field is text field and you want to find documents which contains some text in that field you can't use same query. You will need to do something like this:
GET index/_search
{
"query": {
"bool": {
"must": [
{
"range": {
"FIELD_NAME": {
"gt": 0
}
}
}
]
}
}
}
This is not exact answer to this question because answer already exists but solution for similar problem which I had so maybe somebody will find it useful.
I would have a look at the script filter. The following filter should return only the documents that have at least 10 elements in the fieldname
field, which is an array. Keep in mind that this could be expensive depending on how many documents you have in your index.
"filter" : {
"script" : {
"script" : "doc['fieldname'].values.length > 10"
}
}
Regarding the second question: do you really have an empty array there? Or is it just an array field with no value? You can use the missing filter to get documents which have no value for a specific field:
"filter" : {
"missing" : { "field" : "user" }
}
Otherwise I guess you need to use scripting again, similarly to what I suggested above, just with a different length as input. If the length is constant I'd put it in the params
section so that the script will be cached by elasticsearch and reused, since it's always the same:
"filter" : {
"script" : {
"script" : "doc['fieldname'].values.length > param1"
"params" : {
"param1" : 10
}
}
}
If you have an array of objects that aren't mapped as nested
, keep in mind that Elastic will flatten them into:
attachments: [{size: 123}, {size: 456}] --> attachments.size: [123, 456]
So you want to reference your field as doc['attachments.size'].length
, not doc['attachments'].length
, which is very counter-intuitive.
Same for doc.containsKey(attachments.size)
.
The .values
part is deprecated and no longer needed.