Search for array values except for a list of positions

回眸只為那壹抹淺笑 submitted on 2021-02-11 14:37:52

Question


I have tens of millions of documents like the following.

{
    id: "<some unit test id>",
    groupName: "<some group name>",
    result: [
        1, 0, 1, 1, ... 1
    ]
}

The result field is an array of 200 numbers, each 0 or 1.

My job is to find, given a groupName (say, "group17") and a few positions (say, 3, 8, 27), all documents in that group whose result array elements are all equal to 1, disregarding the values at positions 3, 8 and 27.

I would appreciate it if someone could point out whether there is a fast way to search for this.


Answer 1:


One way to achieve what you want is to add another field that contains the equivalent integer value of the bitset contained in the result array and then use a bitwise AND operation.

For instance, let's say that the result array is

result: [1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0]

The integer value represented by those bits is 1470, so I store the following document:

PUT test/doc/1
{
    "groupName": "group12",
    "result": [
        1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0
    ],
    "resultLong": "1470"
}
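
The resultLong value has to be computed on the client side before the document is indexed. A minimal Python sketch of that conversion, assuming the first array element is the most significant bit (which is what the value 1470 above implies); the helper name bits_to_decimal_string is hypothetical:

```python
def bits_to_decimal_string(bits):
    """Interpret a list of 0/1 values as a binary number (first element is the
    most significant bit) and return that number as a decimal string."""
    if any(b not in (0, 1) for b in bits):
        raise ValueError("bits must contain only 0 or 1")
    value = 0
    for b in bits:
        # Shift the accumulated value left by one bit and append the next bit.
        value = (value << 1) | b
    return str(value)


result = [1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0]
result_long = bits_to_decimal_string(result)
print(result_long)  # 1470
```

Python integers have arbitrary precision, so the same function also works for 200 elements. A 200-bit value does not fit into a 64-bit long, which is presumably why the field is stored as a string and parsed with BigInteger in the script.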

The query would then look like this:

POST test/_search 
{
  "query": {
    "script": {
      "script": {
        "source": """
        // 1. create a BigInt out of the resultLong value we just computed
        def value = new BigInteger(doc['resultLong'].value.toString());

        // 2. create a bitset filled with 1's except for those positions specified in the ignore parameters array
        def comp = IntStream.range(1, 12).mapToObj(i -> params.ignore.contains(i - 1) ? "0" : "1").collect(Collectors.joining());

        // 3. create a BigInt out of the number we've just created
        def compare = new BigInteger(comp, 2);

        // 4. compare both using a bitwise AND operation
        return value.and(compare).equals(compare);
        """,
        "params": {
          "ignore": [1, 4, 10]
        }
      }
    }
  }
}

Step 2 first creates a string of length 11 containing a 0 at each index listed in the params.ignore array and a 1 everywhere else. We end up with the string "10110111110".

Step 3 then creates a BigInteger out of that string (in base 2).

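Step 4 compares both numbers with a bitwise AND, i.e. the document is only returned if its value has a 1 at every position where the comparison number has a 1. The ignored positions are 0 in the comparison number, so whatever the document holds there does not matter.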

Note: for arrays of length 200, you need to use IntStream.range(1, 201) instead.
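
To sanity-check the logic outside Elasticsearch, steps 2 to 4 can be mirrored in plain Python. This is only a sketch of the same bitwise test, not the Painless script itself; the function names build_mask and matches and the length parameter are hypothetical:

```python
def build_mask(length, ignore):
    """Steps 2 and 3: build a number with 1-bits everywhere except at the
    ignored positions (position 0 is the first array element, i.e. the most
    significant bit)."""
    ignored = set(ignore)
    bit_string = "".join("0" if i in ignored else "1" for i in range(length))
    return int(bit_string, 2)


def matches(result_long, length, ignore):
    """Step 4: True if every non-ignored bit of the stored value is 1."""
    value = int(result_long)
    mask = build_mask(length, ignore)
    return (value & mask) == mask


print(matches("1470", 11, [1, 4, 10]))  # True
print(matches("1470", 11, [1, 4]))      # False: position 10 is 0 and not ignored
```

With length set to 200 the mask is built the same way, which corresponds to using IntStream.range(1, 201) in the script.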



Source: https://stackoverflow.com/questions/54955035/search-for-array-values-except-for-a-list-of-positions
