When should I use UNNEST vs ANY...SATISFIES in N1QL?

Question

I want to query (or index) an array-valued field.

As an example, say I want to retrieve this document { "myarray": [ 1, 2, 3]}.

I can do this with ANY...SATISFIES or with UNNEST. From the documentation, these seem functionally the same.

SELECT * FROM `bucket` WHERE ANY v IN myarray SATISFIES v = 3 END;

SELECT * FROM `bucket` UNNEST myarray v WHERE v=3

What are the use cases for each?

Answer 1:

Those two queries do similar things, but each construct also offers functionality the other does not.

The actual results of the two queries will differ. The first query returns each matching document with its array as is, while the UNNEST query flattens the array into one result row per element.

UNNEST is an intra-document join. SATISFIES lets you (as you've done) check whether an array meets some criteria, but it doesn't transform the array in the results in any way.

Update:

It isn't necessarily a matter of 'which is better'. The two queries do different things. Let's suppose your document looks like this:

{
  "foo": "bar",
  "myarray": [
    1,
    2,
    3
  ]
}

Now let's suppose you remove the WHERE from both of those queries.

Then, running this query:

SELECT d.foo, d.myarray, v
FROM `demo` d
UNNEST d.myarray v

You get 3 results, because a join is taking place. Like this:

[
{"foo":"bar","myarray":[1,2,3],"v":1},
{"foo":"bar","myarray":[1,2,3],"v":2},
{"foo":"bar","myarray":[1,2,3],"v":3}
]

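With the other query (the ANY...SATISFIES predicate lived entirely in the WHERE clause, so removing it leaves a plain SELECT):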

SELECT d.*
FROM `demo` d

You get one result, because there is no join happening. ANY...SATISFIES is an intra-document predicate, not an intra-document join.

[{"foo":"bar","myarray":[1,2,3]}]
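The difference can also be modelled outside Couchbase. What follows is a minimal Python sketch, not the query engine's implementation: any_satisfies and unnest are hypothetical helper names that mimic the semantics of the two constructs over an in-memory list of documents.

```python
# Hypothetical in-memory model of the two N1QL constructs; not Couchbase code.
docs = [{"foo": "bar", "myarray": [1, 2, 3]}]

def any_satisfies(docs, field, predicate):
    """Model of ANY v IN field SATISFIES predicate(v) END: filters documents."""
    return [d for d in docs if any(predicate(v) for v in d.get(field, []))]

def unnest(docs, field, alias):
    """Model of UNNEST field alias: one output row per array element."""
    return [{**d, alias: v} for d in docs for v in d.get(field, [])]

# Predicate: the document is kept whole, at most once.
filtered = any_satisfies(docs, "myarray", lambda v: v == 3)

# Join: the document is repeated once per element, then filtered on v.
joined = unnest(docs, "myarray", "v")
matching = [row for row in joined if row["v"] == 3]

print(len(filtered))      # 1
print(len(joined))        # 3
print(matching[0]["v"])   # 3
```

The predicate version never changes the shape of a document, while the join version adds the alias v to every row it emits.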

As to which one to use: generally speaking, it depends on your use case, and Stack Overflow isn't the place for such laser-specific advice. If you're merely after speed, I would recommend testing both on your real data to see which is more efficient (your sample document probably isn't your real document).

Indexing is also a factor. Again, based only on your sample document, for the SATISFIES query, you'd probably create an index like this:

CREATE INDEX adv_DISTINCT_myarray ON `demo`(DISTINCT `myarray`)

And for the UNNEST query, you'd probably create an index like this:

CREATE INDEX adv_ALL_myarray ON `demo`(ALL `myarray`)

Those indexes assume that all you're doing is checking myarray for a single value. If your real queries are more complex, you'll need a more complex index.

One additional note: I don't know how the two are implemented inside the query engine, so I would defer to Johan's advice that UNNEST is more expensive. But your mileage may vary, so I would recommend trying both and doing some benchmarks.

Answer 2:

The first one is an intra-document predicate, and the results of the query are documents from "bucket". The second one joins each document in "bucket" with the values in its "myarray", so each result of the query is a copy of a document in "bucket" paired with one value from "myarray".

Generally speaking, expect the second option to be much more expensive.
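One way to picture where the extra cost can come from is to count rows. This is a rough sketch in plain Python over hypothetical sample data, assuming the work is roughly proportional to the number of intermediate rows; that is an assumption, and the real engine and the indexes available may change the picture.

```python
# Hypothetical sample data: document "b" contains the value 3 twice.
docs = [
    {"id": "a", "myarray": [1, 2, 3]},
    {"id": "b", "myarray": [3, 3, 4]},
    {"id": "c", "myarray": [5]},
]

# Model of ANY ... SATISFIES v = 3: each document is returned at most once.
any_ids = [d["id"] for d in docs if any(v == 3 for v in d["myarray"])]

# Model of UNNEST: conceptually, every (document, element) pair becomes a row,
# and the WHERE condition is then applied to those rows.
unnest_rows = [(d["id"], v) for d in docs for v in d["myarray"]]
unnest_ids = [doc_id for doc_id, v in unnest_rows if v == 3]

print(any_ids)           # ['a', 'b']
print(len(unnest_rows))  # 7
print(unnest_ids)        # ['a', 'b', 'b']
```

Besides the larger intermediate row count, the join form returns document "b" twice because two of its elements match, so a query that only wants the matching documents would need to de-duplicate the UNNEST output.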

Source: https://stackoverflow.com/questions/55833063/when-should-i-use-unnest-vs-any-satisfies-in-n1ql

Tags

couchbase

n1ql