N1QL Query times out when Using parameterized IN clause

拥有回忆 提交于 2019-12-06 22:40:21

问题


Using Couchbase server 4.1.0 (and 4.5), Java SDK 2.2.8 (also tried with 2.2.7, 2.3.1, & 2.3.3), I have a query leveraging a secondary index which runs fine when I run my code locally and even via CBQ (CBQ takes about 3ms) on the AWS server. However, when running my app on AWS, I get a TimeOutException and it's only one query which is timing out, others are not. See details below.

May be worth noting my Couchbase setup has 3 buckets.

Example Doc:

"bucketName": {
    "userName": "User_A",
    "MessageContent": "This is a message",
    "docType": "msg",
    "ParentMsgId": "1234",
    "MsgType": "test",
    "expireTimestamp": 1454975772613,
    "publishTimestamp": 1455322362028,
    "id": "145826845",
    "urls": [],
    "subject": "this is a subject",
    "type": 1,
    "GroupId": "Group_1"
}

Secondary Index:

CREATE INDEX `indexName` ON `bucketName`(`ParentMsgId`,`docType`,`publishTimestamp`) USING GSI

Example Query extracted from N1qlQuery#n1ql()

{"statement":
"select count(*) as msgCount from bucketName 
where ParentMsgId is not missing and docType = 'msg' 
and ParentMsgId IN $parentId 
and publishTimestamp between $startTime and $endTime
","$endTime":1470726861816,
  "$startTime":1470640461816,
  "$parenIds":["fa11845b-9ea5-4778-95fe-e7206843c69b"]
}

Java Code

public static final String COUNT_STATEMENT = "select count(*) as count " +
            "from bucketName " +
            "where ParentMsgId is not missing " + 
            "and docType = 'msg' " +
            "and ParentMsgId IN $parentIds " + 
            "and publishTimestamp between $startTime and $endTime";

public int getCountForDuration(Long startTime, Long endTime, Collection<String> parentIds){
    List<String> idList = new ArrayList<>(parentIds);
    JsonObject placeHolders = JsonObject.create()
                                        .put("parentIds", JsonArray.from(idList))
                                        .put("startTime", startTime)
                                        .put("endTime", endTime);
    N1qlQuery query = N1qlQuery.parameterized(COUNT_STATEMENT, placeHolders)            
    N1qlQueryResult result = bucket.query(query);
    ...
}

Query Explain result

cbq> explain select count(*) as msgCount from bucketName where ParentMsgId is not missing and docType = 'msg' and ParentMsgId IN ["01b88f7f-4de6-4daa-9562-a2c902e818ad"] and publishTimestamp between 1466445409000 and 1466531809000;
{
    "requestID": "61afcf02-3b3d-4c8a-aec6-b76c4c1f7b17",
    "signature": "json",
    "results": [
        {
            "#operator": "Sequence",
            "~children": [
                {
                    "#operator": "IndexScan",
                    "index": "indexName",
                    "keyspace": "bucketName",
                    "namespace": "default",
                    "spans": [
                        {
                            "Range": {
                                "High": [
                                    "successor(\"01b88f7f-4de6-4daa-9562-a2c902e818ad\")"
                                ],
                                "Inclusion": 1,
                                "Low": [
                                    "\"01b88f7f-4de6-4daa-9562-a2c902e818ad\""
                                ]
                            }
                        }
                    ],
                    "using": "gsi"
                },
                {
                    "#operator": "Parallel",
                    "~child": {
                        "#operator": "Sequence",
                        "~children": [
                            {
                                "#operator": "Fetch",
                                "keyspace": "bucketName",
                                "namespace": "default"
                            },
                            {
                                "#operator": "Filter",
                                "condition": "(((((`bucketName`.`ParentMsgId`) is not missing) and ((`bucketName`.`docType`) = \"msg\")) and ((`bucketName`.`ParentMsgId`) in [\"01b88f7f-4de6-4daa-9562-a2c902e818ad\"])) and ((`bucketName`.`publishTimestamp`) between 1466445409000 and 1466531809000))"
                            },
                            {
                                "#operator": "InitialGroup",
                                "aggregates": [
                                    "count(*)"
                                ],
                                "group_keys": []
                            }
                        ]
                    }
                },
                {
                    "#operator": "IntermediateGroup",
                    "aggregates": [
                        "count(*)"
                    ],
                    "group_keys": []
                },
                {
                    "#operator": "FinalGroup",
                    "aggregates": [
                        "count(*)"
                    ],
                    "group_keys": []
                },
                {
                    "#operator": "Parallel",
                    "~child": {
                        "#operator": "Sequence",
                        "~children": [
                            {
                                "#operator": "InitialProject",
                                "result_terms": [
                                    {
                                        "as": "msgCount",
                                        "expr": "count(*)"
                                    }
                                ]
                            },
                            {
                                "#operator": "FinalProject"
                            }
                        ]
                    }
                }
            ]
        }
    ],
    "status": "success",
    "metrics": {
        "elapsedTime": "2.748194ms",
        "executionTime": "2.660232ms",
        "resultCount": 1,
        "resultSize": 3274
    }
}

Logs

java.lang.Thread.run(Thread.java:745)
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:533)
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:598)
org.eclipse.jetty.server.nio.BlockingChannelConnector$BlockingChannelEndPoint.run(BlockingChannelConnector.java:293)
org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:50)
org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:218)
org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:582)
org.eclipse.jetty.server.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:919)
org.eclipse.jetty.server.HttpConnection.handleRequest(HttpConnection.java:441)
org.eclipse.jetty.server.Server.handle(Server.java:349)
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:110)
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:149)
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:875)
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:186)
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:409)
org.ops4j.pax.web.service.jetty.internal.HttpServiceContext.doHandle(HttpServiceContext.java:117)
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:941)
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:227)
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:483)
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
org.ops4j.pax.web.service.jetty.internal.HttpServiceServletHandler.doHandle(HttpServiceServletHandler.java:70)
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:480)
org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:547)
org.apache.cxf.transport.servlet.AbstractHTTPServlet.service(AbstractHTTPServlet.java:201)
javax.servlet.http.HttpServlet.service(HttpServlet.java:693)
org.apache.cxf.transport.servlet.AbstractHTTPServlet.doGet(AbstractHTTPServlet.java:150)
org.apache.cxf.transport.servlet.AbstractHTTPServlet.handleRequest(AbstractHTTPServlet.java:225)
org.apache.cxf.transport.servlet.CXFNonSpringServlet.invoke(CXFNonSpringServlet.java:130)
org.apache.cxf.transport.servlet.ServletController.invoke(ServletController.java:194)
org.apache.cxf.transport.servlet.ServletController.invokeDestination(ServletController.java:214)
org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:237)
org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121)
org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:262)
org.apache.cxf.interceptor.ServiceInvokerInterceptor.handleMessage(ServiceInvokerInterceptor.java:94)
org.apache.cxf.interceptor.ServiceInvokerInterceptor$1.run(ServiceInvokerInterceptor.java:58)
org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:89)
org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:168)
org.apache.cxf.service.invoker.AbstractInvoker.invoke(AbstractInvoker.java:96)
org.apache.cxf.service.invoker.AbstractInvoker.performInvocation(AbstractInvoker.java:180)
java.lang.reflect.Method.invoke(Method.java:498)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
biz.te2.core.services.beacon.impl.BeaconResource.getVenuesBeaconData(BeaconResource.java:105)
xxx.xxx.xxx.getBeaconHealthForRangeAndVenue(BeaconHealthServiceImpl.java:40)
xxx.xxx.xxx..getAllMessagesCount(BeaconHealthServiceImpl.java:80)
com.sun.proxy.$Proxy146.getMessageCountForDuration(Unknown Source)
org.apache.aries.proxy.impl.ProxyHandler.invoke(ProxyHandler.java:78)
org.apache.aries.proxy.impl.DefaultWrapper.invoke(DefaultWrapper.java:31)
org.apache.aries.proxy.impl.ProxyHandler$1.invoke(ProxyHandler.java:50)
java.lang.reflect.Method.invoke(Method.java:498)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
xxx.xxx.xxx.getMessageCountForDuration(MessageCouchbaseRepo.java:364)
xxx.xxx.xxx.getN1qlQueryRows(MessageCouchbaseRepo.java:372)
com.couchbase.client.java.CouchbaseBucket.query(CouchbaseBucket.java:582)
com.couchbase.client.java.CouchbaseBucket.query(CouchbaseBucket.java:656)
com.couchbase.client.java.util.Blocking.blockForSingle(Blocking.java:74)
java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)

This is an example of a query which works fine, as requested.

public static final String EXPERIENCE_ID_STATEMENT = "Select id " +
            "from read as exp " +
            "where id is not missing " +
            "and docType = 'experience' " +
            "and venueId = $venueId " +
            "and exp.rule.poiIds is not missing " +
            "and any poi in exp.rule.poiIds satisfies poi = $poiId end";

The only thing unique about this query versus the others, is it uses an IN clause and receives the fields via a parameterized JsonArray.

There are no network delays. I don't think this is an issue, as other queries are working and they are essentially chained called one after another (I also tested running this query alone and it still performs extremely slow).

App and CB are both on AWS. I have tested with both on same AWS server and on different servers as well in both cases issue is there. I have a client on AWS and not on AWS, both have the issue. By a client I mean a mechanism that invokes my app. It still gets a timeout when the query is called.

My collectinfo couchbase logs are here s3.amazonaws.com/cb-customers/TE2/


回答1:


I found the issue it has to do with the parameterization of the values in the IN clause. When I removed the parameterization from the query I was able to run as fast as CBQ. My only option was to drop parameterizing.I Attempted to adjust the secondary index a bit moving the ParentMsgId to the end of the fields list this did not help in my case.

The underlying problem diagnosed from couchbase rep.

the underlying issue is that once you put the IN values in a parameter, the optimiser can't assume the IN clause only has one value. Since experienceId leads the index, we can only use an equality to travel down the index and start counting the keys that apply because we would have to skip and scan. I would suggest creating the index as (docType, publishTimestamp, ParentMsgId), that way we can still scan the index on the conditions set on docType, publishTimestamp, and apply the filter on experienceId later on in the pipeline.




回答2:


Can you check/post the EXPLAIN output for this query. Also, check the query/index logs on server (and no networking delays).

running my app from AWS I get a TimeOutException it is only this query which is timing out others are not.

Are both client/app and CB/server in AWS??

And, post the other query which is working in this setup. What is unique about this query that isn't working??

-Prasad



来源:https://stackoverflow.com/questions/38845615/n1ql-query-times-out-when-using-parameterized-in-clause

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!