Will Gremlin graph queries always perform operations in its own address space?

Submitted by 我的未来我决定 on 2019-12-12 01:29:42

Question


Admittedly, most of my database experience is relational. One of the tenets in that space is to avoid moving data over the network. In practice, that means using something like:

select * from person order by last_name limit 10

which will presumably order and limit within the database engine, versus using something like:

select * from person

and subsequently ordering and taking the top 10 at the client, which could have disastrous effects if there are a million person records.
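As a rough illustration of the difference, here is a minimal sketch using Python's built-in sqlite3 module. The person table and its contents are made up for the example, and since SQLite runs in-process, the row counts only stand in for what would cross the network with a client/server database.

```python
import sqlite3

# In-memory database with a hypothetical `person` table (schema assumed for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, last_name TEXT)")
conn.executemany(
    "INSERT INTO person (last_name) VALUES (?)",
    # Insert in reverse order so the table is not already sorted by last_name.
    [(f"name{i:05d}",) for i in reversed(range(10_000))],
)

# Engine-side: the database orders and limits, so only 10 rows reach the client.
engine_rows = conn.execute(
    "SELECT * FROM person ORDER BY last_name LIMIT 10"
).fetchall()

# Client-side: every row is fetched, then sorted and sliced in application code.
all_rows = conn.execute("SELECT * FROM person").fetchall()
client_rows = sorted(all_rows, key=lambda row: row[1])[:10]

# Same answer either way, but very different amounts of data handed to the client.
print(len(engine_rows), len(all_rows))
```

Both approaches return the same ten rows; the only difference is how many rows the client has to receive to get them.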

So, with Gremlin (from Groovy), if I do something like:

g.V().has('@class', 'Person').order{println('!'); it.a.last_name <=> it.b.last_name}[0..9]

I see the ! printed, so I assume this is bringing all Person records into the address space of my client before the order and limit steps, which is not the desired effect.

Do my options for processing queries entirely in the database engine become product specific (e.g. for OrientDB, perhaps submitting the query in their flavor of SQL), or is there something about Gremlin that I am missing?


Answer 1:


If you want the implementer's query optimizer to kick in, you need to use as many Gremlin steps as possible and avoid pure Groovy/in-memory processing of your graph traversals.

You're most likely looking for something like this (as of TinkerPop v3.2.0):

g.V().has('@class', 'Person').order().by('last_name', incr).limit(10)

If you find yourself using lambdas, chances are high that the same thing can be done with pure Gremlin steps. Favor Gremlin steps over lambdas.
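To make the reasoning concrete, here is a toy model in plain Python of why a declarative step can be optimized while a lambda cannot. This is not how TinkerPop or any vendor is implemented; `ToyTraversal`, `execute`, and the `indexes` dictionary are all hypothetical names invented for this sketch. The idea is only that a backend can inspect a recorded step such as "order by last_name" and answer it from an index, whereas an opaque function forces it to load every element first.

```python
class ToyTraversal:
    """Hypothetical stand-in for a traversal: it records steps, it does not run them."""

    def __init__(self):
        self.steps = []

    def order_by(self, key):
        # Declarative step: the backend can see which property is used.
        self.steps.append(("order_by", key))
        return self

    def order_with(self, fn):
        # Opaque step: the backend only sees a callable it cannot inspect.
        self.steps.append(("lambda", fn))
        return self

    def limit(self, n):
        self.steps.append(("limit", n))
        return self


def execute(traversal, rows, indexes):
    """Hypothetical backend. Returns (result, number of rows it had to load)."""
    kinds = [kind for kind, _ in traversal.steps]
    if kinds == ["order_by", "limit"]:
        key, n = traversal.steps[0][1], traversal.steps[1][1]
        if key in indexes:
            # "Optimized" path: read the first n entries from a pre-sorted index.
            result = indexes[key][:n]
            return result, len(result)
    # Fallback path: load everything, then apply each step in memory.
    result = list(rows)
    loaded = len(result)
    for kind, arg in traversal.steps:
        if kind == "order_by":
            result.sort(key=lambda row: row[arg])
        elif kind == "lambda":
            result.sort(key=arg)
        elif kind == "limit":
            result = result[:arg]
    return result, loaded


rows = [{"last_name": f"name{i:03d}"} for i in reversed(range(1000))]
indexes = {"last_name": sorted(rows, key=lambda row: row["last_name"])}

declarative = ToyTraversal().order_by("last_name").limit(10)
opaque = ToyTraversal().order_with(lambda row: row["last_name"]).limit(10)

result_a, loaded_a = execute(declarative, rows, indexes)
result_b, loaded_b = execute(opaque, rows, indexes)
print(loaded_a, loaded_b)
```

Both traversals produce the same ten rows, but only the declarative one lets the toy backend avoid loading the full data set.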

See TinkerPop v3.2.0 documentation:

  • Order By step
  • Limit step


Source: https://stackoverflow.com/questions/37510921/will-gremlin-graph-queries-always-perform-operations-in-its-own-address-space
