In Neo4j, what level of specificity should be used when granularity level can be unlimited?


The hardest thing to wrap my head around when using a graph database is choosing the level of granularity. Let's say I have a graph for things that occur at certain days of the


2 Answers

执笔经年

2021-01-03 01:55

The level of granularity of your data model should be driven by your query requirements, not the other way around. That is, when modeling your database, ask yourself: "What kind of queries will I run over my data?" The answers to this question give you a good starting point for a model with an appropriate level of granularity.

In the book Learning Neo4j by Rik Van Bruggen, the author says the following about designing graph databases for queryability:

Like with any database management system, but perhaps even more so for a graph database management system such as Neo4j, your queries will drive your model. What we mean with this is that, exactly like it was with any type of database that you may have used in the past or would still be using today, you will need to make specific design decisions based on specific trade-offs. Therefore, it follows that there is no one perfect way to model in a graph database such as Neo4j. It will all depend on the questions that you want to ask of the data, and this will drive your design and your model.

So, based on this, the answer to your question "what level of specificity should be used when granularity level can be unlimited?" is: it depends on your query requirements. Think first about the queries you will run, and only then about the data model.

My suggestion is: keep your model as simple as possible in the beginning and, when required, make gradual changes.

遥遥无期

2021-01-03 01:56
First, I recommend thinking about what you want to do with your data. You don't use a graph database just to store data; you also want to do something with it, so you probably have a specific use case, such as pathfinding. In that case there are fewer options, but there are still different ways to model the data. I would look at the algorithms that are already provided and check whether they can handle what I want to do. Say I want to use apoc.algo.aStar because it does what I need. At that point I have limited myself: aStar only handles weights stored on the relationships, and it expects the coordinates to be stored on the nodes. This is probably also the first schema you thought of, but I think you get the idea. If there is no algorithm for your problem, you will have to write one yourself. Look at the options you have; you will often find yourself limited to a certain way of modeling your data.
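To make that constraint concrete, here is a minimal pure-Python A* sketch. It is not APOC's implementation; it only assumes the same data shape described above, with a weight on every relationship and coordinates on every node. The names `coords`, `edges`, and `a_star` and all sample values are hypothetical.

```python
import heapq
import math

# Hypothetical sample data: coordinates live on the nodes...
coords = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (1.0, 1.0), "D": (2.0, 1.0)}
# ...and weights live on the relationships (each stored once).
edges = [("A", "B", 1.0), ("B", "C", 1.0), ("A", "C", 2.5), ("C", "D", 1.0)]


def neighbours(node):
    """Yield (other_node, weight) for every relationship touching node."""
    for a, b, w in edges:
        if a == node:
            yield b, w
        elif b == node:
            yield a, w


def heuristic(node, goal):
    """Straight-line distance, computed from the node coordinates."""
    (x1, y1), (x2, y2) = coords[node], coords[goal]
    return math.hypot(x2 - x1, y2 - y1)


def a_star(start, goal):
    """Return (path, cost) of a cheapest path, or (None, inf) if unreachable."""
    frontier = [(heuristic(start, goal), 0.0, start, [start])]
    best = {start: 0.0}
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, cost
        if cost > best.get(node, math.inf):
            continue  # stale queue entry, a cheaper route was found already
        for nxt, w in neighbours(node):
            new_cost = cost + w
            if new_cost < best.get(nxt, math.inf):
                best[nxt] = new_cost
                priority = new_cost + heuristic(nxt, goal)
                heapq.heappush(frontier, (priority, new_cost, nxt, path + [nxt]))
    return None, math.inf
```

Without coordinates on the nodes there is nothing for `heuristic` to compute, and without weights on the relationships there is no cost to accumulate, which is why choosing this kind of algorithm already fixes part of the model.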

As you already said, the way you model your data also affects how fast you can query certain things. For example, say you model a map with a point A and a point B, and you want to go from A to B and from B to A. The problem in Neo4j is that there is no bidirectional edge: every relationship is stored with a direction. So you might consider adding two edges, one from A to B and one from B to A. Don't do it! Traversal performance will suffer a lot. A relationship can be traversed in either direction, so a single edge is enough.
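The single-edge alternative is to store each connection once and follow it in both directions; in Cypher this corresponds to an undirected pattern such as (a)-[:ROAD]-(b). Below is a rough in-memory Python sketch of that idea. It illustrates the modeling choice only, not Neo4j's storage engine, and the ROAD type, the `roads` list, and the helper names are hypothetical.

```python
from collections import defaultdict

# Each connection is stored exactly once, with an arbitrary direction.
roads = [("A", "B"), ("B", "C")]


def build_adjacency(relationships):
    """Index each stored relationship under both of its endpoints."""
    adjacency = defaultdict(set)
    for start, end in relationships:
        adjacency[start].add(end)  # follow the stored direction
        adjacency[end].add(start)  # follow it backwards, no second edge needed
    return adjacency


def reachable(adjacency, source):
    """Return every node reachable from source, ignoring direction."""
    seen, stack = {source}, [source]
    while stack:
        for nxt in adjacency[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


adjacency = build_adjacency(roads)
```

With only two stored relationships, both `reachable(adjacency, "A")` and `reachable(adjacency, "C")` cover all three points.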
- I can make each day a node (Mon, Tue, Wed, ...), that way, querying for specific days is fast.
- I can make a node called Day, and add the property name with the day of the week. That way, showing all days in a graph is easy to query for.
Ask yourself why you have this database and what you want to do with it, and don't forget about indexing. With the second option you can still create an index to get back some of the lookup performance you had in the first. Also avoid adding redundant data, for example a Day node connected to every weekday: everybody knows Friday is a day. Only consider this if you actually benefit from it. After modeling a few graphs and writing your own algorithms for them, you will get a feeling for it, and at some point you will know how best to design your graphs for specific use cases. Experience is the key to designing graphs: knowing the limitations of the algorithms you can already use and what you can do yourself. I hope this helps.
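As a rough illustration of the generic Day model and of what an index buys back, here is a small in-memory Python sketch. It is an analogy, not how Neo4j stores or indexes data; the node dictionaries, the Day label, and the name values are made-up sample data.

```python
# Generic Day nodes distinguished by a name property, plus one unrelated node.
nodes = [
    {"id": 1, "label": "Day", "name": "Mon"},
    {"id": 2, "label": "Day", "name": "Tue"},
    {"id": 3, "label": "Day", "name": "Fri"},
    {"id": 4, "label": "Event", "name": "Standup"},
]


def all_days(nodes):
    """Listing every day is a single label filter."""
    return [n for n in nodes if n["label"] == "Day"]


def find_day_scan(nodes, name):
    """Without an index, finding one day means checking Day nodes one by one."""
    return next((n for n in all_days(nodes) if n["name"] == name), None)


def build_name_index(nodes):
    """An index on the name property maps each value straight to its node."""
    return {n["name"]: n for n in all_days(nodes)}


def find_day_indexed(index, name):
    """With the index, the lookup is a direct dictionary access."""
    return index.get(name)


day_index = build_name_index(nodes)
```

In Neo4j itself the index would be declared in Cypher rather than built by hand; in recent versions the statement looks something like CREATE INDEX FOR (d:Day) ON (d.name), but check the syntax for the version you run.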