Neo4j - Storing properties in relationships vs nodes

Submitted by 大兔子大兔子 on 2019-12-05 14:01:46

The correct data model depends on the types of queries you need to make. You should figure out what your queries are, and then determine a data model that meets these criteria:

  1. It allows you to answer all your queries,
  2. It allows your queries to finish sufficiently quickly,
  3. It minimizes the DB storage needed.

In the case of discussion comments, it is likely that you want to query for discussion threads, ordered chronologically. Therefore, you need to store not just the times at which comments are made, but also the relationships between the comments (because a discussion can spawn disjoint threads that do not share the same prior comments).

Let's try a simple test case. Suppose there are 2 disjoint threads spawned by the same initial comment (which we'll call c1): [c1, c3] and [c1, c2, c4]. And suppose, in this simple test case, that we are only interested in querying for all comment threads related to a subject.

If comment properties are stored in nodes, the data might look like:

(u1:User {name: "A"})-[:MADE]->(c1:Comment {time:0, text: "Fee"})-[:ABOUT]->(s1:Subject {title: "Jack"})
(u2:User {name: "B"})-[:MADE]->(c2:Comment {time:1, text: "Fie"})-[:ABOUT]->(c1)
(u3:User {name: "C"})-[:MADE]->(c3:Comment {time:3, text: "Foe"})-[:ABOUT]->(c1)
(u4:User {name: "D"})-[:MADE]->(c4:Comment {time:9, text: "Fum"})-[:ABOUT]->(c2)

If you instead stored the comment properties in relationships, you might try something like the following, but there is a BIG FLAW: a relationship cannot point directly to another relationship (as lines 2 to 4 try to do), because both ends of a relationship must be nodes. Since this model is not legal in Neo4j, it fails to meet any of the criteria above.

(u1:User {name: "A"})-[c1:COMMENTED_ABOUT {time:0, text: "Fee"}]->(s1:Subject {title: "Jack"})
(u2:User {name: "B"})-[c2:COMMENTED_ABOUT {time:1, text: "Fie"}]->(c1)
(u3:User {name: "C"})-[c3:COMMENTED_ABOUT {time:3, text: "Foe"}]->(c1)
(u4:User {name: "D"})-[c4:COMMENTED_ABOUT {time:9, text: "Fum"}]->(c2)

Therefore, in our simple test case, it looks like storing the properties in nodes is the only choice.
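
To make the endpoint restriction concrete, here is a minimal sketch in plain Python. It is a toy in-memory model, not Neo4j's implementation; the `Graph` class and its `add_node` / `add_rel` methods are hypothetical names invented for this illustration. The only rule it enforces is the one described above: a relationship may start and end only at nodes.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """Toy property graph (hypothetical, for illustration only)."""
    nodes: dict = field(default_factory=dict)  # node id -> properties
    rels: list = field(default_factory=list)   # (start, type, end, properties)

    def add_node(self, node_id, **props):
        self.nodes[node_id] = props

    def add_rel(self, start, rel_type, end, **props):
        # Both endpoints must be known node ids; a relationship
        # can never serve as an endpoint.
        for endpoint in (start, end):
            if endpoint not in self.nodes:
                raise ValueError(f"{endpoint!r} is not a node")
        self.rels.append((start, rel_type, end, props))


g = Graph()
g.add_node("u1", name="A")
g.add_node("u2", name="B")
g.add_node("s1", title="Jack")

# Line 1 of the relationship model is fine: user -> subject.
g.add_rel("u1", "COMMENTED_ABOUT", "s1", time=0, text="Fee")

# Line 2 tries to end at c1, which would be a relationship, not a node.
try:
    g.add_rel("u2", "COMMENTED_ABOUT", "c1", time=1, text="Fie")
    rejected = False
except ValueError:
    rejected = True
```

In this sketch only the first comment can be stored, so replies to comments have nowhere to attach unless each comment becomes a node of its own.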

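Here is a query for getting the disjoint thread paths. Each path ends with the user who made the last comment in the thread, since only that comment is matched against MADE (the WHERE clause filters out partial threads by keeping only comments that nobody has replied to):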

MATCH p=(s:Subject)<-[:ABOUT*]-(c:Comment)<-[m:MADE]-(u:User)
WHERE NOT (c)<-[:ABOUT]-()
RETURN p
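
As a rough cross-check of what this query should return for the test case, the same logic can be sketched in plain Python without a database. This is an assumption-laden illustration, not a replacement for the Cypher: the `comments` dictionary and the `complete_threads` function are hypothetical names, the data is the four-comment example from the node model, and a single subject is assumed.

```python
# Node-model test data: each comment records its author, time, text,
# and the target of its ABOUT relationship (a subject or another comment).
comments = {
    "c1": {"user": "A", "time": 0, "text": "Fee", "about": "s1"},
    "c2": {"user": "B", "time": 1, "text": "Fie", "about": "c1"},
    "c3": {"user": "C", "time": 3, "text": "Foe", "about": "c1"},
    "c4": {"user": "D", "time": 9, "text": "Fum", "about": "c2"},
}


def complete_threads(comments):
    """Return every complete thread as a list of comment ids, oldest first."""
    replied_to = {c["about"] for c in comments.values()}
    # Leaves are comments with no incoming ABOUT relationship; this plays
    # the role of WHERE NOT (c)<-[:ABOUT]-() in the query.
    leaves = [cid for cid in comments if cid not in replied_to]
    threads = []
    for leaf in leaves:
        path = []
        current = leaf
        # Follow ABOUT until we leave the comments, i.e. reach the subject.
        while current in comments:
            path.append(current)
            current = comments[current]["about"]
        threads.append(path[::-1])
    return sorted(threads)


threads = complete_threads(comments)
print(threads)
```

Because each reply has a later time than the comment it is about, walking a thread from the subject outward already yields the comments in chronological order.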