Neo4j: label vs. indexed property?

生来不讨喜 2020-12-08 05:22

Suppose you're Twitter, and:

  • You have (:User) and (:Tweet) nodes;
  • Tweets can get flagged; and
  • You want to
1 Answer
  • 2020-12-08 05:25

    UPDATE: A follow-up blog post has been published.

    This is a common question when we model datasets for customers and a typical use case for Active/NonActive entities.

    Here is some feedback based on what I have observed; it is valid for Neo4j 2.1.6:

    Point 1. There is no difference in db accesses between matching on a label and matching on an indexed property when you simply return the nodes.

    Point 2. The difference shows up when such nodes are at the end of a pattern, for example:
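
    To make Point 1 concrete, here is a minimal sketch of the two lookups being compared. It assumes the BlogPost label, the active property and the ActivePost label used in the profiles in this answer; the index statement uses the Neo4j 2.x syntax, which newer releases have replaced. Prefix either MATCH with PROFILE to check the db hits on your own data.

    ```cypher
    // Assumed setup: an index on the property (Neo4j 2.x syntax).
    CREATE INDEX ON :BlogPost(active);

    // Variant A: start from the indexed property.
    MATCH (post:BlogPost)
    WHERE post.active = true
    RETURN post;

    // Variant B: start from a dedicated label.
    MATCH (post:ActivePost)
    RETURN post;
    ```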

    MATCH (n:User {id:1})
    WITH n
    MATCH (n)-[:WRITTEN]->(post:Post)
    WHERE post.published = true
    RETURN n, collect(post) as posts;
    

    -

    neo4j-sh (?)$ PROFILE MATCH (n:User) WHERE n._id = 'c084e0ca-22b6-35f8-a786-c07891f108fc'
    > WITH n
    > MATCH (n)-[:WRITTEN]->(post:BlogPost)
    > WHERE post.active = true
    > RETURN n, size(collect(post)) as posts;
    +-------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | n                                                                                                                                                         | posts |
    +-------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | Node[118]{_id:"c084e0ca-22b6-35f8-a786-c07891f108fc",login:"joy.wiza",password:"7425b990a544ae26ea764a4473c1863253240128",email:"hayes.shaina@yahoo.com"} | 1     |
    +-------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    1 row
    
    ColumnFilter(0)
      |
      +Extract
        |
        +ColumnFilter(1)
          |
          +EagerAggregation
            |
            +Filter
              |
              +SimplePatternMatcher
                |
                +SchemaIndex
    
    +----------------------+------+--------+----------------------+----------------------------------------------------------------------------+
    |             Operator | Rows | DbHits |          Identifiers |                                                                      Other |
    +----------------------+------+--------+----------------------+----------------------------------------------------------------------------+
    |      ColumnFilter(0) |    1 |      0 |                      |                                                      keep columns n, posts |
    |              Extract |    1 |      0 |                      |                                                                      posts |
    |      ColumnFilter(1) |    1 |      0 |                      |                                           keep columns n,   AGGREGATION153 |
    |     EagerAggregation |    1 |      0 |                      |                                                                          n |
    |               Filter |    1 |      3 |                      | (hasLabel(post:BlogPost(1)) AND Property(post,active(8)) == {  AUTOBOOL1}) |
    | SimplePatternMatcher |    1 |     12 | n, post,   UNNAMED84 |                                                                            |
    |          SchemaIndex |    1 |      2 |                 n, n |                                                {  AUTOSTRING0}; :User(_id) |
    +----------------------+------+--------+----------------------+----------------------------------------------------------------------------+
    
    Total database accesses: 17
    

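    In this case, Cypher does not use the index on :Post(published). It expands from the user node and then filters every post it reaches on both its label and its property, as the Filter step in the plan shows.

    Thus using a label is more performant here, for example with an ActivePost label: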

    neo4j-sh (?)$ PROFILE MATCH (n:User) WHERE n._id = 'c084e0ca-22b6-35f8-a786-c07891f108fc'
    > WITH n
    > MATCH (n)-[:WRITTEN]->(post:ActivePost)
    > RETURN n, size(collect(post)) as posts;
    +-------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | n                                                                                                                                                         | posts |
    +-------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | Node[118]{_id:"c084e0ca-22b6-35f8-a786-c07891f108fc",login:"joy.wiza",password:"7425b990a544ae26ea764a4473c1863253240128",email:"hayes.shaina@yahoo.com"} | 1     |
    +-------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    1 row
    
    ColumnFilter(0)
      |
      +Extract
        |
        +ColumnFilter(1)
          |
          +EagerAggregation
            |
            +Filter
              |
              +SimplePatternMatcher
                |
                +SchemaIndex
    
    +----------------------+------+--------+----------------------+----------------------------------+
    |             Operator | Rows | DbHits |          Identifiers |                            Other |
    +----------------------+------+--------+----------------------+----------------------------------+
    |      ColumnFilter(0) |    1 |      0 |                      |            keep columns n, posts |
    |              Extract |    1 |      0 |                      |                            posts |
    |      ColumnFilter(1) |    1 |      0 |                      | keep columns n,   AGGREGATION130 |
    |     EagerAggregation |    1 |      0 |                      |                                n |
    |               Filter |    1 |      1 |                      |     hasLabel(post:ActivePost(2)) |
    | SimplePatternMatcher |    1 |      4 | n, post,   UNNAMED84 |                                  |
    |          SchemaIndex |    1 |      2 |                 n, n |      {  AUTOSTRING0}; :User(_id) |
    +----------------------+------+--------+----------------------+----------------------------------+
    
    Total database accesses: 7
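
    The label only helps if it is kept in sync with the state it represents. A minimal sketch of how that could be done, assuming posts carry an active property and a _id property like the user node does (the post _id value below is a placeholder, not taken from the dataset above):

    ```cypher
    // One-off backfill: label every post that is currently flagged active.
    MATCH (post:BlogPost)
    WHERE post.active = true
    SET post:ActivePost;

    // When a post is deactivated, drop the label in the same statement
    // so the property and the label never disagree.
    MATCH (post:BlogPost {_id: 'some-post-id'})
    SET post.active = false
    REMOVE post:ActivePost;
    ```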
    

    Point 3. Always use labels for the positive case. In the example above, having only a Draft label would force you to execute the following query:

    MATCH (n:User {id:1})
    WITH n
    MATCH (n)-[:POST]->(post:Post)
    WHERE NOT post:Draft
    RETURN n, collect(post) as posts;
    

    This means that Cypher has to open the label header of each node and filter on it.

    Point 4. Avoid needing to match on multiple labels:
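
    One way to follow this advice is to label the positive state when it begins, so that queries can match it directly instead of negating Draft. A hedged sketch, where the id value 42 is purely illustrative and ActivePost is the label from Point 2:

    ```cypher
    // Publishing a draft: add the positive label and drop the negative one.
    MATCH (post:Post {id: 42})
    SET post:ActivePost
    REMOVE post:Draft;

    // Readers then match the positive label; no NOT filter is needed.
    MATCH (n:User {id: 1})-[:POST]->(post:ActivePost)
    RETURN n, collect(post) AS posts;
    ```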

    MATCH (n:User {id:1})
    WITH n
    MATCH (n)-[:POST]->(post:Post:ActivePost)
    RETURN n, collect(post) as posts;
    
    neo4j-sh (?)$ PROFILE MATCH (n:User) WHERE n._id = 'c084e0ca-22b6-35f8-a786-c07891f108fc'
    > WITH n
    > MATCH (n)-[:WRITTEN]->(post:BlogPost:ActivePost)
    > RETURN n, size(collect(post)) as posts;
    +-------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | n                                                                                                                                                         | posts |
    +-------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | Node[118]{_id:"c084e0ca-22b6-35f8-a786-c07891f108fc",login:"joy.wiza",password:"7425b990a544ae26ea764a4473c1863253240128",email:"hayes.shaina@yahoo.com"} | 1     |
    +-------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    1 row
    
    ColumnFilter(0)
      |
      +Extract
        |
        +ColumnFilter(1)
          |
          +EagerAggregation
            |
            +Filter
              |
              +SimplePatternMatcher
                |
                +SchemaIndex
    
    +----------------------+------+--------+----------------------+---------------------------------------------------------------+
    |             Operator | Rows | DbHits |          Identifiers |                                                         Other |
    +----------------------+------+--------+----------------------+---------------------------------------------------------------+
    |      ColumnFilter(0) |    1 |      0 |                      |                                         keep columns n, posts |
    |              Extract |    1 |      0 |                      |                                                         posts |
    |      ColumnFilter(1) |    1 |      0 |                      |                              keep columns n,   AGGREGATION139 |
    |     EagerAggregation |    1 |      0 |                      |                                                             n |
    |               Filter |    1 |      2 |                      | (hasLabel(post:BlogPost(1)) AND hasLabel(post:ActivePost(2))) |
    | SimplePatternMatcher |    1 |      8 | n, post,   UNNAMED84 |                                                               |
    |          SchemaIndex |    1 |      2 |                 n, n |                                   {  AUTOSTRING0}; :User(_id) |
    +----------------------+------+--------+----------------------+---------------------------------------------------------------+
    
    Total database accesses: 12
    

    This results in the same process for Cypher as in Point 3.

    Point 5. If possible, avoid the need to match on labels altogether by using well-named relationship types:

    MATCH (n:User {id:1})
    WITH n
    MATCH (n)-[:PUBLISHED]->(p)
    RETURN n, collect(p) as posts
    

    -

    MATCH (n:User {id:1})
    WITH n
    MATCH (n)-[:DRAFTED]->(post)
    RETURN n, collect(post) as posts;
    
    neo4j-sh (?)$ PROFILE MATCH (n:User) WHERE n._id = 'c084e0ca-22b6-35f8-a786-c07891f108fc'
    > WITH n
    > MATCH (n)-[:DRAFTED]->(post)
    > RETURN n, size(collect(post)) as posts;
    +-------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | n                                                                                                                                                         | posts |
    +-------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | Node[118]{_id:"c084e0ca-22b6-35f8-a786-c07891f108fc",login:"joy.wiza",password:"7425b990a544ae26ea764a4473c1863253240128",email:"hayes.shaina@yahoo.com"} | 3     |
    +-------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    1 row
    
    ColumnFilter(0)
      |
      +Extract
        |
        +ColumnFilter(1)
          |
          +EagerAggregation
            |
            +SimplePatternMatcher
              |
              +SchemaIndex
    
    +----------------------+------+--------+----------------------+----------------------------------+
    |             Operator | Rows | DbHits |          Identifiers |                            Other |
    +----------------------+------+--------+----------------------+----------------------------------+
    |      ColumnFilter(0) |    1 |      0 |                      |            keep columns n, posts |
    |              Extract |    1 |      0 |                      |                            posts |
    |      ColumnFilter(1) |    1 |      0 |                      | keep columns n,   AGGREGATION119 |
    |     EagerAggregation |    1 |      0 |                      |                                n |
    | SimplePatternMatcher |    3 |      0 | n, post,   UNNAMED84 |                                  |
    |          SchemaIndex |    1 |      2 |                 n, n |      {  AUTOSTRING0}; :User(_id) |
    +----------------------+------+--------+----------------------+----------------------------------+
    
    Total database accesses: 2
    

    This will be more performant because it uses the full power of the graph: Cypher just follows the relationships from the user node, which results in no more db accesses than matching the user node itself and no filtering on labels.
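
    With this model, the state change is expressed by swapping the relationship rather than by touching the post node. A minimal sketch, assuming the DRAFTED and PUBLISHED relationship types from the queries above and an illustrative id property on the post:

    ```cypher
    // Publishing: replace the DRAFTED relationship with a PUBLISHED one.
    MATCH (n:User {id: 1})-[r:DRAFTED]->(post)
    WHERE post.id = 42
    CREATE (n)-[:PUBLISHED]->(post)
    DELETE r;
    ```

    The trade-off is that every state needs its own relationship type, so this fits best when the set of states is small and stable.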

    This was my 0,02€
