Managing hierarchies in SQL: MPTT/nested sets vs adjacency lists vs storing paths

余生分开走 2020-12-24 09:09

For a while now I've been wrestling with how best to handle hierarchies in SQL. Frustrated by the limitations of adjacency lists and the complexity of MPTT/nested sets, I b

2 answers
  • 2020-12-24 09:36

    The problem with your conclusion is that it ignores most of the issues involved in working with trees.

    By reducing the validity of a technique to the "number of calls", you effectively ignore all of the issues that well-understood data structures and algorithms attempt to solve: fast execution and a low memory and resource footprint.

    The "number of calls" to an SQL server may seem like a good metric to use ("look ma, less code"), but if the result is a program which never finishes, runs slowly, or takes up too much space, it is in fact a useless metric.

    By storing the path with every node you are not creating a tree data structure. Instead you are creating a list. Any operation which a tree is designed to optimize is lost.

    This might be hard to see with small data sets (and in many cases of small trees a list is better). Try some examples on data sets of size 500, 1000, and 10k. You will quickly see why storing the whole path is not a good idea.
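One way to try the sizes suggested above is to count how much path data gets stored as the tree grows. This is only a rough sketch under stated assumptions: the tree is a worst-case chain (every node has exactly one child), paths are "/"-separated decimal ids, and `chain_path_cost` is a made-up name for illustration.

```python
def chain_path_cost(n):
    """Return (total_ids, total_chars) stored across all path columns
    for a chain-shaped tree of n nodes with ids 1..n.

    Assumption: each node stores its full path as '/'-separated ids,
    e.g. node 3 stores '1/2/3'.
    """
    total_ids = 0     # ids written across every stored path
    total_chars = 0   # characters written across every stored path
    path_chars = 0    # length of the current node's path string
    for depth, node_id in enumerate(range(1, n + 1), start=1):
        # A node's path is its parent's path plus a separator and its own id.
        path_chars += len(str(node_id)) + (1 if depth > 1 else 0)
        total_ids += depth
        total_chars += path_chars
    return total_ids, total_chars


if __name__ == "__main__":
    for size in (500, 1000, 10_000):
        ids, chars = chain_path_cost(size)
        print(size, ids, chars)
```

For a chain, the number of stored ids grows as n(n+1)/2, so 500 nodes already store 125250 ids. A shallow, bushy tree keeps every path short, so it is worth repeating the measurement with the shape of your real data.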

  • 2020-12-24 09:53

    You might also consider the Closure Table design I describe in my answer to "What is the most efficient/elegant way to parse a flat table into a tree?"

    Calls required to create/delete/move a node:

    • Closure = 1

    Calls required to get a tree:

    • Closure = 1

    Calls required to get path to a node / ancestry:

    • Closure = 1

    Calls required to get number of subnodes:

    • Closure = 1

    Calls required to get depth of node:

    • Closure = 1
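To make the single-call counts above concrete, here is a minimal sketch of a closure table using Python's built-in `sqlite3`. The table and column names (`node`, `closure`, `ancestor`, `descendant`, `depth`) and the helper functions are assumptions for illustration, not taken from the answer. In this sketch, adding a node is one INSERT into `node` plus one INSERT ... SELECT into `closure`; deleting and moving are left out.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not from the answer.
SCHEMA = """
CREATE TABLE node (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE closure (
    ancestor   INTEGER NOT NULL REFERENCES node(id),
    descendant INTEGER NOT NULL REFERENCES node(id),
    depth      INTEGER NOT NULL,  -- path length; 0 for the self row
    PRIMARY KEY (ancestor, descendant)
);
"""


def add_node(conn, node_id, name, parent_id=None):
    conn.execute("INSERT INTO node (id, name) VALUES (?, ?)", (node_id, name))
    # Copy every ancestor row of the parent for the new node, one level
    # deeper, and add the new node's self row. With parent_id=None the
    # first SELECT matches nothing, so only the self row is inserted.
    conn.execute(
        """
        INSERT INTO closure (ancestor, descendant, depth)
        SELECT ancestor, ?, depth + 1 FROM closure WHERE descendant = ?
        UNION ALL
        SELECT ?, ?, 0
        """,
        (node_id, parent_id, node_id, node_id),
    )


def subtree(conn, node_id):
    """Names of the node and all its descendants, nearest first."""
    rows = conn.execute(
        "SELECT n.name FROM closure c JOIN node n ON n.id = c.descendant "
        "WHERE c.ancestor = ? ORDER BY c.depth, n.id",
        (node_id,),
    )
    return [r[0] for r in rows]


def ancestry(conn, node_id):
    """Names on the path from the root down to the node itself."""
    rows = conn.execute(
        "SELECT n.name FROM closure c JOIN node n ON n.id = c.ancestor "
        "WHERE c.descendant = ? ORDER BY c.depth DESC",
        (node_id,),
    )
    return [r[0] for r in rows]


def count_subnodes(conn, node_id):
    return conn.execute(
        "SELECT COUNT(*) FROM closure WHERE ancestor = ? AND depth > 0",
        (node_id,),
    ).fetchone()[0]


def node_depth(conn, node_id):
    # The longest ancestor path ends at the root, so its length is the depth.
    return conn.execute(
        "SELECT MAX(depth) FROM closure WHERE descendant = ?", (node_id,)
    ).fetchone()[0]


def build_demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    demo = [(1, "root", None), (2, "a", 1), (3, "b", 1), (4, "c", 2), (5, "d", 4)]
    for node_id, name, parent_id in demo:
        add_node(conn, node_id, name, parent_id)
    return conn
```

With the demo tree, `subtree(conn, 2)`, `ancestry(conn, 5)`, `count_subnodes(conn, 1)` and `node_depth(conn, 5)` each run exactly one SELECT.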

    DB fields required:

    • Adjacency = 1 more field / row
    • Path = 1 more field / row
    • MPTT = 2 or 3 more fields / row
    • Closure = 2 or 3 fields in extra table. This table has O(n^2) rows worst case but far fewer than that in most practical cases.

    There are a couple of other considerations:

    Supports unlimited depth:

    • Adjacency = yes
    • MPTT = yes
    • Path = no
    • Closure = yes

    Supports referential integrity:

    • Adjacency = yes
    • MPTT = no
    • Path = no
    • Closure = yes
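The referential integrity point can be illustrated with a small sketch, again assuming SQLite via Python's `sqlite3` and hypothetical table names (`node`, `closure`, `path_node`). Each closure row holds real foreign keys, so the database can reject a link to a node that does not exist. A path string is opaque text, so nothing checks the ids inside it.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    # SQLite leaves foreign key enforcement off unless asked.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE node (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE closure (
            ancestor   INTEGER NOT NULL REFERENCES node(id),
            descendant INTEGER NOT NULL REFERENCES node(id),
            PRIMARY KEY (ancestor, descendant)
        );
        -- Hypothetical path-per-node table for comparison.
        CREATE TABLE path_node (id INTEGER PRIMARY KEY, path TEXT NOT NULL);
        INSERT INTO node (id, name) VALUES (1, 'root');
        INSERT INTO closure (ancestor, descendant) VALUES (1, 1);
        """
    )
    return conn


def dangling_closure_row_rejected(conn):
    """Try to link to node 99, which does not exist."""
    try:
        conn.execute("INSERT INTO closure (ancestor, descendant) VALUES (99, 1)")
    except sqlite3.IntegrityError:
        return True
    return False


def dangling_path_accepted(conn):
    """Store a path that mentions the same missing node 99."""
    conn.execute("INSERT INTO path_node (id, path) VALUES (2, '99/2')")
    return conn.execute("SELECT path FROM path_node WHERE id = 2").fetchone()[0]
```

The closure insert fails with a foreign key error, while the path insert succeeds even though node 99 was never created.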

    I also cover Closure Table in my presentation Models for Hierarchical Data with SQL and PHP, and my book, SQL Antipatterns: Avoiding the Pitfalls of Database Programming.
