Nicely printing/showing a binary tree in Haskell

前端 未结 3 498
故里飘歌
故里飘歌 2021-02-05 07:33

I have a tree data type:

data Tree a b = Branch b (Tree a b) (Tree a b) | Leaf a

...and I need to make it an instance of Show, wit

3条回答
  •  庸人自扰
    2021-02-05 08:23

    Using applicative's link to the Data.Tree source I came up with this. I wanted to write my own so I could learn more about it. The drawTree method in the source is generalized to work with nodes with multiple children; mine is just for binary trees.

    Note: my tree definition is a little different than the OP's. I don't quite understand what the a type parameter is used for, but the approach should still be the same

    data Tree a
        = Branch (Tree a) a (Tree a)
        | Leaf
    
    prettyprint (Leaf)
        = "Empty root."
    -- unlines concats a list with newlines
    prettyprint (Branch left node right) = unlines (prettyprint_helper (Branch left node right n h))
    
    prettyprint_helper (Branch left node right)
        = (show node) : (prettyprint_subtree left right)
            where
                prettyprint_subtree left right =
                    ((pad "+- " "|  ") (prettyprint_helper right))
                        ++ ((pad "`- " "   ") (prettyprint_helper left))
                pad first rest = zipWith (++) (first : repeat rest)
    prettyprint_helper (Leaf)
        = []
    

    Which produces a tree like

    4
    +- 8
    |  +- 9
    |  |  +- 10
    |  `- 6
    |     +- 7
    |     `- 5
    `- 2
       +- 3
       `- 1
    

    I just wanted to explain how the pad function works, since that was the hardest for me to follow (called shift in the source).

    Firstly, zipWith applies a function (first argument) to "join" two lists. zipWith (+) [1, 2, 3] [4, 5, 6] return [5, 7, 9]. It stops when one of the lists is empty. zipWith applied to only one list returns a function that can be applied to zip a second list (I believe this is known as function currying). Here's a simpler version of the pad function:

    > let pad = zipWith (++) (repeat "   ")
    > :type pad
    pad :: [[Char]] -> [[Char]]
    > pad ["1", "2", "3"]
    ["   1", "   2", "   3"]
    

    Notice: 1. One of the lists is infinite (repeat " "), but it zipping stops when one of the lists is empty 2. zipWith only takes a function and a list. pad is then a function that takes a list of list of chars/strings and returns the zipped list of list of chars/strings. So you apply pad on a single list to zip it up with the first one

    Now let's look at

    prettyprint_subtree left right =
        ((pad "+- " "|  ") (prettyprint_helper left))
            ++ ((pad "`- " "   ") (prettyprint_helper right))
    

    (pad "+- " "| ") creates an infinite list like ["+- ", "| ", "| ", "| ", ...]. (prettyprint_helper right) builds the list of lines that represents the subtree on the right, starting with the right's root node. But that whole tree needs to be shifted to the right; we do that by zipping it with the padding. We use an infinite list because we don't know how large the subtree is; there will always be enough "| "s to pad the extra lines (this also works because of lazy evaluation). Note that the first line; i.e. the subtree-root-node, is padded with "+- " instead, the "notation" for a right node.

    The left side is virtually the same. The notation for a left node is "`- ". The only other difference is the padding; " " instead of "| ". So why don't left nodes need the "branches"? Well, you can think of it as the behind the right nodes (padding is appended; on the left) going to the left nodes below. You need the padding behind the right to connect the left node/subtree to the parent node. There is nothing behind the left side of a tree, except possibly the parent tree. Which brings me to my last point; every subtree, represented as a list of lines in the prettyprint_helper function, gets an additional level of padding every parent tree up. I think it's best illustrated with an example.


    In creating the tree above (note, I don't know exactly the execution order, especially with lazy evaluation, but this is just to help visualize why it works):

    Let's say we recurse down to 10. Well the subtree on the left and the subtree on the right is empty, so prettyprint_helper (Branch Leaf 10 Leaf) returns ["10"].

    Now we're up to 9. It's subtree is: "9" : ([] ++ ((pad "+- " "| ") [10])) (no left side), or "9" : ["+- 10"], or:

    9
    +- 10
    

    Now we're up to 8. ((pad "+- " "| ") (prettyprint_helper right)) creates:

    +- 9
    |  +- 10
    

    You can trace it yourself but the left side is:

    6
    +- 7
    `- 5
    

    Which pads to (first element "`- ", rest " "):

    `- 6
       +- 7
       `- 5
    

    So altogether for 8, which is the left side appended to the right side, we have:

    8
    +- 9
    |  +- 10
    `- 6
       +- 7
       `- 5
    

    If we go one step up, this 8 subtree is padded for the 4 tree, and again you can trace through the other side to verify it works. You get

    +- 8
    |  +- 9
    |  |  +- 10
    |  `- 6
    |     +- 7
    |     `- 5
    

    And the final result is as above. Remember, throughout this process the tree is represented as a list of lines. Only at the very end does it get put together with unlines. Perhaps my drawings are misleading because it may look like subtrees are being passed around as multiline strings. Once you understand this, it is very easy to add the extra branch ("|") between the left and right nodes like in Data.Tree's drawTree function. I'll let you figure it out :)

    My apologies if this excessive; it was quite difficult for me to understand from the source as a beginner, and this was a big jump for me. I hope it helps someone else trying to solve this problem.

提交回复
热议问题