Question
Say I have a tree like the one below. I would like to obtain the paths to the nodes whose children are all leaves, with no non-leaf child nodes.
So for this tree
root
├──leaf123
├──level_a_node1
│  └──leaf456
├──level_a_node2
│  ├──level_b_node1
│  │  └──leaf987
│  └──level_b_node2
│     └──level_c_node1
│        └──leaf654
├──leaf789
└──level_a_node3
   └──leaf432
The result would be
[["root" "level_a_node1"]
["root" "level_a_node2" "level_b_node1"]
["root" "level_a_node2" "level_b_node2" "level_c_node1"]
["root" "level_a_node3"]]
I've attempted to go down to the bottom nodes and check that the (lefts) and the (rights) are not branches, but that doesn't quite work.
(z/vector-zip ["root"
               ["level_a_node3" ["leaf432"]]
               ["level_a_node2"
                ["level_b_node2" ["level_c_node1" ["leaf654"]]]
                ["level_b_node1" ["leaf987"]]
                ["leaf789"]]
               ["level_a_node1" ["leaf456"]]
               ["leaf123"]])
Edit: my data actually comes in as a list of paths, and I'm converting that into a tree. But maybe that is an overcomplication?
[["root" "leaf"]
["root" "level_a_node1" "leaf"]
["root" "level_a_node2" "leaf"]
["root" "level_a_node2" "level_b_node1" "leaf"]
["root" "level_a_node2" "level_b_node2" "level_c_node1" "leaf"]
["root" "level_a_node3" "leaf"]]
Answer 1:
Hiccup-style structures are a nice place to visit, but I wouldn't want to live there. That is, they're very succinct to write, but a giant pain to manipulate programmatically, because the semantic nesting structure is not reflected in the physical structure of the nodes. So, the first thing I would do is convert to Enlive-style tree representation (or, ideally, generate Enlive to begin with):
(def hiccup
  ["root"
   ["level_a_node3" ["leaf432"]]
   ["level_a_node2"
    ["level_b_node2"
     ["level_c_node1"
      ["leaf654"]]]
    ["level_b_node1"
     ["leaf987"]]
    ["leaf789"]]
   ["level_a_node1"
    ["leaf456"]]
   ["leaf123"]])
(defn hiccup->enlive [x]
  (when (vector? x)
    {:tag     (first x)
     :content (map hiccup->enlive (rest x))}))
(def enlive (hiccup->enlive hiccup))
;; Yielding...
{:tag "root",
:content
({:tag "level_a_node3", :content ({:tag "leaf432", :content ()})}
{:tag "level_a_node2",
:content
({:tag "level_b_node2",
:content
({:tag "level_c_node1",
:content ({:tag "leaf654", :content ()})})}
{:tag "level_b_node1", :content ({:tag "leaf987", :content ()})}
{:tag "leaf789", :content ()})}
{:tag "level_a_node1", :content ({:tag "leaf456", :content ()})}
{:tag "leaf123", :content ()})}
Having done this, the last thing getting in your way is your desire to use zippers. They are a good tool for targeted traversals, where you care a lot about the structure near the node you are working on. But if all you care about is the node and its children, it is much easier to just write a simple recursive function to traverse the tree:
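(defn paths-to-leaves
  "Returns the paths (seqs of tags) to every node whose children are all leaves."
  [{:keys [tag content]}]
  (when (seq content)                             ; a leaf contributes no paths
    (if (every? #(empty? (:content %)) content)   ; all children are leaves
      [(list tag)]
      (for [child content
            path  (paths-to-leaves child)]
        (cons tag path)))))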
The ability to write recursive traversals like this is a skill that will serve you many times throughout your Clojure career (for example, a similar question I recently answered on Code Review). It turns out that a huge number of functions on trees are just: call yourself recursively on each child, and somehow combine the results, usually in a possibly-nested for loop. The hard part is just figuring out what your base case needs to be, and the correct sequence of maps/mapcats to combine the results without introducing undesired levels of nesting.
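To illustrate that "recurse on each child, then combine" shape on a second problem, here is a minimal sketch that lists the path to every leaf of an Enlive-style tree. The names paths-to-every-leaf and small-tree are hypothetical and not part of the original answer; only the base case differs from the function above.

```clojure
;; Hypothetical example: same recursive shape, different base case.
(defn paths-to-every-leaf
  "Returns one path (a seq of tags) per leaf of an Enlive-style node."
  [{:keys [tag content]}]
  (if (empty? content)
    [(list tag)]                               ; base case: a leaf is its own path
    (for [child content
          path  (paths-to-every-leaf child)]   ; recurse on each child
      (cons tag path))))                       ; combine: prepend this node's tag

;; A tiny tree made up for illustration.
(def small-tree
  {:tag "a"
   :content [{:tag "b" :content []}
             {:tag "c" :content [{:tag "d" :content []}]}]})

(paths-to-every-leaf small-tree)
;; => (("a" "b") ("a" "c" "d"))
```

The nested for flattens exactly one level per recursion step, which is what keeps the result a flat sequence of paths rather than a nested one.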
If you insist on sticking with Hiccup, you can de-mangle it at the use site without too much pain:
(defn hiccup-paths-to-leaves [node]
  (when (vector? node)
    (let [tag     (first node)
          content (next node)]                               ; nil when the node has no children
      (if (and content (every? #(= 1 (count %)) content))    ; every child is a bare [tag]
        [(list tag)]
        (for [child content
              path  (hiccup-paths-to-leaves child)]
          (cons tag path))))))
But it's noticeably messier, and is work you'll have to repeat every time you work with a tree. Again I encourage you to use Enlive-style trees for your internal data representation.
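Regarding the edit in the question: if the data already arrives as a list of leaf paths, building a tree may not be necessary at all. The following is only a sketch under the assumption that every input path ends in a leaf (no empty branches); the names strict-prefix? and leaf-only-parents are hypothetical. A parent path qualifies when no other parent path extends it, because a longer parent path would mean one of its children is itself a branch.

```clojure
;; Hypothetical helpers; assumes every path in the input ends in a leaf.
(defn strict-prefix?
  "True when vector p is a proper prefix of vector q."
  [p q]
  (and (< (count p) (count q))
       (= p (subvec q 0 (count p)))))

(defn leaf-only-parents
  "Given leaf paths, returns the parent paths whose children are all leaves."
  [paths]
  (let [parents (distinct (map (comp vec butlast) paths))]
    (filterv (fn [p] (not-any? #(strict-prefix? p %) parents))
             parents)))

(leaf-only-parents
 [["root" "leaf"]
  ["root" "level_a_node1" "leaf"]
  ["root" "level_a_node2" "leaf"]
  ["root" "level_a_node2" "level_b_node1" "leaf"]
  ["root" "level_a_node2" "level_b_node2" "level_c_node1" "leaf"]
  ["root" "level_a_node3" "leaf"]])
;; => [["root" "level_a_node1"]
;;     ["root" "level_a_node2" "level_b_node1"]
;;     ["root" "level_a_node2" "level_b_node2" "level_c_node1"]
;;     ["root" "level_a_node3"]]
```

This compares every parent against every other parent, so it is quadratic in the number of distinct parents; for large inputs the tree-based traversal is likely the better choice.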
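Answer 2:
You can definitely use the file API to navigate the directory. If you are using a zipper, you can do this (the snippets assume the clojure.zip functions vector-zip, end?, next, down, right, path and node are referred into the current namespace, and that the helper contains-leaves-only?, defined further down, is loaded first):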
(loop [loc (vector-zip ["root"
                        ["level_a_node3"
                         ["leaf432"]]
                        ["level_a_node2"
                         ["level_b_node2"
                          ["level_c_node1"
                           ["leaf654"]]]
                         ["level_b_node1"
                          ["leaf987"]]
                         ["leaf789"]]
                        ["level_a_node1"
                         ["leaf456" "leaf456b"]]
                        ["leaf123"]])
       ans nil]
  (if (end? loc)
    ans
    (recur (next loc)
           (cond->> ans
             (contains-leaves-only? loc)
             ;; path returns the ancestor vectors; calling node on a vector
             ;; looks up index 0, which is that branch's name
             (cons (->> loc down path (map node)))))))
which will output this:
(("root" "level_a_node1")
("root" "level_a_node2" "level_b_node1")
("root" "level_a_node2" "level_b_node2" "level_c_node1")
("root" "level_a_node3"))
With the tree defined this way, the helper functions can be implemented as:
(def is-leaf? #(-> % down nil?))
(defn contains-leaves-only?
  [loc]
  (some->> loc
           down                   ;; branch name
           right                  ;; children list
           down                   ;; first child
           (iterate right)        ;; with the other siblings
           (take-while identity)
           (every? is-leaf?)))
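To make the navigation steps in the helper easier to follow, here is a small sketch that walks a single branch by hand. It uses the namespace alias z for clojure.zip rather than referred names, and the var names (loc, tag-loc, children-loc, first-child, sibling-nodes) are made up for this illustration.

```clojure
(require '[clojure.zip :as z])

;; One branch in the same layout as the tree above.
(def loc (z/vector-zip ["level_a_node1" ["leaf456" "leaf456b"]]))

(def tag-loc      (z/down loc))            ; first element: the branch name
(def children-loc (z/right tag-loc))       ; second element: the nested vector
(def first-child  (z/down children-loc))   ; first entry of the nested vector

(z/node tag-loc)        ;; => "level_a_node1"
(z/node children-loc)   ;; => ["leaf456" "leaf456b"]
(z/node first-child)    ;; => "leaf456"

;; In a vector-zip only vectors are branches, so z/down on a string is nil.
;; This is the property is-leaf? relies on.
(z/down first-child)    ;; => nil

;; Walking right until z/right returns nil visits every sibling.
(def sibling-nodes
  (->> first-child
       (iterate z/right)
       (take-while some?)
       (map z/node)))
;; sibling-nodes => ("leaf456" "leaf456b")
```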
UPDATE - add a lazy sequence version
(->> ["root"
      ["level_a_node3"
       ["leaf432"]]
      ["level_a_node2"
       ["level_b_node2"
        ["level_c_node1"
         ["leaf654"]]]
       ["level_b_node1"
        ["leaf987"]]
       ["leaf789"]]
      ["level_a_node1"
       ["leaf456" "leaf456b"]]
      ["leaf123"]]
     vector-zip
     (iterate next)
     (take-while (complement end?))
     (filter contains-leaves-only?)
     (map #(->> % down path (map node))))
Answer 3:
It is because zippers have so many limitations that I created the Tupelo Forest library for processing tree-like data structures. Your problem then has a simple solution:
(ns tst.tupelo.forest-examples
  (:use tupelo.core tupelo.forest tupelo.test))
(with-forest (new-forest)
  (let [data ["root"
              ["level_a_node3" ["leaf"]]
              ["level_a_node2"
               ["level_b_node2"
                ["level_c_node1"
                 ["leaf"]]]
               ["level_b_node1" ["leaf"]]]
              ["level_a_node1" ["leaf"]]
              ["leaf"]]
        root-hid   (add-tree-hiccup data)
        leaf-paths (find-paths-with root-hid [:** :*] leaf-path?)]
with a tree that looks like:
(hid->bush root-hid) =>
[{:tag "root"}
[{:tag "level_a_node3"}
[{:tag "leaf"}]]
[{:tag "level_a_node2"}
[{:tag "level_b_node2"}
[{:tag "level_c_node1"}
[{:tag "leaf"}]]]
[{:tag "level_b_node1"}
[{:tag "leaf"}]]]
[{:tag "level_a_node1"}
[{:tag "leaf"}]]
[{:tag "leaf"}]])
and a result like:
(format-paths leaf-paths) =>
[[{:tag "root"} [{:tag "level_a_node3"} [{:tag "leaf"}]]]
[{:tag "root"} [{:tag "level_a_node2"} [{:tag "level_b_node2"} [{:tag "level_c_node1"} [{:tag "leaf"}]]]]]
[{:tag "root"} [{:tag "level_a_node2"} [{:tag "level_b_node1"} [{:tag "leaf"}]]]]
[{:tag "root"} [{:tag "level_a_node1"} [{:tag "leaf"}]]]
[{:tag "root"} [{:tag "leaf"}]]]))))
There are many choices after this depending on the next steps in the processing chain.
Source: https://stackoverflow.com/questions/56030511/how-to-obtain-paths-to-all-the-child-nodes-in-a-tree-that-only-have-leaves-using