Generate a DAG from a poset using strictly functional programming

Posted by 試著忘記壹切 on 2019-12-03 10:55:16

After some work, I finally solved my problem, following my initial intuition. The collect method and the rank evaluation were flawed; I rewrote them, with tail recursion as a bonus. Here is the code I obtained:

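import scala.annotation.tailrec

final case class HNode[A](
  v: A,
  child: List[HNode[A]]) {
  // Rank = this node plus the number of distinct nodes reachable from it
  val rank: Int = 1 + count(child, Set.empty)

  // Depth-first walk with an explicit stack; `c` holds the nodes already counted
  @tailrec
  private def count(stack: List[HNode[A]], c: Set[HNode[A]]): Int =
    stack match {
      case Nil => c.size
      case head :: rem =>
        if (c(head)) count(rem, c)
        else count(head.child ::: rem, c + head)
    }
}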
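
To illustrate why rank is computed with a set of visited nodes rather than by summing the children's ranks, here is a small sketch on a diamond-shaped DAG. The object name RankSketch and the example sets are assumptions made for illustration only.

```scala
import scala.annotation.tailrec

object RankSketch {
  final case class HNode[A](v: A, child: List[HNode[A]]) {
    val rank: Int = 1 + count(child, Set.empty)

    // Same traversal as above: `seen` prevents counting a shared node twice
    @tailrec
    private def count(stack: List[HNode[A]], seen: Set[HNode[A]]): Int =
      stack match {
        case Nil => seen.size
        case head :: rest =>
          if (seen(head)) count(rest, seen)
          else count(head.child ::: rest, seen + head)
      }
  }

  // Diamond: {1,2,3} has children {1,2} and {1,3}, which share the child {1}
  val bottom = HNode(Set(1), Nil)
  val left   = HNode(Set(1, 2), List(bottom))
  val right  = HNode(Set(1, 3), List(bottom))
  val top    = HNode(Set(1, 2, 3), List(left, right))
}
```

Here RankSketch.top.rank is 4, because the shared node {1} is counted once. Summing 1 + left.rank + right.rank would give 5.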

// ...

  private def add(v: A, roots: List[HNode[A]]): List[HNode[A]] = {
    val newNode = HNode(v, collect(v, roots, Nil))
    attach(newNode, roots)
  }

  private def attach(n: HNode[A], roots: List[HNode[A]]): List[HNode[A]] =
    if (roots.contains(n)) roots
    else {
      val (supersets, remaining) = roots.partition { r =>
        // Strict superset to avoid creating cycles in case of equal elements.
        // po.lt is used because tryCompare is only guaranteed to return a negative value, not -1.
        po.lt(n.v, r.v)
      }
      if (supersets.isEmpty) n :: remaining.filter(r => !po.lteq(r.v, n.v))
      else {
        supersets.map(s => HNode(s.v, attach(n, s.child))) ::: remaining
      }
    }

  @tailrec
  private def collect(v: A, stack: List[HNode[A]], collected: List[HNode[A]]): List[HNode[A]] =
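    stack match {
      case Nil => collected
      case head :: tail =>
        // head is already covered by a collected node: skip it
        if (collected.exists(c => po.lteq(head.v, c.v))) collect(v, tail, collected)
        // head is below v: keep it and drop the collected nodes it covers
        else if (po.lteq(head.v, v)) collect(v, tail, head :: collected.filter(c => !po.lteq(c.v, head.v)))
        // otherwise look for candidates among head's children
        else collect(v, head.child ::: tail, collected)
    }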
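
To try the three methods on concrete data, here is a minimal self-contained sketch. It assumes a PartialOrdering on Set[Int] based on inclusion. The names PosetDagSketch, Builder and bySubset are hypothetical, the rank field is left out for brevity, and the strict comparison is written with po.lt.

```scala
import scala.annotation.tailrec

object PosetDagSketch {
  final case class HNode[A](v: A, child: List[HNode[A]])

  // Assumed ordering for the example: x <= y iff x is a subset of y
  val bySubset: PartialOrdering[Set[Int]] = new PartialOrdering[Set[Int]] {
    def lteq(x: Set[Int], y: Set[Int]): Boolean = x.subsetOf(y)
    def tryCompare(x: Set[Int], y: Set[Int]): Option[Int] =
      if (x == y) Some(0)
      else if (x.subsetOf(y)) Some(-1)
      else if (y.subsetOf(x)) Some(1)
      else None
  }

  // Hypothetical wrapper whose only job is to supply `po` to the methods
  final class Builder[A](po: PartialOrdering[A]) {
    def add(v: A, roots: List[HNode[A]]): List[HNode[A]] =
      attach(HNode(v, collect(v, roots, Nil)), roots)

    private def attach(n: HNode[A], roots: List[HNode[A]]): List[HNode[A]] =
      if (roots.contains(n)) roots
      else {
        val (supersets, remaining) = roots.partition(r => po.lt(n.v, r.v))
        if (supersets.isEmpty) n :: remaining.filter(r => !po.lteq(r.v, n.v))
        else supersets.map(s => HNode(s.v, attach(n, s.child))) ::: remaining
      }

    @tailrec
    private def collect(v: A, stack: List[HNode[A]], collected: List[HNode[A]]): List[HNode[A]] =
      stack match {
        case Nil => collected
        case head :: tail =>
          if (collected.exists(c => po.lteq(head.v, c.v))) collect(v, tail, collected)
          else if (po.lteq(head.v, v)) collect(v, tail, head :: collected.filter(c => !po.lteq(c.v, head.v)))
          else collect(v, head.child ::: tail, collected)
      }
  }

  val builder = new Builder(bySubset)
  val inputs = List(Set(1), Set(1, 2), Set(1, 2, 3), Set(3))

  // Insert the sets one by one, threading the list of roots through the fold
  val roots: List[HNode[Set[Int]]] =
    inputs.foldLeft(List.empty[HNode[Set[Int]]])((acc, s) => builder.add(s, acc))
}
```

With these inputs the result has a single root {1,2,3} whose children are {3} and {1,2}, and {1,2} in turn has the child {1}.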

I must now check some optimizations:
- cut off branches with totally distinct (disjoint) sets when collecting subsets (as Rex Kerr suggested)
- see if sorting the sets by size improves the process (as mitchus suggested)

The next problem is to work out the worst-case complexity of the add() operation. With n the number of sets and d the size of the largest set, inserting all n sets will probably cost O(n²d), but I hope this can be refined. Here is my reasoning: if no set is included in another (for instance, if all sets are pairwise disjoint), the DAG is reduced to a sequence of nodes that are both roots and leaves. Thus, every time I add a node to the data structure, I still have to check for inclusion against each node already present (in both the collect and attach procedures). Over n insertions this leads to 1 + 2 + … + n = n(n+1)/2 ∈ O(n²) inclusion checks.

Each set inclusion test is O(d), hence the result.
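
The counting argument can be written out as a single bound. The constant c is an assumption introduced here for the number of passes over the existing nodes per insertion (one in collect, a few in attach); it is not a measured value.

```latex
\[
T(n) \;\le\; \sum_{k=1}^{n} c \cdot k \cdot O(d)
      \;=\; c \cdot \frac{n(n+1)}{2} \cdot O(d)
      \;\in\; O(n^{2} d)
\]
```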

Suppose your DAG G contains a node v for each set, with attributes v.s (the set) and v.count (the number of instances of the set), including a node G.root with G.root.s = union of all sets (where G.root.count=0 if this set never occurs in your collection).

Then to count the number of distinct subsets of s you could do the following (in a bastardized mixture of Scala, Python and pseudo-code):

sum(apply(lambda x: x.count, get_subsets(s, G.root)))

where

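get_subsets(s, v) :
   # always descend: a child holds a smaller set, so it may be a subset of s even when v.s is not
   union((if v.s is a subset of s then {v} else {}) ::
         apply(lambda x: get_subsets(s, x), v.children))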
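
For reference, the same idea might look like this in plain Scala. This is only a sketch: the Node type with fields s, count and children mirrors the attributes described above, and the names SubsetCountSketch, getSubsets and countSubsets are hypothetical. It keeps descending into a node's children even when the node's own set is not included in s, since a smaller child set may still be.

```scala
object SubsetCountSketch {
  // Hypothetical node type mirroring v.s, v.count and v.children
  final case class Node(s: Set[Int], count: Int, children: List[Node])

  // All nodes reachable from v whose set is included in s.
  // Returning a Set removes nodes reached through several parents.
  def getSubsets(s: Set[Int], v: Node): Set[Node] = {
    val here: Set[Node] = if (v.s.subsetOf(s)) Set(v) else Set.empty
    v.children.foldLeft(here)((acc, c) => acc ++ getSubsets(s, c))
  }

  // Sum the instance counts; go through an iterator so that equal counts
  // are not collapsed, as they would be when mapping over a Set
  def countSubsets(s: Set[Int], root: Node): Int =
    getSubsets(s, root).iterator.map(_.count).sum

  // Example data (assumed): {1} occurs twice, {1,2} and {1,3} once each
  val n1   = Node(Set(1), 2, Nil)
  val n12  = Node(Set(1, 2), 1, List(n1))
  val n13  = Node(Set(1, 3), 1, List(n1))
  val root = Node(Set(1, 2, 3), 0, List(n12, n13))
}
```

With this data, countSubsets(Set(1, 2), root) gives 3: one instance of {1,2} plus two instances of {1}. Note that the inclusion test is not strict, so s itself is counted if it occurs in the collection.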

In my opinion though, for performance reasons you would be better off abandoning this kind of purely functional solution... it works well on lists and trees, but beyond that the going gets tough.
