Recursive set union: how does it work really?

核能气质少年 submitted on 2019-12-17 23:05:28

Question


I am currently taking the Scala course on Coursera in my free time after work, in an attempt to finally give functional programming a try. I am working on an assignment where we are supposed to "calculate" the union of two sets that contain some object. I am intentionally omitting details, as they are not really important to what I am asking here. What is relevant, however, is that the sets are defined as binary trees, with each node containing an element and two subtrees.

That being the case, the example union in the lecture is as follows:

def union(other: BTSet): BTSet = ((left union right) union other) incl element

Question 1: Quite frankly, even after having read the relevant FAQ and other forum threads, I still don't understand how and why this function works. There is absolutely no "action" in the union implementation besides adding the element at the head node (the incl call); otherwise it simply calls itself over and over again. I would appreciate an explanation.

Question 2: The course forum contains many posts stating that this solution is not efficient at all, and that it is not good enough. Since I don't understand how it works to begin with, I don't really understand why it's not good enough.

PLEASE NOTE that I do not, in any way, ask for a spoiler for the assignment solution. I am more than willing to "do the work for the grade" but I simply don't understand what I am supposed to do here. I don't believe the instructions and guidance provided in the course are adequate to wrap your head around the quirks of functional programming, thus I welcome any comments/answers that address how to think right rather than how to code right.


Answer 1:


  A
 / \  union  D
B   C

((B union C) union D) incl A
  ^^^^^^^^^......................................assume it works

(  B             )
(    \   union D ) incl A
(     C          )

(((0 union C) union D) incl B) incl A
   ^^^^^^^^^.....................................just C

((C union D) incl B) incl A
  ^^^^^^^^^......................................expand

((((0 union 0) union D) incl C) incl B) incl A
    ^^^^^^^^^....................................just 0

(((0 union D) incl C) incl B) incl A
   ^^^^^^^^^.....................................just D

((D incl C) incl B) incl A
^^^^^^^^^^^^^^^^^^^^^^^^^^.......................all incl now

Just write it out step by step. Now you can see that union reduces to a chain of incl calls applied to the right-hand argument.
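To check the trace mechanically, here is a minimal sketch of such a set, assuming Int elements and an ordered binary tree. The names Empty and NonEmpty and the toList helper are illustrative assumptions, not taken from the course; only the union body comes from the question. The Empty case is what the trace writes as "0 union X is just X".

```scala
object UnionTrace {
  // Hypothetical minimal binary-tree set of Ints (names are assumptions).
  sealed trait BTSet {
    def incl(x: Int): BTSet
    def union(other: BTSet): BTSet
    def toList: List[Int]
  }

  case object Empty extends BTSet {
    def incl(x: Int): BTSet = NonEmpty(x, Empty, Empty)
    // Base case: the union of the empty set with `other` is `other`.
    def union(other: BTSet): BTSet = other
    def toList: List[Int] = Nil
  }

  case class NonEmpty(element: Int, left: BTSet, right: BTSet) extends BTSet {
    // Ordered insertion; an element that is already present is not added again.
    def incl(x: Int): BTSet =
      if (x < element) NonEmpty(element, left.incl(x), right)
      else if (x > element) NonEmpty(element, left, right.incl(x))
      else this

    // The union from the lecture, unchanged.
    def union(other: BTSet): BTSet = ((left union right) union other) incl element

    // In-order traversal, so the elements come out sorted.
    def toList: List[Int] = left.toList ::: (element :: right.toList)
  }

  // A = 2 with children B = 1 and C = 3; D = 4 is the other set.
  val abc: BTSet = Empty.incl(2).incl(1).incl(3)
  val d: BTSet = Empty.incl(4)

  val viaUnion: BTSet = abc union d
  // What the trace reduces to: ((D incl C) incl B) incl A.
  val viaIncl: BTSet = d.incl(3).incl(1).incl(2)
}
```

Both values contain the same elements. The two trees need not have the same shape, because the order in which incl is applied at runtime can differ from the simplified trace, so the comparison is best made on toList rather than on the trees themselves.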




Answer 2:


I gather that incl inserts an element into an existing set? If so, that's where all the real work is happening.

The definition of the union is the set that includes everything in either input set. Given two sets stored as binary trees, if you take the unions of the first set with the branches of the second, the only element in either that could be missing from the result is the element at the root node of the second tree, so if you insert that element you have the union of both input sets.

It's just a very inefficient way of inserting each element from both sets into a new set which starts out empty. Presumably duplicates are discarded by incl, so the result is the union of the two inputs.


Maybe it would help to ignore the tree structure for the moment; it's not really important to the essential algorithm. Say we have abstract mathematical sets. Given an input set with unknown elements, we can do two things:

  • Add an element to it (which does nothing if the element was already present)
  • Check whether the set is non-empty and, if so, decompose it into a single element and two disjoint subsets.

To take the union of two sets {1,2} and {2,3}, we start by decomposing the first set into the element 1 and subsets {} and {2}. We recursively take the union of {}, {2}, and {2,3} using the same process, then insert 1 into the result.

At each step, the problem is reduced from one union operation to two union operations on smaller inputs, which is a standard divide-and-conquer algorithm. When the recursion reaches the union of a singleton set {x} and the empty set {}, the union is trivially {x}, which is then returned back up the chain.

The tree structure is just used to both allow the case analysis/decomposition into smaller sets, and to make insertion more efficient. The same could be done using other data structures, such as lists that are split in half for decomposition and with insertion done by an exhaustive check for uniqueness. To take the union efficiently requires an algorithm that's a bit more clever, and takes advantage of the structure used to store the elements.
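The list-based variant mentioned above could look roughly like the following sketch. It assumes sets are represented as duplicate-free lists of Int; the object name ListUnion and both function signatures are made up for illustration.

```scala
object ListUnion {
  // Insertion by exhaustive uniqueness check: a no-op if x is already present.
  def incl(set: List[Int], x: Int): List[Int] =
    if (set.contains(x)) set else x :: set

  // Same recursion as the tree version: split off one element and two
  // disjoint sublists, union those with each other and then with `other`,
  // and finally insert the element that was split off.
  def union(set: List[Int], other: List[Int]): List[Int] = set match {
    case Nil => other // base case: nothing left to add
    case element :: rest =>
      val (left, right) = rest.splitAt(rest.length / 2)
      incl(union(union(left, right), other), element)
  }
}
```

For example, `ListUnion.union(List(1, 2), List(2, 3))` splits the first list into the element 1 and the sublists `Nil` and `List(2)`, exactly as in the {1,2} and {2,3} walkthrough, and the duplicate 2 is dropped by incl.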




Answer 3:


So based on all the responses above, I think the real workhorse is incl and the recursive way of calling union is just for going through all the elements in the sets.

I came up with the following implementation of union. Is this better?

def union(other: BTSet): BTSet = right union (left union (other incl element))
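One way to probe that question is to count how often each variant calls union. The sketch below assumes Int elements and writes both variants as standalone functions over a hypothetical Empty/NonEmpty tree; the names lectureUnion, accUnion, measure and the mutable counter are illustrative only.

```scala
object UnionCount {
  sealed trait BTSet
  case object Empty extends BTSet
  case class NonEmpty(element: Int, left: BTSet, right: BTSet) extends BTSet

  // Ordered insertion that ignores duplicates.
  def incl(s: BTSet, x: Int): BTSet = s match {
    case Empty => NonEmpty(x, Empty, Empty)
    case n @ NonEmpty(e, l, r) =>
      if (x < e) NonEmpty(e, incl(l, x), r)
      else if (x > e) NonEmpty(e, l, incl(r, x))
      else n
  }

  // In-order traversal, so the result is sorted.
  def toList(s: BTSet): List[Int] = s match {
    case Empty => Nil
    case NonEmpty(e, l, r) => toList(l) ::: (e :: toList(r))
  }

  // Counts every invocation of either union variant.
  var calls = 0

  // Lecture version: ((left union right) union other) incl element
  def lectureUnion(s: BTSet, other: BTSet): BTSet = {
    calls += 1
    s match {
      case Empty => other
      case NonEmpty(e, l, r) => incl(lectureUnion(lectureUnion(l, r), other), e)
    }
  }

  // Version from this answer: right union (left union (other incl element))
  def accUnion(s: BTSet, other: BTSet): BTSet = {
    calls += 1
    s match {
      case Empty => other
      case NonEmpty(e, l, r) => accUnion(r, accUnion(l, incl(other, e)))
    }
  }

  // Runs one variant and returns (sorted elements, number of union calls).
  def measure(f: (BTSet, BTSet) => BTSet, a: BTSet, b: BTSet): (List[Int], Int) = {
    calls = 0
    val result = f(a, b)
    (toList(result), calls)
  }

  val a: BTSet = List(4, 2, 6, 1, 3, 5, 7).foldLeft(Empty: BTSet)(incl)
  val b: BTSet = List(8, 9).foldLeft(Empty: BTSet)(incl)
}
```

In this sketch the second variant recurses only into the original left and right subtrees, so it makes 2n + 1 union calls for a tree of n elements. The lecture variant recurses into the freshly built result of left union right as well, which is where the extra calls come from.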



Answer 4:


  2
 / \  union  4
1   3

((1 union 3) union 4) incl 2
  ^^^^^^^^^......................................assume it works

(((E union E) union 3 incl 1) union 4) incl 2
   ^^^^^^^^^.....................................still E

(E union E) union 3 incl 1 = E union 3 incl 1 = 3 incl 1

The following subtree should be 3 incl 1

(  3             ) 
(    \   union 4 ) incl 2
(      1         )


(((1 union E) union 4) incl 3) incl 2
   ^^^^^^^^^.......................................expand

(((( (E union E) union E) incl 1) union 4) incl 3) incl 2
      ^^^^^^^^^^^^^^^^^^^^^^^^^^..................still 1

((1 union 4) incl 3) incl 2
   ^^^^^^^^......................................continue

((((E union E) union 4) incl 1) incl 3) incl 2
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^..........expand 1 union 4

((4 incl 1) incl 3) incl 2
  ^^^^^^^^^^^^^^^^^^^^^^^^^............Final union result 

Thanks to @Rex Kerr for drawing out the steps. I have replaced the second step with what actually happens at runtime, which may describe Scala's union function more clearly.




Answer 5:


You can't understand recursive algorithms unless you look at the base case. In fact, oftentimes, the key to understanding lies in understanding the base case first. Since the base case is not shown (probably because you didn't notice there is one in the first place) there is no understanding possible.




Answer 6:


I'm doing the same course, and the above implementation of union did turn out to be extremely inefficient.

I came up with the following not-so-functional solution to creating a union of binary-tree sets, which is WAY more efficient:

def union(that: BTSet): BTSet = {
  var result:BTSet = this
  that.foreach(element => result = result.incl(element))
  result
}
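The snippet relies on a foreach method whose definition is not shown here. A self-contained sketch, assuming Int elements and a hypothetical Empty/NonEmpty tree with an in-order foreach, might look like this:

```scala
object ForeachUnion {
  sealed trait BTSet {
    def incl(x: Int): BTSet
    def foreach(f: Int => Unit): Unit

    // The union from this answer: insert every element of `that` into `this`.
    def union(that: BTSet): BTSet = {
      var result: BTSet = this
      that.foreach(element => result = result.incl(element))
      result
    }

    // Collects the elements in traversal order (helper added for illustration).
    def toList: List[Int] = {
      val buf = scala.collection.mutable.ListBuffer.empty[Int]
      foreach { x => buf += x; () }
      buf.toList
    }
  }

  case object Empty extends BTSet {
    def incl(x: Int): BTSet = NonEmpty(x, Empty, Empty)
    def foreach(f: Int => Unit): Unit = ()
  }

  case class NonEmpty(element: Int, left: BTSet, right: BTSet) extends BTSet {
    // Ordered insertion that ignores duplicates.
    def incl(x: Int): BTSet =
      if (x < element) NonEmpty(element, left.incl(x), right)
      else if (x > element) NonEmpty(element, left, right.incl(x))
      else this

    // In-order traversal: left subtree, this element, right subtree.
    def foreach(f: Int => Unit): Unit = {
      left.foreach(f)
      f(element)
      right.foreach(f)
    }
  }
}
```

With this shape, union performs exactly one incl per element of `that`, and the only mutable state is the local `result` variable.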


Source: https://stackoverflow.com/questions/16217304/recursive-set-union-how-does-it-work-really
