abstracting over a collection

问题

Recently, I wrote an iterator for a cartesian product of Anys, and started with a List of List, but recognized, that I can easily switch to the more abstract trait Seq.

I know, you like to see the code. :)

class Cartesian (val ll: Seq[Seq[_]]) extends Iterator [Seq[_]] {

  def combicount: Int = (1 /: ll) (_ * _.length)

  val last = combicount
  var iter = 0

  override def hasNext (): Boolean = iter < last
  override def next (): Seq[_] = {
    val res = combination (ll, iter)
    iter += 1
    res
  }

  def combination (xx: Seq [Seq[_]], i: Int): List[_] = xx match {
      case Nil     => Nil
      case x :: xs => x (i % x.length) :: combination (xs, i / x.length) 
  }
}

And a client of that class:

object Main extends Application {
  val illi = new Cartesian (List ("abc".toList, "xy".toList, "AB".toList))
  // val ivvi = new Cartesian (Vector (Vector (1, 2, 3), Vector (10, 20)))
  val issi = new Cartesian (Seq (Seq (1, 2, 3), Seq (10, 20)))
  // val iaai = new Cartesian (Array (Array (1, 2, 3), Array (10, 20)))

  (0 to 5).foreach (dummy => println (illi.next ()))
  // (0 to 5).foreach (dummy => println (issi.next ()))
}
/*
List(a, x, A)
List(b, x, A)
List(c, x, A)
List(a, y, A)
List(b, y, A)
List(c, y, A)
*/

The code works well for Seq and Lists (which are Seqs), but of course not for Arrays or Vector, which aren't of type Seq, and don't have a cons-method '::'.

But the logic could be used for such collections too.

I could try to write an implicit conversion to and from Seq for Vector, Array, and such, or try to write an own, similar implementation, or write an Wrapper, which transforms the collection to a Seq of Seq, and calls 'hasNext' and 'next' for the inner collection, and converts the result to an Array, Vector or whatever. (I tried to implement such workarounds, but I have to recognize: it's not that easy. For a real world problem I would probably rewrite the Iterator independently.)

However, the whole thing get's a bit out of control if I have to deal with Arrays of Lists or Lists of Arrays and other mixed cases.

What would be the most elegant way to write the algorithm in the broadest, possible way?

回答1:

There are two solutions. The first is to not require the containers to be a subclass of some generic super class, but to be convertible to one (by using implicit function arguments). If the container is already a subclass of the required type, there's a predefined identity conversion which only returns it.

import collection.mutable.Builder
import collection.TraversableLike
import collection.generic.CanBuildFrom
import collection.mutable.SeqLike

class Cartesian[T, ST[T], TT[S]](val ll: TT[ST[T]])(implicit cbf: CanBuildFrom[Nothing, T, ST[T]], seqLike: ST[T] =>  SeqLike[T, ST[T]], traversableLike: TT[ST[T]] => TraversableLike[ST[T], TT[ST[T]]] ) extends Iterator[ST[T]] {

  def combicount (): Int = (1 /: ll) (_ * _.length)

  val last = combicount - 1 
  var iter = 0

  override def hasNext (): Boolean = iter < last
  override def next (): ST[T] = {
    val res = combination (ll, iter, cbf())
    iter += 1
    res
  }

  def combination (xx: TT[ST[T]], i: Int, builder: Builder[T, ST[T]]): ST[T] = 
    if (xx.isEmpty) builder.result
    else  combination (xx.tail, i / xx.head.length, builder += xx.head (i % xx.head.length) ) 
}

This sort of works:

scala> new Cartesian[String, Vector, Vector](Vector(Vector("a"), Vector("xy"), Vector("AB")))
res0: Cartesian[String,Vector,Vector] = empty iterator

scala> new Cartesian[String, Array, Array](Array(Array("a"), Array("xy"), Array("AB")))
res1: Cartesian[String,Array,Array] = empty iterator

I needed to explicitly pass the types because of bug https://issues.scala-lang.org/browse/SI-3343

One thing to note is that this is better than using existential types, because calling next on the iterator returns the right type, and not Seq[Any].

There are several drawbacks here:

If the container is not a subclass of the required type, it is converted to one, which costs in performance
The algorithm is not completely generic. We need types to be converted to SeqLike or TraversableLike only to use a subset of functionality these types offer. So making a conversion function can be tricky.
What if some capabilities can be interpreted differently in different contexts? For example, a rectangle has two 'length' properties (width and height)

Now for the alternative solution. We note that we don't actually care about the types of collections, just their capabilities:

TT should have foldLeft, get(i: Int) (to get head/tail)
ST should have length, get(i: Int) and a Builder

So we can encode these:

trait HasGet[T, CC[_]]  {
  def get(cc: CC[T], i: Int): T
}

object HasGet {
  implicit def seqLikeHasGet[T, CC[X] <: SeqLike[X, _]] = new HasGet[T, CC] {
    def get(cc: CC[T], i: Int): T = cc(i)
  }

  implicit def arrayHasGet[T] = new HasGet[T, Array] {
    def get(cc: Array[T], i: Int): T = cc(i)
  }
}

trait HasLength[CC] {
  def length(cc: CC): Int 
}

object HasLength {
  implicit def seqLikeHasLength[CC <: SeqLike[_, _]] = new HasLength[CC] {
    def length(cc: CC) = cc.length
  }

  implicit def arrayHasLength[T] = new HasLength[Array[T]] {
    def length(cc: Array[T]) = cc.length
  }

}   

trait HasFold[T, CC[_]] {
  def foldLeft[A](cc: CC[T], zero: A)(op: (A, T) => A): A
}

object HasFold {
  implicit def seqLikeHasFold[T, CC[X] <: SeqLike[X, _]] = new HasFold[T, CC] {
    def foldLeft[A](cc: CC[T], zero: A)(op: (A, T) => A): A = cc.foldLeft(zero)(op)
  }
  implicit def arrayHasFold[T] = new HasFold[T, Array] {
    def foldLeft[A](cc: Array[T], zero: A)(op: (A, T) => A): A =  {
      var i = 0
      var result = zero
      while (i < cc.length) {
        result = op(result, cc(i))
        i += 1
      }
      result
    }
  }
}

(strictly speaking, HasFold is not required since its implementation is in terms of length and get, but i added it here so the algorithm will translate more cleanly)

now the algorithm is:

class Cartesian[T, ST[_], TT[Y]](val ll: TT[ST[T]])(implicit cbf: CanBuildFrom[Nothing, T, ST[T]], stHasLength: HasLength[ST[T]], stHasGet: HasGet[T, ST], ttHasFold: HasFold[ST[T], TT], ttHasGet: HasGet[ST[T], TT], ttHasLength: HasLength[TT[ST[T]]]) extends Iterator[ST[T]] {

  def combicount (): Int = ttHasFold.foldLeft(ll, 1)((a,l) => a * stHasLength.length(l))

  val last = combicount - 1 
  var iter = 0

  override def hasNext (): Boolean = iter < last
  override def next (): ST[T] = {
    val res = combination (ll, 0, iter, cbf())
    iter += 1
    res
  }

  def combination (xx: TT[ST[T]], j: Int,  i: Int, builder: Builder[T, ST[T]]): ST[T] = 
    if (ttHasLength.length(xx) == j) builder.result
    else  {
      val head = ttHasGet.get(xx, j)
      val headLength = stHasLength.length(head)
      combination (xx, j + 1, i / headLength, builder += stHasGet.get(head, (i % headLength) )) 
    }
}

And use:

scala> new Cartesian[String, Vector, List](List(Vector("a"), Vector("xy"), Vector("AB")))
res6: Cartesian[String,Vector,List] = empty iterator

scala> new Cartesian[String, Array, Array](Array(Array("a"), Array("xy"), Array("AB")))
res7: Cartesian[String,Array,Array] = empty iterator

Scalaz probably has all of this predefined for you, unfortunately, I don't know it well.

(again I need to pass the types because inference doesn't infer the right kind)

The benefit is that the algorithm is now completely generic and that there is no need for implicit conversions from Array to WrappedArray in order for it to work

Excercise: define for tuples ;-)

来源：https://stackoverflow.com/questions/6140209/abstracting-over-a-collection

标签

scala

collections

abstraction