Question
Sometimes I wish the Scala collections included some missing functionality. "Extending" an existing collection with a custom method is rather easy.
It is more difficult when the method has to build the collection from scratch.
Consider useful methods such as .iterate.
I'll demonstrate the use case with a similar, familiar function: unfold.
unfold is a method that constructs a collection from an initial state z: S and a function that returns either an optional tuple of the next state and an element E, or an empty option indicating the end.
The method signature, for some collection type Coll[T], should look roughly like:
def unfold[S,E](z: S)(f: S ⇒ Option[(S,E)]): Coll[E]
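To make the contract concrete, here is a minimal sketch specialised to List. The names UnfoldSemantics and unfoldToList are hypothetical and exist only for this illustration; the digit-splitting example is likewise an assumption, not something from the question.

```scala
object UnfoldSemantics {
  // Hypothetical helper, fixed to List purely to illustrate the contract:
  // f returns Some((nextState, element)) to continue, None to stop.
  def unfoldToList[S, E](z: S)(f: S => Option[(S, E)]): List[E] = {
    val b = List.newBuilder[E]
    var step = f(z)
    while (step.isDefined) {
      val (next, elem) = step.get
      b += elem
      step = f(next)
    }
    b.result()
  }

  // State 1234 emits 4, state 123 emits 3, state 12 emits 2,
  // state 1 emits 1, and state 0 returns None, which ends the collection.
  val digits: List[Int] =
    unfoldToList(1234)(n => if (n == 0) None else Some((n / 10, n % 10)))
  // digits == List(4, 3, 2, 1)
}
```

The state and the element are separate on purpose: the state drives the iteration, while only the elements end up in the result.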
Now, IMO, the most "natural" usage would be something like:
val state: S = ??? // initial state
val arr: Array[E] = Array.unfold(state){ s ⇒
  // code to convert s to some Option[(S,E)]
  ???
}
This is pretty straightforward to do for a specific collection type:
import scala.reflect.ClassTag

implicit class ArrayOps(arrObj: Array.type) {
  def unfold[S,E : ClassTag](z: S)(f: S => Option[(S,E)]): Array[E] = {
    val b = Array.newBuilder[E]
    var s = f(z)
    // keep applying f until it signals the end with None
    while(s.isDefined) {
      val Some((state,element)) = s
      b += element
      s = f(state)
    }
    b.result()
  }
}
With this implicit class in scope, we can generate an array for the Fibonacci sequence like this:
val arr: Array[Int] = Array.unfold(0->1) {
  case (a,b) if a < 256 => Some((b -> (a+b)) -> a)
  case _ => None
}
But if we want to provide this functionality for all other collection types, I see no option other than copy-pasting the code and replacing every occurrence of Array with List, Seq, and so on.
So I tried another approach:
import scala.collection.mutable
import scala.reflect.ClassTag

trait BuilderProvider[Elem,Coll] {
  def builder: mutable.Builder[Elem,Coll]
}
object BuilderProvider {
  object Implicits {
    implicit def arrayBuilderProvider[Elem : ClassTag] = new BuilderProvider[Elem,Array[Elem]] {
      def builder = Array.newBuilder[Elem]
    }
    implicit def listBuilderProvider[Elem : ClassTag] = new BuilderProvider[Elem,List[Elem]] {
      def builder = List.newBuilder[Elem]
    }
    // many more logicless implicits
  }
}
def unfold[Coll,S,E : ClassTag](z: S)(f: S => Option[(S,E)])(implicit bp: BuilderProvider[E,Coll]): Coll = {
  val b = bp.builder
  var s = f(z)
  // keep applying f until it signals the end with None
  while(s.isDefined) {
    val Some((state,element)) = s
    b += element
    s = f(state)
  }
  b.result()
}
Now, with the above in scope, all one needs is an import for the right type:
import BuilderProvider.Implicits.arrayBuilderProvider
val arr: Array[Int] = unfold(0->1) {
  case (a,b) if a < 256 => Some((b -> (a+b)) -> a)
  case _ => None
}
But this doesn't feel right either. I don't like forcing the user to import something, let alone an implicit method that creates a useless wiring class on every method call. Moreover, there is no easy way to override the default logic. Think of collections such as Stream, where it would be most appropriate to create the collection lazily, or of other collections with special implementation details to consider.
The best solution I could come up with was to use the first solution as a template and generate the sources with sbt:
sourceGenerators in Compile += Def.task {
  val file = (sourceManaged in Compile).value / "myextensions" / "util" / "collections" / "package.scala"
  val colls = Seq("Array","List","Seq","Vector","Set") // etc.
  // the generated code uses ClassTag, so the generated file has to import it
  val prefix = s"""package myextensions.util
    |
    |import scala.reflect.ClassTag
    |
    |package object collections {
    |
    """.stripMargin
  val all = colls.map{ coll =>
    s"""
      |implicit class ${coll}Ops[Elem](obj: ${coll}.type) {
      |  def unfold[S,E : ClassTag](z: S)(f: S => Option[(S,E)]): ${coll}[E] = {
      |    val b = ${coll}.newBuilder[E]
      |    var s = f(z)
      |    while(s.isDefined) {
      |      val Some((state,element)) = s
      |      b += element
      |      s = f(state)
      |    }
      |    b.result()
      |  }
      |}
    """.stripMargin
  }
  IO.write(file,all.mkString(prefix,"\n","\n}\n"))
  Seq(file)
}.taskValue
But this solution suffers from other issues and is hard to maintain. Just imagine that unfold is not the only function to add globally. Overriding the default implementation is still hard, too. Bottom line: this does not "feel" right either.
So, is there a better way to achieve this?
Answer 1:
First, let's make a basic implementation of the function that takes an explicit Builder argument. In the case of unfold it can look like this:
import scala.language.higherKinds
import scala.annotation.tailrec
import scala.collection.GenTraversable
import scala.collection.mutable
import scala.collection.generic.{GenericCompanion, CanBuildFrom}
object UnfoldImpl {
  def unfold[CC[_], E, S](builder: mutable.Builder[E, CC[E]])(initial: S)(next: S => Option[(S, E)]): CC[E] = {
    @tailrec
    def build(state: S): CC[E] = {
      next(state) match {
        case None => builder.result()
        case Some((nextState, elem)) =>
          builder += elem
          build(nextState)
      }
    }
    build(initial)
  }
}
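As a quick illustration of what the explicit Builder argument buys, the sketch below calls the same function with two different builders. The function body repeats UnfoldImpl.unfold so the block is self-contained; the object name ExplicitBuilderDemo and the squares step function are assumptions made for this example.

```scala
import scala.annotation.tailrec
import scala.collection.mutable
import scala.language.higherKinds

object ExplicitBuilderDemo {
  // Same shape as UnfoldImpl.unfold, repeated here so the sketch compiles on its own.
  def unfold[CC[_], E, S](builder: mutable.Builder[E, CC[E]])(initial: S)(next: S => Option[(S, E)]): CC[E] = {
    @tailrec
    def build(state: S): CC[E] = next(state) match {
      case None => builder.result()
      case Some((nextState, elem)) =>
        builder += elem
        build(nextState)
    }
    build(initial)
  }

  // Squares of 1 to 10: the state is the counter, the element is its square.
  private val squares: Int => Option[(Int, Int)] =
    a => if (a > 10) None else Some((a + 1, a * a))

  // The target collection is decided solely by the builder that is passed in.
  val asList: List[Int] = unfold(List.newBuilder[Int])(1)(squares)
  val asVector: Vector[Int] = unfold(Vector.newBuilder[Int])(1)(squares)
}
```

Because a Builder is mutable, each call should get a fresh one; reusing a builder across calls would carry elements over from the previous run.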
Now, what would be an easy way to get a builder for a collection from its type?
I can propose two possible solutions. The first is an implicit extension class that extends GenericCompanion, the common superclass of the companion objects of most of Scala's built-in collections. GenericCompanion has a newBuilder method that returns a Builder for the provided element type. An implementation may look like this:
implicit class Unfolder[CC[X] <: GenTraversable[X]](obj: GenericCompanion[CC]) {
  def unfold[S, E](initial: S)(next: S => Option[(S, E)]): CC[E] =
    UnfoldImpl.unfold(obj.newBuilder[E])(initial)(next)
}
And it's very easy to use this:
scala> List.unfold(1)(a => if (a > 10) None else Some(a + 1, a * a))
res1: List[Int] = List(1, 4, 9, 16, 25, 36, 49, 64, 81, 100)
One drawback is that some collections don't have companion objects extending GenericCompanion, for example Array or user-defined collections.
Another possible solution is to use an implicit 'builder provider', as you proposed. Scala already has such a thing in its collection library: CanBuildFrom. An implementation with CanBuildFrom may look like this:
object Unfolder2 {
  def apply[CC[_]] = new {
    def unfold[S, E](initial: S)(next: S => Option[(S, E)])(
      implicit cbf: CanBuildFrom[CC[E], E, CC[E]]
    ): CC[E] =
      UnfoldImpl.unfold(cbf())(initial)(next)
  }
}
Usage example:
scala> Unfolder2[Array].unfold(1)(a => if (a > 10) None else Some(a + 1, a * a))
res1: Array[Int] = Array(1, 4, 9, 16, 25, 36, 49, 64, 81, 100)
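One detail of this pattern: in Scala 2, def apply[CC[_]] = new { ... } gets a structural type, so calling unfold on the result goes through reflection and the compiler asks for scala.language.reflectiveCalls. A possible variation, sketched below, returns a small named class instead. Everything here is hypothetical: NewBuilder is a hand-rolled stand-in for CanBuildFrom (used so the sketch has no version-specific imports), and UnfoldTo and unfoldTo are names invented for this example.

```scala
import scala.collection.mutable
import scala.language.higherKinds

object PartiallyApplied {
  // Hypothetical stand-in for CanBuildFrom: knows how to create a builder for CC.
  trait NewBuilder[CC[_]] { def newBuilder[E]: mutable.Builder[E, CC[E]] }
  object NewBuilder {
    implicit val forList: NewBuilder[List] = new NewBuilder[List] {
      def newBuilder[E]: mutable.Builder[E, List[E]] = List.newBuilder[E]
    }
    implicit val forVector: NewBuilder[Vector] = new NewBuilder[Vector] {
      def newBuilder[E]: mutable.Builder[E, Vector[E]] = Vector.newBuilder[E]
    }
  }

  // A named class rather than `new { ... }`: no structural type, no reflective call.
  final class UnfoldTo[CC[_]] {
    def unfold[S, E](initial: S)(next: S => Option[(S, E)])(implicit nb: NewBuilder[CC]): CC[E] = {
      val b = nb.newBuilder[E]
      var step = next(initial)
      while (step.isDefined) {
        val (state, elem) = step.get
        b += elem
        step = next(state)
      }
      b.result()
    }
  }

  // The caller fixes CC here; S and E are inferred at the unfold call.
  def unfoldTo[CC[_]]: UnfoldTo[CC] = new UnfoldTo[CC]

  val squaresVector: Vector[Int] =
    unfoldTo[Vector].unfold(1)(a => if (a > 10) None else Some((a + 1, a * a)))
  val squaresList: List[Int] =
    unfoldTo[List].unfold(1)(a => if (a > 10) None else Some((a + 1, a * a)))
}
```

The call site keeps the same shape as Unfolder2[Array].unfold(...): only the collection type is spelled out, the rest is inferred.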
This works with Scala's collections and with Array, and may work with user-defined collections if the user has provided a CanBuildFrom instance.
Note that neither approach works lazily with Streams. That's mostly because the underlying implementation, UnfoldImpl.unfold, uses a Builder, which is eager for a Stream.
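The eagerness can be observed by counting how often the step function has run by the time the Stream is handed back. The sketch below is only an illustration under assumed names (EagerVsLazy, eagerUnfold, lazyUnfold, callsBeforeUse); it contrasts a builder-driven loop with a version based on #::.

```scala
import scala.collection.mutable

object EagerVsLazy {
  // Builder-driven unfold: the loop has to reach None before result() is called.
  def eagerUnfold[S, E](b: mutable.Builder[E, Stream[E]])(z: S)(f: S => Option[(S, E)]): Stream[E] = {
    var step = f(z)
    while (step.isDefined) {
      val (next, elem) = step.get
      b += elem
      step = f(next)
    }
    b.result()
  }

  // Cons-driven unfold: the right-hand side of #:: is only evaluated on demand.
  def lazyUnfold[S, E](z: S)(f: S => Option[(S, E)]): Stream[E] =
    f(z) match {
      case None => Stream.empty[E]
      case Some((next, elem)) => elem #:: lazyUnfold(next)(f)
    }

  // Counts the step-function calls made before the Stream is used at all.
  def callsBeforeUse(build: (Int => Option[(Int, Int)]) => Stream[Int]): Int = {
    var calls = 0
    build { a => calls += 1; if (a > 10) None else Some((a + 1, a * a)) }
    calls
  }

  // Ten elements plus the final None: 11 calls up front.
  val eagerCalls: Int = callsBeforeUse(f => eagerUnfold(Stream.newBuilder[Int])(1)(f))
  // Only the head is computed: 1 call.
  val lazyCalls: Int = callsBeforeUse(f => lazyUnfold(1)(f))
}
```

With a step function that never returns None, the builder-driven version would not terminate at all, while the cons-driven one would still return immediately.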
To unfold a Stream lazily, you can't use the standard Builder. You'd have to provide a separate implementation using Stream.cons (or #::). To choose an implementation automatically, depending on the collection type requested by the user, you can use the typeclass pattern. Here is a sample implementation:
trait Unfolder3[E, CC[_]] {
  def unfold[S](initial: S)(next: S => Option[(S, E)]): CC[E]
}
trait UnfolderCbfInstance {
  // provides unfolder for types that have a `CanBuildFrom`
  // this is used only if the collection is not a `Stream`
  implicit def unfolderWithCBF[E, CC[_]](
    implicit cbf: CanBuildFrom[CC[E], E, CC[E]]
  ): Unfolder3[E, CC] =
    new Unfolder3[E, CC] {
      def unfold[S](initial: S)(next: S => Option[(S, E)]): CC[E] =
        UnfoldImpl.unfold(cbf())(initial)(next)
    }
}
object Unfolder3 extends UnfolderCbfInstance {
  // lazy implementation, that overrides `unfolderWithCBF` for `Stream`s
  implicit def streamUnfolder[E]: Unfolder3[E, Stream] =
    new Unfolder3[E, Stream] {
      def unfold[S](initial: S)(next: S => Option[(S, E)]): Stream[E] =
        next(initial).fold(Stream.empty[E]) {
          case (state, elem) =>
            elem #:: unfold(state)(next)
        }
    }
  def apply[CC[_]] = new {
    def unfold[E, S](initial: S)(next: S => Option[(S, E)])(
      implicit impl: Unfolder3[E, CC]
    ): CC[E] = impl.unfold(initial)(next)
  }
}
Now this implementation works eagerly for normal collections (including Array and user-defined collections with an appropriate CanBuildFrom), and lazily for Streams:
scala> Unfolder3[Array].unfold(1)(a => if (a > 10) None else Some(a + 1, a * a))
res0: Array[Int] = Array(1, 4, 9, 16, 25, 36, 49, 64, 81, 100)
scala> Unfolder3[Stream].unfold(1)(a => if (a > 10) None else { println(a); Some(a + 1, a * a) })
1
res2: Stream[Int] = Stream(1, ?)
scala> res2.take(3).toList
2
3
res3: List[Int] = List(1, 4, 9)
Note that if Unfolder3.apply is moved to another object or class, the user won't have to import anything related to Unfolder3 at all.
If you don't understand how this implementation works, read up on the typeclass pattern in Scala and on the order of implicit resolution.
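As a pointer for that reading, here is a minimal sketch of the prioritisation trick on its own, stripped of collections. All names (ImplicitPriority, Describe, LowPriorityDescribe, Opaque) are hypothetical. Both instances are generic and would be equally good candidates for Int, so the choice hinges on where they are declared: an implicit defined in the companion object is preferred over one inherited from a parent trait.

```scala
object ImplicitPriority {
  trait Describe[A] { def text: String }

  trait LowPriorityDescribe {
    // Generic fallback: declared in the parent trait, so it only wins
    // when nothing in the inheriting object applies.
    implicit def fallback[A]: Describe[A] =
      new Describe[A] { def text: String = "generic fallback" }
  }

  object Describe extends LowPriorityDescribe {
    // Preferred whenever an Ordering[A] can be found.
    implicit def ordered[A](implicit ord: Ordering[A]): Describe[A] =
      new Describe[A] { def text: String = "has an Ordering" }

    def apply[A](implicit d: Describe[A]): String = d.text
  }

  final class Opaque // hypothetical type without an Ordering instance

  val forInt: String = Describe[Int]       // "has an Ordering"
  val forOpaque: String = Describe[Opaque] // "generic fallback"
}
```

Unfolder3 uses the same layout: the CanBuildFrom-based instance sits in the parent trait UnfolderCbfInstance, and the Stream-specific one sits in the Unfolder3 object itself.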
Source: https://stackoverflow.com/questions/35682984/generic-collection-generation-with-a-generic-type