Understanding GenericTraversableTemplate and other Scala collection internals

后端 未结 1 1387
感动是毒
感动是毒 2021-02-19 23:43

I was exchanging emails with an acquaintance that is a big Kotlin, Clojure and Java8 fan and asked him why not Scala. He provided many reasons (Scala is too academic, too many f

相关标签:
1条回答
  • 2021-02-20 00:04

    I will try to describe the concepts from the point of view of a random pedestrian (I've never contributed a single line to the Scala collection library, so don't hit me too hard if I'm wrong).

    Since LinkedList is now deprecated, and because Maps provide a better example, I will use TreeMap as example.

    CanBuildFrom

    The motivation is this: If we take a TreeMap[Int, Int] and map it with

    case (x, y) => (2 * x, y * y * 0.3d)
    

    we get TreeMap[Int, Double]. This type safety alone would already explain the necessity for simple genericBuilder[X] constructs. However, if we map it with

    case (x, y) => x
    

    we obtain an Iterable[Int] (more precisely: a List[Int]), this is no longer a Map, the type of the container has changed. This is where CBF's come into play:

    CanBuildFrom[This, X, That]
    

    can be seen as a kind of "type-level function" that tells us: if we map a collection of type This with a function that returns values of type X, we can build a That. The most specific CBF is provided at compile time, in the first case it will be something like

    CanBuildFrom[TreeMap[_,_], (X,Y), TreeMap[X,Y]]
    

    in the second case it will be something like

    CanBuildFrom[TreeMap[_,_], X, Iterable[X]]
    

    and so we always get the right type of the container. The pattern is pretty general. Every time you have a generic function

    foo[X1, ..., Xn](x1: X1, ..., xn: Xn): Y 
    

    where the result type Y depends on X1, ..., Xn, you can introduce an implicit parameter as follows:

    foo[X1, ...., Xn, Y](x1: X1, ..., xn: Xn)(implicit CanFooFrom[X1, ..., Xn, Y]): Y
    

    and then define the type-level function X1, ..., Xn -> Y piecewise by providing multiple implicit CanFooFrom's.

    LinkedListLike

    In the class definition, we see something like this:

    TreeMap[A, B] extends SortedMap[A, B] with SortedMapLike[A, B, TreeMap[A, B]]
    

    This is Scala's way to express the so-called F-bounded polymorphism. The motivation is as follows: Suppose we have a dozen (or at least two...) implementations of the trait SortedMap[A, B]. Now we want to implement a method withoutHead, it could look somewhat like this:

    def withoutHead = this.remove(this.head)
    

    If we move the implementation into SortedMap[A, B] itself, the best we can do is this:

    def withoutHead: SortedMap[A, B] = this.remove(this.head)
    

    But is this the most specific result type we can get? No, that's too vague. We would like to return TreeMap[A, B] if the original map is a TreeMap, and CrazySortedLinkedHashMap (or whatever...) if the original was a CrazySortedLinkedHashMap. This is why we move the implementation into SortedMapLike, and give the following signature to the withoutHead method:

    trait SortedMapLike[A, B, Repr <: SortedMap[A, B]] {
      ...
      def withoutHead: Repr = this.remove(this.head)
    }
    

    now because TreeMap[A, B] extends SortedMapLike[A, B, TreeMap[A, B]], the result type of withoutHead is TreeMap[A,B]. The same holds for CrazySortedLinkedHashMap: we get the exact type back. In Java, you would either have to return SortedMap[A, B] or override the method in each subclass (which turned out to be a maintenance nightmare for the feature-rich traits in Scala)

    GenericTraversableTemplate

    The type is: GenericTraversableTemplate[+A, +CC[X] <: GenTraversable[X]]

    As far as i can tell, this is just a trait that provides implementations of methods that somehow return regular collections with same container type but possibly different content type (stuff like flatten, transpose, unzip).

    Stuff like foldLeft, reduce, exists are not here because these methods care only about content type, not container type.

    Stuff like flatMap is not here, because the container type can change (again, CBF's).

    Why is it a separate trait, is there a fundamental reason why it exists? I don't think so... It probably would be possible to group the godzillion of methods somewhat differently. But this is just what happens naturally: you start to implement a trait, and it turns out that it has very many methods. So instead you group loosely related methods, and put them into 10 different traits with awkward names like "GenTraversableTemplate", and them mix them all into traits/classes where you need them...

    GenericCompanion

    This is just an abstract class that implements some basic functionality which is common for companion objects of most collection classes (essentially, it just implements very simple factory methods apply(varargs) and empty).

    For example there is method apply that takes varargs of some type A and returns a collection of type CC[A]:

    Array(1, 2, 3, 4) // calls Array.apply[A](elems: A*) on the companion object
    List(1, 2, 3, 4) // same for List
    

    The implementation is very simple, it's something like this:

    def apply[A](varargs: A*): CC[A] = {
      val builder = newBuilder[A]
      for (arg <- varargs) builder += arg
      builder.result()
    }
    

    This is obviously the same for Arrays and Lists and TreeMaps and almost everything else, except 'constrained irregular Collections' like Bitset. So this is just common functionality in a common ancestor class of most companion objects. Nothing special about that.

    SeqFactory

    Similar to GenericCompanion, but this time more specifically for Sequences. Adds some common factory methods like fill() and iterate() and tabulate() etc. Again, nothing particularly rocket-scientific here...

    Few general remarks

    In general: I don't think that one should attempt to understand every single trait in this library. Rather, one should try to look at the library as a whole. As a whole, it has a very interesting architecture. And in my personal opinion, it's actually a very aesthetic piece of software, but one has to stare at it for quite a while (and try to re-implement the whole architectural pattern several times) to grasp it. On the other hand: for example CBF's are kind of "design pattern" that clearly should be eliminated in successors of this language. The whole story with the scope of implicit CBF's still seems like a total nightmare to me. But many things seemed completely inscrutable at first, and almost always, it ended with an epiphany (which is very specific for Scala: for the majority of other languages, such struggles usually end with the thought "Author of this is a complete idiot").

    0 讨论(0)
提交回复
热议问题