I was browsing around and found a question about grouping a String
by it\'s characters, such as this:
The input:
\"aaabbbccccdd\"
First, let's see what happens when you iterate over a String:
scala> "asdf".toList
res1: List[Char] = List(a, s, d, f)
Next, consider that sometimes we want to group elements on the basis of some specific attribute of an object.
For instance, we might group a list of strings by length as in...
List("aa", "bbb", "bb", "bbb").groupBy(_.length)
What if you just wanted to group each item by the item itself. You could pass in the identity function like this:
List("aa", "bbb", "bb", "bbb").groupBy(identity)
You could do something silly like this, but it would be silly:
List("aa", "bbb", "bb", "bbb").groupBy(_.toString)
Take a look at
str.groupBy(identity)
which returns
scala.collection.immutable.Map[Char,String] = Map(b -> bbb, d -> dd, a -> aaa, c -> cccc)
so the key by which the elements are grouped by is the character.
Basically list.groupBy(identity) is just a fancy way of saying list.groupBy(x => x), which in my opinion is clearer. It groups a list containing duplicate items by those items.
To understand this just call scala repl with -Xprint:typer
option:
val res2: immutable.Map[Char,String] = augmentString(str).groupBy[Char]({
((x: Char) => identity[Char](x))
});
Scalac converts a simple String
into StringOps
with is a subclass of TraversableLike
which has a groupBy
method:
def groupBy[K](f: A => K): immutable.Map[K, Repr] = {
val m = mutable.Map.empty[K, Builder[A, Repr]]
for (elem <- this) {
val key = f(elem)
val bldr = m.getOrElseUpdate(key, newBuilder)
bldr += elem
}
val b = immutable.Map.newBuilder[K, Repr]
for ((k, v) <- m)
b += ((k, v.result))
b.result
}
So groupBy contains a map into which inserts chars return by identity function.
This is your expression:
val list = str.groupBy(identity).toList.sortBy(_._1).map(_._2)
Let's go item by function by function. The first one is groupBy, which will partition your String using the list of keys passed by the discriminator function, which in your case is identity. The discriminator function will be applied to each character in the screen and all characters that return the same result will be grouped together. If we want to separate the letter a from the rest we could use x => x == 'a'
as our discriminator function. That would group your string chars into the return of this function (true or false) in map:
Map(false -> bbbccccdd, true -> aaa)
By using identity
, which is a "nice" way to say x => x
, we get a map where each character gets separated in map, in your case:
Map(c -> cccc, a -> aaa, d -> dd, b -> bbb)
Then we convert the map to a list of tuples (char,String)
with toList
.
Order it by char with sortBy
and just keep the String with the map
getting your final result.
Whenever you try to use methods such as groupBy
on the String. It's important to note that it is implicitly converted to StringOps and not List[Char].
The signature of groupBy
is given by-
def groupBy[K](f: (Char) ⇒ K): Map[K, String]
Hence, the result is in the form -
Map[Char,String]
The signature of groupBy
is given by-
def groupBy[K](f: (Char) ⇒ K): Map[K, List[Char]]
If it had been implicitly converted to List[Char] the result would be of the form -
Map[Char,List[Char]]
Now this should implicitly answer your curious question, as how scala figured out to groupBy
on Char
(see the signature) and yet give you Map[Char, String]
.