Transforming an iterator into an iterator of chunks of duplicates

心不动则不痛 posted on 2021-01-28 13:43:32


Suppose I am writing a function foo: Iterator[A] => Iterator[List[A]] that transforms a given iterator into an iterator of chunks of consecutive duplicates:

def foo[A](it: Iterator[A]): Iterator[List[A]] = ???
foo("abbbcbbe".iterator) // chunks: "a", "bbb", "c", "bb", "e"

To implement foo I want to reuse the function splitDupes: Iterator[A] => (List[A], Iterator[A]), which splits an iterator into a prefix of duplicates and the rest (thanks a lot to Kolmar who suggested it here):

def splitDupes[A](it: Iterator[A]): (List[A], Iterator[A]) = {
  if (it.isEmpty) {
    (Nil, Iterator.empty)
  } else {
    // Take the first element, then collect every following element equal to it.
    val head = it.next()
    val (dupes, rest) = it.span(_ == head)
    (head +: dupes.toList, rest)
  }
}
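As a quick illustration of what splitDupes returns, here is a minimal self-contained sketch. The object name SplitDupesDemo is made up for this example, and the function body is a copy of the one above so that the snippet runs on its own.

```scala
object SplitDupesDemo {
  // Copy of splitDupes so the example is self-contained.
  def splitDupes[A](it: Iterator[A]): (List[A], Iterator[A]) =
    if (it.isEmpty) (Nil, Iterator.empty)
    else {
      val head = it.next()
      val (dupes, rest) = it.span(_ == head)
      // dupes.toList is evaluated before rest is touched; both iterators
      // returned by span read from the same underlying iterator.
      (head +: dupes.toList, rest)
    }

  def main(args: Array[String]): Unit = {
    val (chunk, rest) = splitDupes("bbbcbbe".iterator)
    println(chunk.mkString) // bbb
    println(rest.mkString)  // cbbe
  }
}
```

The first tuple element is the leading run of equal elements, and the second is an iterator over whatever follows it.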

Now I am writing foo using splitDupes like this:

def foo[A](it: Iterator[A]): Iterator[List[A]] = {
   if (it.isEmpty) {
     Iterator.empty
   } else {
     val (xs, ys) = Iterator.iterate(splitDupes(it))(x => splitDupes(x._2)).span(_._2.nonEmpty)
     (if (ys.hasNext) xs ++ Iterator(ys.next()) else xs).map(_._1)
   }
}

This implementation seems to work, but it looks complicated and clumsy.
How would you improve the foo implementation above?


You can do it like this:

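def foo[A](it: Iterator[A]): Iterator[List[A]] = {
  Iterator.iterate(splitDupes(it))(x => splitDupes(x._2))
    .map(_._1)             // keep only the chunk from each (chunk, rest) pair
    .takeWhile(_.nonEmpty) // stop at the first Nil, i.e. when the input is exhausted
}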

The empty case is already handled in splitDupes. You can safely keep calling splitDupes until it hits this empty case (that is, starts returning Nil in the first tuple element).
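The termination argument can be traced in a small self-contained sketch. It assumes the chain ends with .map(_._1).takeWhile(_.nonEmpty), which is one way to express "keep calling splitDupes until the first tuple element is Nil"; the object name ChunkDemo is made up for this example.

```scala
object ChunkDemo {
  def splitDupes[A](it: Iterator[A]): (List[A], Iterator[A]) =
    if (it.isEmpty) (Nil, Iterator.empty)
    else {
      val head = it.next()
      val (dupes, rest) = it.span(_ == head)
      (head +: dupes.toList, rest)
    }

  def foo[A](it: Iterator[A]): Iterator[List[A]] =
    Iterator
      .iterate(splitDupes(it))(x => splitDupes(x._2)) // endless (chunk, rest) pairs
      .map(_._1)                                      // keep only the chunks
      .takeWhile(_.nonEmpty)                          // cut off at the first Nil

  def main(args: Array[String]): Unit = {
    // Without takeWhile the iterator never ends: once the input is
    // exhausted, every further call to splitDupes yields Nil.
    val raw = Iterator
      .iterate(splitDupes("aab".iterator))(x => splitDupes(x._2))
      .map(_._1)
    println(raw.take(4).toList)         // List(List(a, a), List(b), List(), List())
    println(foo("aab".iterator).toList) // List(List(a, a), List(b))
  }
}
```

Because a chunk produced from a non-empty iterator always contains at least its head, a Nil can only mean the input has run out, so cutting at the first Nil loses nothing.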

This works ok in all cases:

scala> foo("abbbcbbe".iterator).map(_.mkString).toList
res1: List[String] = List(a, bbb, c, bb, e)

scala> foo("".iterator).map(_.mkString).toList
res2: List[String] = List()

scala> foo("a".iterator).map(_.mkString).toList
res3: List[String] = List(a)

scala> foo("aaa".iterator).map(_.mkString).toList
res4: List[String] = List(aaa)

scala> foo("abc".iterator).map(_.mkString).toList
res5: List[String] = List(a, b, c)

