Is it always more performant to use withFilter instead of filter when subsequently applying functions like map, flatMap, etc.?
Why are only map, flatmap and fore
As a workaround, you can implement other functions using only map and flatMap.
Moreover, this optimisation is useless on small collections…
Using for/yield can be a workaround, for example:
// assuming col is a collection of Scala sequences
for {
  e <- col
  if e.nonEmpty
} yield e.head
For the forall/exists part:
someList.filter(conditionA).forall(conditionB)
would be the same as (though a little unintuitive)
!someList.exists(x => conditionA(x) && !conditionB(x))
Similarly, .filter().exists() can be combined into a single exists() check.
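To make the two rewrites concrete, here is a minimal sketch. The list contents and the two predicates are placeholders picked for illustration, not taken from the question:

```scala
// Hypothetical sample data and predicates, for illustration only.
val someList = List(1, 2, 3, 4, 5, 6)
val conditionA: Int => Boolean = _ % 2 == 0   // keep even numbers
val conditionB: Int => Boolean = _ > 1        // property to check

// filter + forall, and the single-pass equivalent built on exists
val forallViaFilter = someList.filter(conditionA).forall(conditionB)
val forallViaExists = !someList.exists(x => conditionA(x) && !conditionB(x))

// filter + exists, and the single-pass equivalent
val existsViaFilter = someList.filter(conditionA).exists(conditionB)
val existsCombined  = someList.exists(x => conditionA(x) && conditionB(x))
```

Both combined forms avoid building the intermediate filtered list and stop at the first element that settles the answer.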
From the Scala docs:
Note: the difference between c filter p and c withFilter p is that the former creates a new collection, whereas the latter only restricts the domain of subsequent map, flatMap, foreach, and withFilter operations.
So filter will take the original collection and produce a new collection, but withFilter will non-strictly (i.e. lazily) pass only the values that satisfy the predicate through to later map/flatMap/withFilter calls, saving a second pass through the (filtered) collection. Hence it will be more efficient when passing through to these subsequent method calls.
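The laziness can be observed with a predicate that records which elements it has tested. This is only a sketch; the names tested, xs and restricted are made up for the example:

```scala
import scala.collection.mutable.ListBuffer

// Records every element the predicate is asked about (illustrative helper state).
val tested = ListBuffer.empty[Int]
val xs = List(1, 2, 3, 4)

// withFilter only stores the predicate; nothing is tested at this point.
val restricted = xs.withFilter { x => tested += x; x % 2 == 0 }
val testedBeforeMap = tested.toList

// The predicate runs while map traverses the list, in the same pass.
val doubled = restricted.map(_ * 2)
val testedAfterMap = tested.toList
```

With filter in place of withFilter, the predicate would already have run over every element before map was called, and an intermediate list would have been built.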
In fact, withFilter is specifically designed for working with chains of these methods, which is what a for comprehension is de-sugared into. No other methods (such as forall/exists) are required for this, so they have not been added to the FilterMonadic return type of withFilter.
In addition to Shadowlands' excellent answer, I would like to give an intuitive example of the difference between filter and withFilter.
Let's consider the following code:
val list = List(1, 2, 3)
var go = true
val result = for (i <- list; if go) yield {
  go = false
  i
}
Most people expect result to be equal to List(1). This is the case since Scala 2.8, because the for-comprehension is translated into:
val result = list withFilter {
  case i => go
} map {
  case i => {
    go = false
    i
  }
}
As you can see, the translation converts the condition into a call to withFilter. Prior to Scala 2.8, for-comprehensions were translated into something like the following:
val r2 = list filter {
  case i => go
} map {
  case i => {
    go = false
    i
  }
}
Using filter, the value of the result (r2 here) would be quite different: List(1, 2, 3). Setting the go flag to false has no effect on the filter, because the filtering has already finished by the time map runs. Again, since Scala 2.8, this issue is solved by using withFilter. When withFilter is used, the condition is evaluated each time an element is accessed inside the map method.
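The difference in evaluation order can be made visible by logging each guard test and each map call. The following is a sketch; run, guard, body and trace are hypothetical names introduced only for this demonstration:

```scala
import scala.collection.mutable.ListBuffer

val list = List(1, 2, 3)

// Runs the same guard and body through either withFilter or filter,
// returning the result together with the order of evaluation.
def run(useWithFilter: Boolean): (List[Int], List[String]) = {
  val trace = ListBuffer.empty[String]
  var go = true
  val guard: Int => Boolean = i => { trace += s"test $i"; go }
  val body: Int => Int = i => { trace += s"map $i"; go = false; i }
  val out =
    if (useWithFilter) list.withFilter(guard).map(body)
    else list.filter(guard).map(body)
  (out, trace.toList)
}

val (lazyResult, lazyTrace) = run(useWithFilter = true)
val (strictResult, strictTrace) = run(useWithFilter = false)
```

With withFilter, the guard and the body are interleaved per element, so the change to go is seen by the later tests. With filter, all three tests finish before the first map call.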
References:
- Scala in Action (covers Scala 2.10), Nilanjan Raychaudhuri, Manning Publications, p. 120
- Odersky's thoughts about for-comprehension translation
The main reason forall/exists aren't implemented is that evaluating them would require obtaining all the elements, losing the laziness.
So for example:
import scala.collection.AbstractIterator

class RandomIntIterator extends AbstractIterator[Int] {
  val rand = new java.util.Random
  def next(): Int = rand.nextInt()
  def hasNext: Boolean = true
}

// rand_integers is an infinite iterator of random integers
val rand_integers = new RandomIntIterator
val rand_naturals =
  rand_integers.withFilter(_ > 0)
val rand_even_naturals =
  rand_naturals.withFilter(_ % 2 == 0)
println(rand_even_naturals.map(identity).take(10).toList)
// calling a second time we get
// another ten-tuple of random even naturals
println(rand_even_naturals.map(identity).take(10).toList)
Note that rand_even_naturals.map(identity).take(10) is still an iterator. Only when we call toList are the random numbers generated and filtered in a chain.
Note that map(identity) is equivalent to map(i => i); it is used here to convert a withFilter object back to the original type (e.g. a collection, a stream, or an iterator).
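The same map(identity) trick applies to strict collections. A small sketch with a List, where the names numbers, evens and backToList are chosen only for illustration:

```scala
val numbers = List(1, 2, 3, 4, 5, 6)

// withFilter does not return a List, only a wrapper that remembers the predicate.
val evens = numbers.withFilter(_ % 2 == 0)

// map(identity) applies the pending filter and rebuilds the original collection type.
val backToList: List[Int] = evens.map(identity)
```

If the filtered collection is all that is needed, calling filter directly is simpler; map(identity) is mainly useful when a withFilter value has already been produced.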