Question
I have a very simple snippet of Spark code which was working on Scala 2.11 and stopped compiling after the move to 2.12.
import spark.implicits._

val ds = Seq("val").toDF("col1")
ds.foreachPartition(part => {
  part.foreach(println)
})
It fails with the error:
Error:(22, 12) value foreach is not a member of Object
part.foreach(println)
The workaround is to help the compiler with an explicit type annotation:
import spark.implicits._

val ds = Seq("val").toDF("col1")
println(ds.getClass)
ds.foreachPartition((part: Iterator[Row]) => {
  part.foreach(println)
})
Does anyone have a good explanation of why the compiler cannot infer part as an Iterator[Row]? ds is a DataFrame, which is defined as type DataFrame = Dataset[Row].
foreachPartition has two signatures:
def foreachPartition(f: Iterator[T] => Unit): Unit
def foreachPartition(func: ForeachPartitionFunction[T]): Unit
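For context (my reading of the situation, not stated in the original post): since Scala 2.12, a bare lambda can be SAM-converted to any trait or interface with a single abstract method, so an unannotated part => ... now matches both overloads above, and overload resolution falls back to typing the parameter as Object. The sketch below models the two overloads with hypothetical stand-in classes (MiniDataset and MyForeachPartitionFunction are illustrative names, not Spark's real types) and shows how the annotation disambiguates:

```scala
import scala.jdk.CollectionConverters._

// Hypothetical stand-in for Spark's Java-friendly ForeachPartitionFunction[T]:
// a single-abstract-method trait, so lambdas are SAM-convertible to it in 2.12+.
trait MyForeachPartitionFunction[T] { def call(it: java.util.Iterator[T]): Unit }

// Hypothetical stand-in for Dataset[T], exposing both overload shapes.
class MiniDataset[T](data: Seq[T]) {
  // Scala overload: f: Iterator[T] => Unit
  def foreachPartition(f: Iterator[T] => Unit): Unit = f(data.iterator)
  // Java overload: func: MyForeachPartitionFunction[T]
  def foreachPartition(func: MyForeachPartitionFunction[T]): Unit =
    func.call(data.iterator.asJava)
}

val ds = new MiniDataset(Seq("a", "b"))

// Under Scala 2.12, a bare `part => ...` lambda is applicable to BOTH
// overloads (SAM conversion makes the second one match too), so the
// compiler cannot pick one and types `part` as Object:
// ds.foreachPartition(part => part.foreach(println))   // would not compile

// Annotating the parameter type makes only the Iterator overload applicable:
ds.foreachPartition((part: Iterator[String]) => part.foreach(println))
```

On Scala 2.11 the lambda was not SAM-convertible, so only the Iterator overload applied and inference worked without the annotation.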
Thank you for your help.
Answer 1:
For anyone facing this issue, here is a workaround: convert the DataFrame to an RDD and call foreachPartition there. RDD.foreachPartition has only one signature, def foreachPartition(f: Iterator[T] => Unit): Unit, so there is no overload ambiguity and the code compiles.
ds.rdd.foreachPartition(part => {
  part.foreach(println)
})
Source: https://stackoverflow.com/questions/66066034/scala-cannot-infer