The following code is from the Apache Spark quick start guide. Can somebody explain what the "line" variable is and where it comes from?
textFile.map(line =
The map and reduce methods belong to the RDD class, whose interface is similar to that of Scala collections. What you pass to map and reduce are anonymous functions (with one parameter for map and two parameters for reduce). map applies the function you provide to every element of textFile (each line of text in this context), so "line" is simply the name given to that function's parameter.
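As a rough sketch of the same idea, the snippet below uses an ordinary Scala List in place of the RDD, so it runs without Spark. The sample lines and the names lines, wordCounts, countWords, sameCounts and maxWords are made up for illustration, and the word-counting function is only an example, not necessarily what the quick start guide passes to map.

```scala
// A plain Scala List stands in for the RDD here (an assumption for illustration).
// The sample lines are hypothetical.
val lines = List("a b c", "d e", "f g h i")

// "line" is only the parameter name of the anonymous function.
// map calls the function once per element and binds that element to "line".
val wordCounts = lines.map(line => line.split(" ").length)

// The same function written out with a name and explicit types.
def countWords(line: String): Int = line.split(" ").length
val sameCounts = lines.map(countWords)

// reduce takes a two-parameter function and combines the elements pairwise;
// this one keeps the larger of the two values.
val maxWords = wordCounts.reduce((a, b) => if (a > b) a else b)
```

Run as a Scala script or pasted into the REPL, wordCounts should come out as List(3, 2, 4) and maxWords as 4. Renaming "line" to any other identifier changes nothing, since it only exists inside the anonymous function.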
It may help to read an introduction to Scala collections first.
You can read more about the RDD class API here: https://spark.apache.org/docs/1.2.1/api/scala/#org.apache.spark.rdd.RDD