I am using Spark to read a bunch of files, processing them, and then saving all of them as a sequence file. What I wanted was to have one sequence file per partition, so I
Since org.apache.hadoop.conf.Configuration implements Writable, you can serialize and deserialize it by wrapping it in org.apache.spark.SerializableWritable.
For example:
import org.apache.spark.SerializableWritable
// ... (SparkSession setup elided)
val hadoopConf = spark.sparkContext.hadoopConfiguration

// Wrap the Configuration so Spark can ship it inside the task closure
val serializedConf = new SerializableWritable(hadoopConf)

// Call .value inside the closure so the Configuration is
// deserialized on the executor rather than on the driver
rdd.map { record => someFunction(record, serializedConf.value) }
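
To connect this back to writing one sequence file per partition: below is a minimal sketch, assuming rdd is an RDD[(String, String)] and using a hypothetical /output directory. The wrapped Configuration is deserialized on each executor and used to open one SequenceFile.Writer per partition.

import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.{SequenceFile, Text}
import org.apache.spark.TaskContext

rdd.foreachPartition { records =>
  // Rebuild the Configuration on the executor
  val conf = serializedConf.value
  // Hypothetical output path; one file per partition id
  val path = new Path(s"/output/part-${TaskContext.getPartitionId()}.seq")
  val writer = SequenceFile.createWriter(
    conf,
    SequenceFile.Writer.file(path),
    SequenceFile.Writer.keyClass(classOf[Text]),
    SequenceFile.Writer.valueClass(classOf[Text]))
  try {
    records.foreach { case (k, v) => writer.append(new Text(k), new Text(v)) }
  } finally {
    writer.close()
  }
}

Using foreachPartition keeps a single writer open for the whole partition, which is what gives you exactly one output file per partition.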