I have a sequence file whose values look like
(string_value, json_value)
I don't care about the string value.
In Scala I can read
For Spark 2.4.x, you have to get the SparkContext object from the SparkSession (the spark object), which has the sequenceFile API for reading sequence files.
spark.sparkContext \
    .sequenceFile('/user/sequencesample') \
    .toDF().show()
The above works like a charm.
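Since you don't care about the string key, you can drop it before doing anything else. The per-record logic is just ordinary tuple handling; here is a minimal plain-Python sketch of it (the sample records and field names are invented for illustration):

```python
import json

# Each record read from a sequence file arrives as a (key, value) pair.
# Drop the string key and parse the JSON value.
records = [('ignored_key', '{"id": 1, "name": "alice"}'),
           ('ignored_key', '{"id": 2, "name": "bob"}')]

parsed = [json.loads(value) for _key, value in records]
print(parsed[0]['name'])  # prints "alice"
```

In Spark itself the same idea is `spark.sparkContext.sequenceFile(path).map(lambda kv: kv[1])`, after which the resulting RDD of JSON strings can be handed to `spark.read.json` to infer a schema.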
For writing (Parquet to SequenceFile):
from pyspark.sql import functions as F

spark.read \
    .format('parquet') \
    .load('/user/parquet_sample') \
    .select('id', F.concat_ws('|', 'id', 'name')) \
    .rdd.map(lambda rec: (rec[0], rec[1])) \
    .saveAsSequenceFile('/user/output')
First convert the DataFrame to an RDD and map each Row to a (key, value) tuple before saving as a SequenceFile.
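The map step only reshapes each Row into a 2-tuple, and concat_ws just joins column values with the given separator. A plain-Python stand-in for those two steps (the sample rows are invented):

```python
# Stand-in for F.concat_ws('|', 'id', 'name') followed by
# rdd.map(lambda rec: (rec[0], rec[1])); sample rows are invented.
rows = [{'id': 1, 'name': 'alice'}, {'id': 2, 'name': 'bob'}]

# Build the (key, value) tuples that saveAsSequenceFile expects.
pairs = [(row['id'], '|'.join(str(row[c]) for c in ('id', 'name')))
         for row in rows]
print(pairs[0])  # prints (1, '1|alice')
```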
I hope this helps.