Background:
We are moving from Hadoop to Spark, and one of our MapReduce jobs essentially merged multiple sorted files (all sorted on the same primary key) into a single file,