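According to Learning Spark:
Keep in mind that repartitioning your data is a fairly expensive operation. Spark also has an optimized version of repartition() called coalesce() that allows avoiding data movement, but only if you are decreasing the number of RDD partitions.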
Put simply, COALESCE can only decrease the number of partitions. By default it does not do a full shuffle of the data; it just merges existing partitions together.
REPARTITION can both increase and decrease the number of partitions, but it always shuffles the data.
val rdd = sc.textFile("path", 7)  // second argument is minPartitions
Both operations work fine on this RDD.
In practice, we generally reach for either of these when we need the output collected in one place, for example in a single partition.
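To make the difference concrete, here is a minimal sketch that can be pasted into spark-shell or run as a Scala script. It assumes a local Spark installation; the in-memory data from sc.parallelize, the app name, and the variable names are illustrative stand-ins and not part of the original answer.

```scala
import org.apache.spark.sql.SparkSession

// Local session for illustration only; in spark-shell, `spark` and `sc` already exist.
val spark = SparkSession.builder()
  .master("local[4]")
  .appName("coalesce-vs-repartition")
  .getOrCreate()
val sc = spark.sparkContext

// Hypothetical in-memory data split into 7 partitions, standing in for the text file.
val rdd = sc.parallelize(1 to 100, 7)

// coalesce merges existing partitions without a full shuffle.
val coalescedDown = rdd.coalesce(2).getNumPartitions        // 2

// repartition always shuffles and can go in either direction.
val repartitionedUp = rdd.repartition(10).getNumPartitions  // 10
val repartitionedDown = rdd.repartition(2).getNumPartitions // 2

// Without shuffle = true, coalesce cannot increase the partition count.
val coalescedUp = rdd.coalesce(10).getNumPartitions         // 7

println(s"coalesce(2): $coalescedDown, repartition(10): $repartitionedUp, " +
  s"repartition(2): $repartitionedDown, coalesce(10): $coalescedUp")
```

Note that repartition(n) is equivalent to coalesce(n, shuffle = true), so the choice mostly comes down to whether you can afford the shuffle and whether you need evenly sized partitions afterwards.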