There is an example in the Spark examples directory for generating skewed data (https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/Si