I read up on the documentation of HashPartitioner. Unfortunately nothing much was explained except for the API calls. I am under the assumption that HashPartitioner
The HashPartitioner.getPartition method takes a key as its argument and returns the index of the partition which the key belongs to. The partitioner has to know what the valid indices are, so it returns numbers in the right range. The number of partitions is specified through the numPartitions constructor argument.
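The implementation returns roughly key.hashCode() % numPartitions, adjusted so that the result is never negative (on the JVM a plain % yields a negative value for a negative hash code), and a null key is assigned to partition 0. See Partitioner.scala for more details.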
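
To make the arithmetic concrete, here is a minimal plain-Java sketch of the same idea. It does not depend on Spark and is not Spark's source: the class name HashPartitionSketch and the helper nonNegativeMod are names chosen for this illustration, and sending null keys to partition 0 is written in as an explicit assumption.

```java
// Illustrative sketch of hash partitioning in plain Java (no Spark dependency).
// HashPartitionSketch and its methods are hypothetical names for this example.
final class HashPartitionSketch {

    // Remainder that always lands in [0, mod), even when x is negative.
    // Java's % keeps the sign of the dividend, so -1 % 4 is -1, not 3.
    static int nonNegativeMod(int x, int mod) {
        int raw = x % mod;
        return raw < 0 ? raw + mod : raw;
    }

    // Maps a key to a partition index in [0, numPartitions).
    // Assumption: a null key is placed in partition 0.
    static int getPartition(Object key, int numPartitions) {
        if (key == null) {
            return 0;
        }
        return nonNegativeMod(key.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        int numPartitions = 4;
        Object[] keys = {7, 8, -1, "a", null};
        for (Object key : keys) {
            System.out.println(key + " -> partition " + getPartition(key, numPartitions));
        }
    }
}
```

With 4 partitions, the keys 7 and -1 both end up in partition 3, which shows why the sign correction matters: without it, -1 would map to the invalid index -1. Equal keys always have equal hash codes, so they always land in the same partition.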