I am using Spark's Python API and running Spark 0.8.
I am storing a large RDD of floating point vectors, and I need to perform calculations of one vector against the entire RDD.
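In rough terms, the goal looks like the following PySpark sketch: broadcast the query vector, then map a dot product over the stored RDD. All names and data here are illustrative, not from my actual job:

    from pyspark import SparkContext

    sc = SparkContext("local", "vector-calc")
    # Illustrative data: the stored RDD of float vectors and one query vector
    vectors = sc.parallelize([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
    query = sc.broadcast([0.5, 1.5, 2.5])
    # Dot product of the broadcast query vector against every vector in the RDD
    dots = vectors.map(lambda v: sum(x * y for x, y in zip(v, query.value)))
    print(dots.collect())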
You can partition the RDD with a custom Partitioner, as follows (the example is in Scala):
    import org.apache.spark.Partitioner

    val p = new Partitioner() {
      def numPartitions = 2
      // getPartition must return a value in [0, numPartitions),
      // so take the key modulo the partition count
      def getPartition(key: Any) = key.asInstanceOf[Int] % numPartitions
    }
    recordRDD.partitionBy(p)
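Since the question is about the Python API: PySpark's RDD.partitionBy takes a partition count and an optional partition function, so a rough equivalent of the Scala example is the sketch below. The (key, vector) pairs and the key scheme are assumptions made to mirror the Scala code:

    from pyspark import SparkContext

    sc = SparkContext("local", "partition-example")
    # Assume recordRDD is an RDD of (int key, vector) pairs
    recordRDD = sc.parallelize([(0, [1.0, 2.0]), (1, [3.0, 4.0])])
    # Send each record to partition key % 2, mirroring the custom Partitioner above
    partitioned = recordRDD.partitionBy(2, lambda key: key % 2)

Note that partitionBy only applies to RDDs of key-value pairs, in both APIs.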