I am trying to parallelize a Monte Carlo simulation on Spark. The input to the simulation is a partition of a DataFrame, and the simulation is currently run in a user define