Is there a library or framework that can split a dataset across multiple instances of the same job running within the same cluster? It would be much better if the same library/frame