I'm planning capacity for Apache Spark on a single server. What is the relationship between disk capacity and the size of the data being processed?