I have a dataset with 29M rows, and I am using Azure Databricks and SparkR to process the data and build a predictive model.
The issue that I have with the collect(df) c