For a pandas DataFrame in Python, the info() function reports memory usage. Is there any equivalent in PySpark? Thanks
I have something in mind, but it's just a rough estimation. As far as I know, Spark doesn't have a straightforward way to get a DataFrame's memory usage, but a pandas DataFrame does. So what you can do is sample the DataFrame, convert the sample to pandas, and inspect that:
sample = df.sample(fraction=0.01)
pdf = sample.toPandas()
pdf.info()
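
If you want a single number rather than the info() printout, here is a minimal sketch along the same lines (assuming an existing PySpark DataFrame named df and that the sampled rows are representative of the whole): sum the sampled pandas memory and scale it back up by the sampling fraction.

fraction = 0.01  # sampling fraction; adjust to taste
sample_pdf = df.sample(fraction=fraction).toPandas()

# Total bytes used by the sampled pandas DataFrame (deep=True includes string/object data).
sample_bytes = sample_pdf.memory_usage(deep=True).sum()

# Rough extrapolation to the full DataFrame, reported in megabytes.
estimated_total_mb = sample_bytes / fraction / (1024 ** 2)
print(f"Estimated in-memory size: {estimated_total_mb:.1f} MB")

Keep in mind this measures the pandas in-memory representation, which can differ quite a bit from Spark's own internal storage size, so treat it as a ballpark figure only.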