Aggregating sparse vectors with pyspark.ml.stat.Summarizer returns dense vectors: is there a way to force sparse vector operations?
pyspark.ml.stat.Summarizer