There are some good articles on how to get the feature importance vector for a random forest with MLlib: https://www.timlrx.com/2018/06/19/feature-selection-using-feature-im