I am using Spark with Java and applying a clustering algorithm to cluster documents.
Suppose I have the documents below:
this is a simple sentence this is a importa