I am trying to find the rows in a Spark internal table where the name column contains duplicate data. For example, consider the following data set, w