We get different result set sizes depending on whether directJoin is used on our Spark Cassandra cluster:
// newpos is a DataFrame loaded from Cassandra
val with_