val df = sc.parallelize(Seq((1, "Emailab"), (2, "Phoneab"), (3, "Faxab"), (4, "Mail"), (5, "Other"), (6, "MSL12"), (7, "MSL"), (8, "HCP"), (9, "HCP12"))).toDF("c1", "c2")
This works too. filter also accepts a SQL expression string, so the condition stays concise and reads almost exactly like a SQL WHERE clause.
df.filter("c2 not like 'MSL%' and c2 not like 'HCP%'").show

+---+-------+
| c1|     c2|
+---+-------+
|  1|Emailab|
|  2|Phoneab|
|  3|  Faxab|
|  4|   Mail|
|  5|  Other|
+---+-------+
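If you would rather not embed SQL in a string, the same condition can probably be written with Column expressions. The snippet below is only a sketch: the local SparkSession setup and the name kept are my own assumptions for illustration, not part of the original answer. In spark-shell the session already exists and you can skip that part.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Assumed local session for illustration; spark-shell already provides `spark`.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("not-like-sketch")
  .getOrCreate()
import spark.implicits._

// Same sample data as above, built from a local Seq.
val df = Seq(
  (1, "Emailab"), (2, "Phoneab"), (3, "Faxab"), (4, "Mail"), (5, "Other"),
  (6, "MSL12"), (7, "MSL"), (8, "HCP"), (9, "HCP12")
).toDF("c1", "c2")

// `kept` is a hypothetical name. Column.like takes a SQL LIKE pattern,
// `!` negates the condition, and `&&` combines the two conditions.
val kept = df.filter(!col("c2").like("MSL%") && !col("c2").like("HCP%"))
kept.show()
```

This should keep the same five rows as the string version. The string form is shorter; the Column form lets the compiler catch a misspelled method name, though a misspelled column name still only fails at runtime.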