I have two static lists, group_1 and group_2:

group_1 = [a,b,c,d,e,f,g]
group_2 = [h,i,j,k]

I have a PySpark dataframe df1, as shown below.

Example 1:

df1 =
+-----+---------