I am trying to get the top rating using a groupby over multiple columns, and if a particular combination does not exist in the grouped result, it throws an error. How to do multiple c
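You can use *args for dynamic input (the order of the values cannot be changed) together with query for filtering. If no row matches, the last value is dropped and the lookup is retried with the remaining columns: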
def get_top(*args):
    c = ['maritalstatus', 'gender', 'age_range', 'occ']
    # most frequent rating for each combination of the four columns
    m = (df.groupby(c)['rating'].apply(lambda x: x.value_counts().index[0])
           .reset_index())
    args = list(args)
    while True:
        # pair the columns with the values still left, in order
        d = dict(zip(c, args))
        #https://stackoverflow.com/a/48371587/2901002
        q = ' & '.join('({} == "{}")'.format(i, j) for i, j in d.items())
        m1 = m.query(q)['rating']
        if m1.empty and len(args) > 1:
            # no match: drop the last value and try again with fewer columns
            args.pop()
        else:
            return m1

print(get_top('ma', 'M', 'young','teacher'))
1 PG
Name: rating, dtype: object
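
To see the fallback in action without the original data, here is a minimal self-contained sketch. It is a variant, not the function above: it filters with boolean masks instead of a query string (so values containing quotes should not be a problem), recomputes the most frequent rating over all rows that match the remaining values, and returns a single value or None instead of a Series. The sample rows, the name top_rating and the frame parameter are made up for illustration; only the column names come from the answer.

```python
import pandas as pd

# Made-up sample data; only the column names come from the answer above.
df = pd.DataFrame({
    'maritalstatus': ['ma', 'ma', 'ma', 'si'],
    'gender':        ['M', 'M', 'M', 'F'],
    'age_range':     ['young', 'young', 'old', 'young'],
    'occ':           ['teacher', 'teacher', 'doctor', 'nurse'],
    'rating':        ['PG', 'PG', 'R', 'G'],
})

COLS = ['maritalstatus', 'gender', 'age_range', 'occ']


def top_rating(frame, *values):
    """Most frequent rating for the longest matching prefix of values (hypothetical helper)."""
    values = list(values)
    while values:
        # Build a row mask from as many columns as there are values left.
        mask = pd.Series(True, index=frame.index)
        for col, val in zip(COLS, values):
            mask &= (frame[col] == val)
        matched = frame.loc[mask, 'rating']
        if not matched.empty:
            # value_counts sorts by frequency, so the first index is the top rating
            return matched.value_counts().index[0]
        values.pop()  # no match: drop the last value and retry
    return None  # not even the first column matched


print(top_rating(df, 'ma', 'M', 'young', 'teacher'))  # exact match: PG
print(top_rating(df, 'ma', 'M', 'old', 'pilot'))      # falls back to three columns: R
```

Because the rating is recomputed at each fallback level, this version always returns one value, whereas get_top can return several rows once it has fallen back to fewer columns.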