I have a dataframe that looks like this (in reality it has more than 10,000 rows):
data = {'age': [54, 21, 7, 18], 'sex': [0, 1, 1, 0], 'disea