I have a dataset that, when parsed into a dataframe, looks similar to:
document_id tax_0 tax_1 tax_2 tax_3 tax_4
where document_id (int) is distinct. do