I have a data frame of genus names (~1.4 million entries), with multiple entries per genus. Each occurrence has an assigned environment - terrestrial or ma