Apologies for the terrible title; I had trouble condensing this particular question into a concise one.
I have a data frame like so (note: it is over 50,000 rows long)
If you can use dplyr and don't need the numbering to be consecutive (only unique):
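    library(dplyr)

    # Keep known species; otherwise label by family plus the rank of the note within that family
    df %>%
      group_by(lepfam) %>%
      mutate(lep_species = ifelse(!is.na(lep_species), lep_species,
                                  paste0(lepfam, "_morphosp", rank(lep_notes, ties.method = "min"))))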
      lepfam      lep_notes            lep_species
      <chr>       <chr>                <chr>
    1 Geometridae <NA>                 Eois sp
    2 Erebidae    black/orange         Erebidae_morphosp2
    3 Erebidae    black spikes         Erebidae_morphosp1
    4 Erebidae    redthorax/red legs   Erebidae_morphosp3
    5 Noctuidae   fuzzy/green          Noctuidae_morphosp2
    6 Noctuidae   black hair/greenbody Noctuidae_morphosp1
    7 Noctuidae   fuzzy/green          Noctuidae_morphosp2
    8 Saturnidae  <NA>                 Polyphemous sp
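To see why this numbering is unique but not guaranteed to be consecutive, here is a small base R sketch on a made-up vector (the values are purely illustrative and not taken from the question's data). With ties.method = "min", tied values share the lowest rank and the ranks they would otherwise have occupied are skipped:

```r
# Hypothetical notes vector, for illustration only
notes <- c("b", "a", "b", "c")

# Both "b" entries share rank 2, so rank 3 is never assigned
min_rank_ids <- rank(notes, ties.method = "min")
min_rank_ids  # 2 1 2 4
```

In the output above the gaps happen not to show up, because within each family the tied notes sort last.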
Or with consecutive numbers:
    # Number each distinct note by its order of first appearance within the family
    df %>%
      group_by(lepfam) %>%
      mutate(lep_species = ifelse(!is.na(lep_species), lep_species,
                                  paste0(lepfam, "_morphosp", match(lep_notes, unique(lep_notes)))))
      lepfam      lep_notes            lep_species
      <chr>       <chr>                <chr>
    1 Geometridae <NA>                 Eois sp
    2 Erebidae    black/orange         Erebidae_morphosp1
    3 Erebidae    black spikes         Erebidae_morphosp2
    4 Erebidae    redthorax/red legs   Erebidae_morphosp3
    5 Noctuidae   fuzzy/green          Noctuidae_morphosp1
    6 Noctuidae   black hair/greenbody Noctuidae_morphosp2
    7 Noctuidae   fuzzy/green          Noctuidae_morphosp1
    8 Saturnidae  <NA>                 Polyphemous sp
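For anyone who wants to run the consecutive-numbering version end to end, here is a self-contained sketch. The input df is an assumption: it is reconstructed from the printed output, with lep_species taken to be NA for every row that received a morphospecies label. The trailing ungroup() is an optional addition so that later steps are not silently grouped.

```r
library(dplyr)

# Assumed input, reconstructed from the printed output (the real data is not shown here)
df <- tibble(
  lepfam = c("Geometridae", "Erebidae", "Erebidae", "Erebidae",
             "Noctuidae", "Noctuidae", "Noctuidae", "Saturnidae"),
  lep_notes = c(NA, "black/orange", "black spikes", "redthorax/red legs",
                "fuzzy/green", "black hair/greenbody", "fuzzy/green", NA),
  lep_species = c("Eois sp", NA, NA, NA, NA, NA, NA, "Polyphemous sp")
)

result <- df %>%
  group_by(lepfam) %>%
  mutate(lep_species = ifelse(
    !is.na(lep_species),
    lep_species,
    # match() against unique() numbers notes by first appearance within the family
    paste0(lepfam, "_morphosp", match(lep_notes, unique(lep_notes)))
  )) %>%
  ungroup()

print(result)
```

Because match(x, unique(x)) numbers values in order of first appearance rather than sorted order, the labels depend on the row order of the data.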