My data looks like this:
df <- read.table(header = TRUE, text =
"GeneID Gene_Name Species Paralogues Domains Functional_Diversity
To see only the rows that trigger the "Duplicate identifiers" error, i.e. rows that share the same Gene_Name and Species, you could use:
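library(dplyr)

df %>%
  group_by(Gene_Name, Species) %>%  # one group per identifier pair
  mutate(n = n()) %>%               # n = number of rows in each group
  filter(n > 1)                     # keep only the duplicated pairs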
To ensure that spread() works even when some rows have duplicate identifiers, you can add a row-number column, which guarantees that each row is unique:
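library(tidyr)  # spread(); dplyr is still needed for select() and mutate()

df %>%
  select(Gene_Name, Species, Functional_Diversity) %>%
  mutate(row = row_number()) %>%          # unique id for every row
  spread(Species, Functional_Diversity)   # one column per species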
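For a quick check of both steps without the full data set, here is a minimal sketch on a made-up data frame. The name toy and all of its values are hypothetical and are not taken from the question's data; only the column names match.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: g1/human appears twice, which is the duplicate identifier
toy <- data.frame(
  Gene_Name = c("g1", "g1", "g2"),
  Species = c("human", "human", "mouse"),
  Functional_Diversity = c(0.1, 0.2, 0.3),
  stringsAsFactors = FALSE
)

# Step 1: list the rows that share a Gene_Name/Species pair
dupes <- toy %>%
  group_by(Gene_Name, Species) %>%
  mutate(n = n()) %>%
  filter(n > 1)

# Step 2: add a row number so every row is uniquely identified, then spread
wide <- toy %>%
  mutate(row = row_number()) %>%
  spread(Species, Functional_Diversity)

print(dupes)
print(wide)
```

Note that the result keeps one output row per input row, with NA in the columns of the other species, so the duplicates are preserved rather than merged.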