Using spread with duplicate identifiers for rows giving error

我寻月下人不归
我寻月下人不归 2021-01-25 18:46

My data looks like this:

df <- read.table(header = TRUE, text =
        "GeneID    Gene_Name   Species    Paralogues    Domains   Functional_Diversity
1 Answer
  • 2021-01-25 19:22

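    To see only the rows that trigger the "Duplicate identifiers" error, count the rows for each Gene_Name/Species combination and keep the combinations that occur more than once: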

    df %>% 
      group_by(Gene_Name, Species) %>% 
      mutate(n = n()) %>% 
      filter(n > 1)
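
As a rough illustration of the check above, here is a minimal sketch on made-up data. The `toy` data frame and its values are hypothetical and are not the asker's data; only the column names are taken from the question.

```r
library(dplyr)

# Hypothetical toy data: the GeneB/human combination appears twice
toy <- data.frame(
  Gene_Name = c("GeneA", "GeneA", "GeneB", "GeneB"),
  Species = c("human", "mouse", "human", "human"),
  Functional_Diversity = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)

# Count rows per Gene_Name/Species pair and keep pairs seen more than once
dupes <- toy %>%
  group_by(Gene_Name, Species) %>%
  mutate(n = n()) %>%
  filter(n > 1) %>%
  ungroup()

print(dupes)
```

Only the two GeneB/human rows should remain, since that is the only combination spread cannot place into a single cell.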
    

    To make the spread work even when some rows share the same identifiers, add a row-number column, which guarantees that every row is unique. Note that each input row then becomes its own output row, with NA in the columns of the other species:

    df %>% 
      select(Gene_Name, Species, Functional_Diversity) %>% 
      mutate(row = row_number()) %>% 
      spread(Species, Functional_Diversity)
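
The sketch below runs the row-number approach on hypothetical toy data (the `toy` data frame is invented for illustration) and compares it with one possible variation: numbering rows within each Gene_Name/Species group instead of globally. The per-group counter is an assumption about what may be preferable, not part of the original answer; it should add extra rows only for the combinations that are actually duplicated.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: the GeneB/human combination appears twice
toy <- data.frame(
  Gene_Name = c("GeneA", "GeneA", "GeneB", "GeneB"),
  Species = c("human", "mouse", "human", "human"),
  Functional_Diversity = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)

# Global row number (as in the answer): one output row per input row
wide_global <- toy %>%
  mutate(row = row_number()) %>%
  spread(Species, Functional_Diversity)

# Per-group counter (a variation): non-duplicated pairs share a row
wide_grouped <- toy %>%
  group_by(Gene_Name, Species) %>%
  mutate(row = row_number()) %>%
  ungroup() %>%
  spread(Species, Functional_Diversity)

print(wide_global)
print(wide_grouped)
```

In newer versions of tidyr, `spread()` is superseded by `pivot_wider()`, so the same idea may be worth porting if the code is meant to be maintained.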
    