Question
I am trying to run consensus clustering with the M3C package in R. My dataset contains 451 samples and ~2500 genes. The row names are the ENTREZ IDs (numeric values) of the genes. I checked the dataset with "any(duplicated(colnames(MyData)))" to make sure that there are no duplicate entries in the row names. I then ran the following command to perform the consensus clustering with M3C:
res <- M3C(MyData, cores=8, seed = 123, des = annotation, removeplots = TRUE, analysistype = 'chi', doanalysis = TRUE, variable = 'class')
I am getting the following error:
Warning message:
"non-unique values when setting 'row.names': "
Error in `.rowNamesDF<-`(x, value = value): duplicate 'row.names' are not allowed
Traceback:
1. M3C(MyData, cores = 8, seed = 123, des = meta, removeplots = TRUE,
. analysistype = "chi", doanalysis = TRUE, variable = "class")
2. M3Creal(as.matrix(mydata), maxK = maxK, reps = repsreal, pItem = 0.8,
. pFeature = 1, clusterAlg = clusteralg, distance = distance,
. title = "/home/christopher/Desktop/", printres = printres,
. showheatmaps = showheatmaps, printheatmaps = printheatmaps,
. des = des, x1 = pacx1, x2 = pacx2, seed = seed, removeplots = removeplots,
. silent = silent, doanalysis = doanalysis, analysistype = analysistype,
. variable = variable, fsize = fsize, method = method)
3. `row.names<-`(`*tmp*`, value = newerdes$ID)
4. `row.names<-.data.frame`(`*tmp*`, value = newerdes$ID)
5. `.rowNamesDF<-`(x, value = value)
6. stop("duplicate 'row.names' are not allowed")
Can anyone please help me to resolve the issue?
Thanks
Answer 1:
I ran the equivalent of the following using M3C:
df_wide_matrix # my expression matrix
any(duplicated(colnames(df_wide_matrix))) # result = FALSE
M3C::M3C(df_wide_matrix, iters=2, repsref=2, repsreal=2, clusteralg="hc", objective="PAC")
I ran into the exact same error message with M3C, in addition to:
In addition: Warning message:
non-unique values when setting 'row.names': ‘ABCDEF’, ‘ABCDGH’
I assumed the issue was caused by the first four characters of these two names being identical, so I temporarily renamed them before running M3C:
# Temporarily rename the two columns flagged in the warning
dup_ids <- which(colnames(df_wide_matrix) %in% c("ABCDEF", "ABCDGH"))
colnames(df_wide_matrix)[dup_ids] <- c("A", "B")
M3C::M3C(df_wide_matrix, iters=2, repsref=2, repsreal=2, clusteralg="hc", objective="PAC")
M3C then runs correctly. It is not an ideal solution, but it worked for me. I have reported it as an issue: https://github.com/crj32/M3C/issues/6.
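If more than a couple of names are affected, renaming them by hand gets tedious. Below is a minimal sketch of a more general version of the same workaround, assuming the collision really does come from names sharing their first four characters (that is only the guess made above, not documented M3C behaviour). The toy matrix `expr`, the helper `prefix_collisions`, and the `S001`-style placeholder names are all hypothetical and only there for illustration.

```r
# Toy expression matrix: rows are features, columns are samples.
# The column names are made up; two of them mirror the names from the warning.
set.seed(1)
expr <- matrix(rnorm(12), nrow = 3,
               dimnames = list(c("g1", "g2", "g3"),
                               c("ABCDEF", "ABCDGH", "WXYZ01", "QRST02")))

# Hypothetical helper: return the names that collide once they are cut
# down to their first n characters (n = 4 is the guess from this answer).
prefix_collisions <- function(ids, n = 4) {
  prefixes <- substr(ids, 1, n)
  ids[prefixes %in% prefixes[duplicated(prefixes)]]
}

colliding <- prefix_collisions(colnames(expr))

# Reversible rename: swap in short placeholder names and keep a lookup
# table so the original names can be restored afterwards.
original_ids <- colnames(expr)
safe_ids <- sprintf("S%03d", seq_along(original_ids))
lookup <- setNames(original_ids, safe_ids)
colnames(expr) <- safe_ids

# ... run M3C::M3C(expr, ...) here with the placeholder names ...

# Restore the original column names from the lookup table.
colnames(expr) <- unname(lookup[colnames(expr)])
```

The same `lookup` vector can be used to translate any placeholder names that appear in the M3C output back to the original names. If `colliding` comes back empty for your data, the four-character guess probably does not explain the error in your case.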
Source: https://stackoverflow.com/questions/60896731/r-m3c-library-duplicate-row-names-error-message