Question
I have a data frame with more than 5000 observations. Analysing it with hierarchical clustering, I end up with 8 clusters of very different sizes: some contain hundreds of observations, others only a handful.
# Cut tree into 8 groups
cutree_hclust <- cutree(hclust.unsupervised, k = 8)
# Number of members in each cluster
table(cutree_hclust)
cutree_hclust
1 2 3 4 5 6 7 8
486 61 14 3 15 2 9 5
To see which combination of variables each observation has within the different clusters, I thought it might be an idea to turn the 8 clusters into separate data frames, so I can analyse them individually. This is because I have no idea which rows ended up in which clusters, and therefore don't know what the pattern in the overall data frame (Complete_df) is.
However, how can I make these new dataframes?
I can see which rows are in a given cluster with, for example:
rownames(MY_df)[cutree_hclust == 7]
[1] "65" "21" "21" "70" "101" "104" "112" "673"
[9] "651"
But if I type
h_clust <- as.data.frame(rownames(MY_df)[cutree_hclust == 7])
I only get a list of the row names that are in this cluster; none of the other columns are included.
How can I turn each cluster into a data frame without having to type the row/column sequence in square brackets 5000 times?
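For reference, here is a minimal sketch of one way this is commonly done in base R. It assumes that the rows of MY_df are in the same order as the observations passed to hclust(), so that cutree_hclust lines up with the rows. The toy data below is a hypothetical stand-in, since the real MY_df and hclust.unsupervised are not shown in the post.

```r
# Hypothetical stand-in data; replace with the real MY_df and hclust.unsupervised.
set.seed(1)
MY_df <- data.frame(x = rnorm(30), y = rnorm(30))
hclust.unsupervised <- hclust(dist(MY_df))
cutree_hclust <- cutree(hclust.unsupervised, k = 8)

# Logical row indexing: keep every column for the rows assigned to cluster 7.
# drop = FALSE guarantees the result stays a data frame.
cluster7_df <- MY_df[cutree_hclust == 7, , drop = FALSE]

# split() builds all clusters at once: a named list with one data frame per cluster.
cluster_dfs <- split(MY_df, cutree_hclust)

# Individual clusters are then available by name, e.g. cluster_dfs[["7"]].
```

Indexing with the logical vector cutree_hclust == 7 in the row position, and leaving the column position empty, selects whole rows rather than just their names. split() does the same for every cluster in one call, so no row numbers need to be typed by hand.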
Source: https://stackoverflow.com/questions/50190504/r-how-to-select-several-rows-to-make-a-new-dataframe