So apparently the `means_` attribute returns different results from the means I calculated for each cluster (or I have a wrong understanding of what it returns!).
Though I'm not completely sure what your code is doing, I'm fairly sure I know what the problem is here.
The parameters returned by `means_` are the means of the parametric (Gaussian) distributions that make up the model. When you calculate the means yourself, you take the plain average of all the data assigned to each component. During fitting, however, every point contributes to every component's mean, weighted by the probability that it belongs to that component, so the two calculations will almost always give different (though similar) results. To get a better understanding of why these differ, I would suggest reading a bit more about the expectation-maximization (EM) algorithm that scikit-learn uses to fit GMMs.
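Here is a minimal sketch of the difference. It is not your code: the data is synthetic, and I'm assuming the current `sklearn.mixture.GaussianMixture` class rather than whichever GMM class you are using. The `tol` and `max_iter` values are only there so that EM runs close to convergence.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic 1-D data: two overlapping Gaussian blobs (illustration only).
rng = np.random.default_rng(0)
X = np.concatenate([
    rng.normal(0.0, 1.0, 1000),
    rng.normal(2.5, 1.0, 1000),
]).reshape(-1, 1)

# Tight tolerance so the fitted parameters are close to an EM fixed point.
gmm = GaussianMixture(n_components=2, tol=1e-6, max_iter=1000,
                      random_state=0).fit(X)

# "Hard" means: average of the points assigned to each component.
labels = gmm.predict(X)
hard_means = np.array([X[labels == k].mean(axis=0) for k in range(2)])

# "Soft" means: every point contributes to every component,
# weighted by its posterior probability (responsibility).
resp = gmm.predict_proba(X)                      # shape (n_samples, 2)
soft_means = (resp.T @ X) / resp.sum(axis=0)[:, None]

hard_gap = np.abs(hard_means - gmm.means_).max()
soft_gap = np.abs(soft_means - gmm.means_).max()

print("means_    :", gmm.means_.ravel())
print("hard means:", hard_means.ravel())
print("soft means:", soft_means.ravel())
print("max gap (hard, soft):", hard_gap, soft_gap)
```

The responsibility-weighted means should land much closer to `means_` than the per-cluster averages do. The more the components overlap, the larger the gap for the hard-assigned means tends to be, because points near the boundary are counted fully in one cluster instead of being shared between both.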