I have a large (100K by 30K) and (very) sparse dataset in svmlight format which I load as follows:
import numpy as np
from scipy.cluster.vq import kmeans2
f
In scikit-learn, the sklearn.metrics.euclidean_distances function works on both sparse matrices and dense NumPy arrays. See the reference documentation.
However, non-Euclidean distances are not yet implemented for sparse matrices.
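
As a minimal sketch of the sparse Euclidean case, assuming the data is already available as a SciPy CSR matrix: the matrix below is random stand-in data with made-up dimensions, not the dataset from the question.

```python
import numpy as np
from scipy import sparse
from sklearn.metrics import euclidean_distances

# Hypothetical stand-in for the real data: a small, very sparse CSR matrix.
X = sparse.random(1000, 300, density=0.01, format="csr", random_state=0)

# Distances from the first 5 rows to every row.
# The inputs stay sparse; the result is a dense NumPy array.
D = euclidean_distances(X[:5], X)

print(D.shape)
```

Because the result is dense, a full 100K by 100K distance matrix would need roughly 80 GB in float64, so it is probably more practical to compute distances for a block of rows at a time, as above.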