Question
I have a dictionary of matrices. The dictionary is called dict. dict[location] returns a square n x n correlation dataframe for that location. locations is the list of all locations (the keys of the dictionary).
I want to essentially make a list of every i,j component in a dataframe across keys and take the median of all of those. You can think of this as stacking the matrices on top of each other and taking the median value for each i,j element. I hope I explained this clearly enough.
I was wondering if there is a clever way to do this. I would like to avoid making the list of n(n+1)/2 unique i,jth pairs and then taking the medians, then putting them back in their proper place in the final median matrix (dataframe).
Answer 1:
This appears to work well and efficiently.
numpy.median(list(dict.values()), axis=0)
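A minimal sketch of that one-liner on made-up data, with the result wrapped back into a dataframe. The names matrices, labels and median_df and the three example locations are hypothetical; matrices stands in for the asker's dict. It assumes every dataframe has the same row and column order.

```python
import numpy as np
import pandas as pd

# Hypothetical data: three locations, each with a 3 x 3 correlation dataframe.
labels = ["a", "b", "c"]
rng = np.random.default_rng(0)
matrices = {
    loc: pd.DataFrame(rng.normal(size=(50, 3)), columns=labels).corr()
    for loc in ["north", "south", "east"]
}

# Stack into one array of shape (n_locations, n, n), then take the
# median along axis 0, i.e. across locations for each (i, j) cell.
stacked = np.stack([df.to_numpy() for df in matrices.values()])
median_values = np.median(stacked, axis=0)

# np.median returns a plain ndarray, so reattach the labels.
median_df = pd.DataFrame(median_values, index=labels, columns=labels)
```

If the dataframes might be ordered differently, reindexing each one to a common label list before stacking would be a sensible extra step.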
In general, the median requires all of the data in memory, unless you only want an estimate. Therefore, for a large amount of data, you'll have to work in chunks:
numpy.median([m.iloc[0:10, 0:10].to_numpy() for m in dict.values()], axis=0)
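The chunked idea could be extended to cover the whole matrix roughly like this. It is only a sketch: blockwise_median and its block parameter are hypothetical names, not part of numpy or pandas, and it assumes all dataframes are square, the same size, and identically ordered. It limits the size of the temporary stacked array; the input dataframes themselves still have to be available.

```python
import numpy as np
import pandas as pd


def blockwise_median(frames, block=10):
    """Element-wise median across dataframes, one block x block tile at a time."""
    frames = list(frames)
    n = frames[0].shape[0]
    out = np.empty((n, n))
    for r in range(0, n, block):
        for c in range(0, n, block):
            # Same tile from every dataframe; edge tiles may be smaller.
            tile = [f.iloc[r:r + block, c:c + block].to_numpy() for f in frames]
            # Median across dataframes for each cell of this tile.
            out[r:r + block, c:c + block] = np.median(tile, axis=0)
    # Reuse the labels of the first dataframe for the result.
    return pd.DataFrame(out, index=frames[0].index, columns=frames[0].columns)
```

Calling blockwise_median(dict.values(), block=10) should give the same values as the one-shot median, just computed tile by tile.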
Source: https://stackoverflow.com/questions/26469470/element-wise-median-of-a-lot-of-matrices-python-pandas