Plotting a 2D matrix in Python: code and most useful visualization


醉梦人生 2021-01-05 08:15

I have a very large matrix (10x55678) in "numpy" matrix format. The rows of this matrix correspond to some "topics" and the columns correspond to words (unique words from

2 answers

醉梦人生 (original poster)

2021-01-05 08:51
You could certainly use matplotlib's imshow or pcolor method to display the data, but as the comments have mentioned, it might be hard to interpret without zooming in on subsets of the data.
```
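import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-in data: 5000 words (rows) x 10 topics (columns)
a = np.random.normal(0.0, 0.5, size=(5000, 10))**2
a = a / np.sum(a, axis=1)[:, None]  # Normalize so each row sums to 1

plt.pcolor(a)
plt.show()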
```
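Since the full matrix is hard to read at once, here is a minimal sketch of the "zoom in on a subset" idea using imshow instead of pcolor. The helper name `plot_word_subset`, the seeded synthetic data, and the chosen row range are assumptions made for illustration; they are not part of the original answer.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-in data (assumption): 5000 words x 10 topics, rows sum to 1
rng = np.random.default_rng(0)
a = rng.normal(0.0, 0.5, size=(5000, 10)) ** 2
a = a / a.sum(axis=1)[:, None]

def plot_word_subset(a, start, stop):
    """Hypothetical helper: show rows start..stop-1 of a words x topics matrix."""
    fig, ax = plt.subplots()
    # aspect="auto" stretches the tall, narrow matrix to fill the axes;
    # extent keeps the y tick labels equal to the original row indices
    im = ax.imshow(
        a[start:stop],
        aspect="auto",
        interpolation="nearest",
        extent=(-0.5, a.shape[1] - 0.5, stop - 0.5, start - 0.5),
    )
    ax.set_xlabel("topic")
    ax.set_ylabel("word index")
    fig.colorbar(im, ax=ax, label="probability")
    return fig, ax, im

fig, ax, im = plot_word_subset(a, 100, 200)
```

Calling `plt.show()` afterwards displays the figure. Looping over several row ranges gives a rough way to page through the matrix.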
You could then sort the words by the cluster they most probably belong to:
```
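import numpy as np
import matplotlib.pyplot as plt

maxvi = np.argsort(a, axis=1)  # per word: topic indices by increasing probability
ii = np.argsort(maxvi[:, -1])  # order words by their most probable topic

plt.pcolor(a[ii, :])  # `a` is the normalized matrix built earlier
plt.show()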
```
Here the word index on the y-axis no longer matches the original ordering, since the rows have been sorted.
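A possible refinement, sketched here as an assumption rather than as part of the original answer, is to order words first by their dominant topic and then by how strongly they belong to it. The function name `order_by_topic_then_strength` is hypothetical. The returned index array also records which original row ends up at each position, so it can be used to recover word labels after sorting.

```python
import numpy as np

def order_by_topic_then_strength(a):
    """Return row indices grouped by dominant topic, strongest members first."""
    dominant = np.argmax(a, axis=1)   # most probable topic for each word
    strength = a.max(axis=1)          # probability of that topic
    # np.lexsort uses the LAST key as the primary key:
    # primary = dominant topic, secondary = descending strength
    return np.lexsort((-strength, dominant))

# Tiny hand-made example (assumption): 4 words x 2 topics
small = np.array([
    [0.1, 0.9],
    [0.8, 0.2],
    [0.3, 0.7],
    [0.6, 0.4],
])
order = order_by_topic_then_strength(small)
sorted_rows = small[order]
```

Passing `a[order]` to pcolor or imshow should then show one band per topic, with the most strongly assigned words at the start of each band.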

Another possibility is to use the networkx package to plot word clusters for each category. The words with the highest probability would be represented by nodes that are either larger or closer to the center of the graph, and words that have no membership in the category would be ignored. This might be easier since you have a large number of words and a small number of categories.
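The networkx idea could be sketched roughly as follows. This is one possible reading of the suggestion, not the answerer's own code: the star-shaped layout around a hub node, the `threshold` cutoff, the node-size scaling, and the helper names `topic_graph` and `draw_topic_graph` are all assumptions. It expects a words x topics matrix, as in the snippets above.

```python
import numpy as np
import networkx as nx
import matplotlib.pyplot as plt

def topic_graph(a, topic, words, threshold=0.2):
    """Build a star graph: one hub for the topic, one node per word above threshold."""
    G = nx.Graph()
    hub = f"topic {topic}"
    G.add_node(hub, prob=1.0)
    # Words below the (hypothetical) threshold are treated as non-members and skipped
    for i in np.flatnonzero(a[:, topic] >= threshold):
        p = float(a[i, topic])
        G.add_node(words[i], prob=p)
        G.add_edge(hub, words[i], weight=p)
    return G

def draw_topic_graph(G, ax=None):
    """Draw the graph with node size proportional to membership probability."""
    if ax is None:
        _, ax = plt.subplots()
    # In spring_layout a larger edge weight means a stronger attraction,
    # so high-probability words tend to sit closer to the hub
    pos = nx.spring_layout(G, weight="weight", seed=0)
    sizes = [3000 * G.nodes[n]["prob"] for n in G.nodes]
    nx.draw_networkx(G, pos, node_size=sizes, font_size=8, ax=ax)
    ax.set_axis_off()
    return ax

# Tiny hand-made example (assumption): 3 words x 2 topics, placeholder word names
demo = np.array([[0.9, 0.1], [0.05, 0.95], [0.5, 0.5]])
demo_words = ["w0", "w1", "w2"]
G = topic_graph(demo, topic=0, words=demo_words, threshold=0.2)
```

Calling `draw_topic_graph(G)` followed by `plt.show()` renders the cluster. Repeating this per topic gives one small graph per category instead of a single dense image.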

Hopefully one of these suggestions is useful.