Simplest way to find the element that occurs the most in each column

走了就别回头了 2021-01-24 17:51

Suppose I have

data =
[[a, a, c],
 [b, c, c],
 [c, b, b],
 [b, a, c]]

I want to get a list containing the element that occurs the most in each column.

4 answers
  •  一生所求
    2021-01-24 18:35

    In statistics, what you want is called the mode. SciPy (http://www.scipy.org/) provides a mode function in scipy.stats; passing axis=0 computes the mode of each column and returns both the modal values and their counts.

    In [32]: import numpy as np
    
    In [33]: from scipy.stats import mode
    
    In [34]: data = np.random.randint(1,6, size=(6,8))
    
    In [35]: data
    Out[35]: 
    array([[2, 1, 5, 5, 3, 3, 1, 4],
           [5, 3, 2, 2, 5, 2, 5, 3],
           [2, 2, 5, 3, 3, 2, 1, 1],
           [2, 4, 1, 5, 4, 4, 4, 5],
           [4, 4, 5, 5, 2, 4, 4, 4],
           [2, 4, 1, 1, 3, 3, 1, 3]])
    
    In [36]: val, count = mode(data, axis=0)
    
    In [37]: val
    Out[37]: array([[ 2.,  4.,  5.,  5.,  3.,  2.,  1.,  3.]])
    
    In [38]: count
    Out[38]: array([[ 4.,  3.,  3.,  3.,  3.,  2.,  3.,  2.]])
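
If the elements are labels rather than numbers, as in the question, a standard-library approach may be simpler. The following is only a minimal sketch: it assumes the question's a, b, c stand for strings, and column_modes is a hypothetical helper name, not something from SciPy. When several elements tie for most common, it returns whichever of them appears first in the column.

```python
from collections import Counter


def column_modes(rows):
    """Return the most common element of each column of a list of rows."""
    # zip(*rows) transposes the rows, so each tuple it yields is one column.
    # Note that zip stops at the shortest row if the rows differ in length.
    return [Counter(column).most_common(1)[0][0] for column in zip(*rows)]


# Assumption: the question's a, b, c are represented here as strings.
data = [
    ["a", "a", "c"],
    ["b", "c", "c"],
    ["c", "b", "b"],
    ["b", "a", "c"],
]

print(column_modes(data))  # ['b', 'a', 'c']
```

This avoids a NumPy/SciPy dependency and works for any hashable elements, at the cost of being slower than a vectorized call on large numeric arrays.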
    
