Alternative to Scipy mode function in Numpy?

遥遥无期 2021-01-18 01:01

Is there a way to implement the scipy.stats.mode function using only NumPy (without importing other modules), so as to get the most frequent values of an ndarray along a given axis?



        
2 Answers
  • 2021-01-18 01:24

    If you know there are not many different values (relative to the size of the input "itemArray"), something like this could be efficient:

    import numpy as np

    uniqueValues = np.unique(itemArray).tolist()
    # Count how often each unique value occurs in the array
    uniqueCounts = [np.count_nonzero(itemArray == uv)
                    for uv in uniqueValues]

    # The mode is the unique value with the highest count
    modeIdx = uniqueCounts.index(max(uniqueCounts))
    mode = uniqueValues[modeIdx]

    # All counts as a map
    valueToCountMap = dict(zip(uniqueValues, uniqueCounts))
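
A more compact variant of the same idea, as a sketch: np.unique can return the counts directly via return_counts=True, which avoids the Python-level loop over unique values. The helper name flat_mode and the sample array are hypothetical, and this only handles the flattened array, not a specific axis.

```python
import numpy as np

def flat_mode(values):
    """Return (mode, count) over all elements of `values` (hypothetical helper)."""
    # np.unique returns the sorted unique values and, with return_counts=True,
    # how many times each of them occurs.
    unique_values, counts = np.unique(np.ravel(values), return_counts=True)
    # argmax returns the first maximum, so ties resolve to the smallest value.
    idx = np.argmax(counts)
    return unique_values[idx], counts[idx]

item_array = np.array([3, 1, 2, 3, 2, 3])  # illustrative data
mode_value, mode_count = flat_mode(item_array)
print(mode_value, mode_count)  # 3 3
```

On ties, this sketch returns the smallest of the most frequent values, because np.unique sorts its output.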
    
  • 2021-01-18 01:37

    When this answer was written, the scipy.stats.mode function was defined with the following code, which relies only on NumPy:

    import numpy as np

    def mode(a, axis=0):
        a = np.asarray(a)
        scores = np.unique(np.ravel(a))       # get ALL unique values
        testshape = list(a.shape)
        testshape[axis] = 1                   # reduced axis keeps length 1
        oldmostfreq = np.zeros(testshape)     # most frequent value so far
        oldcounts = np.zeros(testshape)       # count of that value

        for score in scores:
            template = (a == score)
            counts = np.expand_dims(np.sum(template, axis), axis)
            # take the new score only where it is strictly more frequent,
            # so ties keep the smaller value (scores are sorted ascending)
            oldmostfreq = np.where(counts > oldcounts, score, oldmostfreq)
            oldcounts = np.maximum(counts, oldcounts)

        return oldmostfreq, oldcounts
    

    Source: https://github.com/scipy/scipy/blob/master/scipy/stats/stats.py#L609
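
For comparison, here is a minimal sketch of a different technique for the per-axis case: it applies a 1-D mode (built on np.unique with return_counts=True) to each slice with np.apply_along_axis. This is not the SciPy implementation; the names mode_1d and mode_along_axis and the sample data are hypothetical, and unlike the function above it returns only the modal values, with the reduced axis dropped.

```python
import numpy as np

def mode_1d(values):
    """Most frequent value of a 1-D array (hypothetical helper)."""
    unique_values, counts = np.unique(values, return_counts=True)
    # First maximum wins, so ties resolve to the smallest value.
    return unique_values[np.argmax(counts)]

def mode_along_axis(a, axis=0):
    """Most frequent value of each 1-D slice of `a` along `axis`."""
    a = np.asarray(a)
    # apply_along_axis calls mode_1d on every 1-D slice along `axis`;
    # since mode_1d returns a scalar, that axis is removed from the result.
    return np.apply_along_axis(mode_1d, axis, a)

data = np.array([[1, 2, 2],
                 [1, 3, 2],
                 [4, 3, 5]])  # illustrative data

col_modes = mode_along_axis(data, axis=0)
row_modes = mode_along_axis(data, axis=1)
print(col_modes)  # [1 3 2]
print(row_modes)  # [2 1 3]
```

np.apply_along_axis loops over slices in Python, so this favors readability over speed; the loop over unique values shown above may be faster when there are few distinct values.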
