Calculate weighted pairwise distance matrix in Python

孤城傲影 2021-02-14 11:37

I am trying to find the fastest way to perform the following pairwise distance calculation in Python. I want to use the distances to rank a list_of_objects by their

2 Answers
  •  后悔当初
    2021-02-14 12:41

    The normalization step, where you divide the pairwise distances by their maximum value, seems non-standard, so you may not find a ready-made function that does exactly what you are after. It is easy enough to do yourself, though. A starting point is to turn your list_of_objects into an array:

    >>> import numpy as np
    >>> obj_arr = np.array(list_of_objects)  # one row per object, one column per attribute
    >>> obj_arr.shape
    (3, 4)
    

    You can then get the pairwise distances using broadcasting. This is a little inefficient, because it does not take advantage of the symmetry of your metric and so calculates every distance twice:

    >>> dists = np.abs(obj_arr - obj_arr[:, None])  # dists[i, j, k] = |obj_arr[j, k] - obj_arr[i, k]|
    >>> dists.shape
    (3, 3, 4)
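
    If the duplicated work matters, one possible way to use the symmetry is to compute each unordered pair only once with np.triu_indices and then mirror the result. This is only a sketch: the sample values in obj_arr are made up for illustration, and whether it beats plain broadcasting depends on the array size, so it is worth timing on your own data.

```python
import numpy as np

# Hypothetical sample data: 3 objects with 4 numeric attributes each.
obj_arr = np.array([[1.0, 5.0, 2.0, 8.0],
                    [4.0, 1.0, 7.0, 3.0],
                    [6.0, 9.0, 0.0, 2.0]])

n, n_attrs = obj_arr.shape
# Index pairs (i, j) with i < j: every unordered pair appears exactly once.
i, j = np.triu_indices(n, k=1)

# Per-attribute absolute differences for the unique pairs only.
pair_dists = np.abs(obj_arr[i] - obj_arr[j])  # shape (n * (n - 1) // 2, n_attrs)

# Mirror into the full symmetric (n, n, n_attrs) array; the diagonal stays 0.
dists = np.zeros((n, n, n_attrs))
dists[i, j] = pair_dists
dists[j, i] = pair_dists
```

    The condensed pair_dists array is often enough on its own (for example, to find the per-attribute maximum), in which case the final mirroring step can be skipped.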
    

    Normalizing is then a single in-place division by the maximum pairwise distance of each attribute:

    >>> dists /= dists.max(axis=(0, 1))
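
    Two edge cases are worth guarding against here. With integer input the in-place division may fail or truncate, and an attribute that is identical for every object has a maximum distance of 0, which leads to a division by zero. The sketch below shows one way to handle both. The sample data and the policy of leaving zero-spread attributes at 0 are assumptions, not something the question specifies.

```python
import numpy as np

# Hypothetical integer data; the last attribute is identical for every object.
obj_arr = np.array([[1, 5, 2, 7],
                    [4, 1, 7, 7],
                    [6, 9, 0, 7]])

# Cast to float so the in-place division below is well defined.
dists = np.abs(obj_arr - obj_arr[:, None]).astype(float)

max_per_attr = dists.max(axis=(0, 1))  # one maximum per attribute, shape (4,)
# Assumed policy: divide zero-spread attributes by 1 so they stay 0, not NaN.
safe_max = np.where(max_per_attr > 0, max_per_attr, 1.0)
dists /= safe_max
```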
    

    Your final weighting can be done in a variety of ways, so you may want to benchmark which is fastest:

    >>> weights = [1, 1, 1, 1]  # one weight per attribute
    >>> dists.dot(weights)
    array([[ 0.        ,  1.93813131,  2.21542674],
           [ 1.93813131,  0.        ,  3.84644195],
           [ 2.21542674,  3.84644195,  0.        ]])
    >>> np.einsum('ijk,k->ij', dists, weights)
    array([[ 0.        ,  1.93813131,  2.21542674],
           [ 1.93813131,  0.        ,  3.84644195],
           [ 2.21542674,  3.84644195,  0.        ]])
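
    A minimal way to run that benchmark is with the standard timeit module. The array size, the random stand-in for the normalized distances, and the weights below are all made up for illustration, and the @ operator is included as a third candidate that the answer above does not mention. Timings vary by machine and NumPy build, so none are quoted here.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for the normalized distances: 200 objects, 4 attributes.
dists = rng.random((200, 200, 4))
weights = np.array([1.0, 2.0, 0.5, 1.0])  # made-up weights, one per attribute

candidates = {
    "dot": lambda: dists.dot(weights),
    "einsum": lambda: np.einsum('ijk,k->ij', dists, weights),
    "matmul": lambda: dists @ weights,
}

# All variants must agree before their timings mean anything.
reference = candidates["dot"]()
for name, fn in candidates.items():
    assert np.allclose(fn(), reference), name

# Report the average time per call for each variant.
repeats = 100
for name, fn in candidates.items():
    seconds = timeit.timeit(fn, number=repeats)
    print(f"{name:>7}: {seconds / repeats * 1e6:.1f} us per call")
```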
    
