Data type problem using scipy.spatial

不思量自难忘° 2021-02-06 10:31

I want to use scipy.spatial's KDTree to find nearest neighbor pairs in a two-dimensional array (essentially a list of lists where the dimension of the nested list is 2). I gene

1 answer
  •  春和景丽
    2021-02-06 11:22

    I have used scipy.spatial before, and it appears to be a nice improvement (especially with respect to the interface) over scikits.ann.

    In this case I think you have misread what your tree.query(...) call returns. From the scipy.spatial.KDTree.query docs:

    Returns
    -------
    
    d : array of floats
        The distances to the nearest neighbors.
        If x has shape tuple+(self.m,), then d has shape tuple if
        k is one, or tuple+(k,) if k is larger than one.  Missing
        neighbors are indicated with infinite distances.  If k is None,
        then d is an object array of shape tuple, containing lists
        of distances. In either case the hits are sorted by distance
        (nearest first).
    i : array of integers
        The locations of the neighbors in self.data. i is the same
        shape as d.
    

    So in this case when you query for the nearest to [1,1] you are getting:

    distance to nearest: 0.0
    index of nearest in original array: 0
    

    This means that [1,1] is the first row (index 0) of your original data array, which is expected given that your data is y = x on the range [1,50].
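
    As a minimal sketch of the point above: the data below is an assumed reconstruction of the question's array (points on y = x for x from 1 to 50), and the variable names are illustrative only. It shows that query hands back a distance and an index, and that the neighbour's coordinates come from indexing the original array.

```python
import numpy as np
from scipy.spatial import KDTree

# Assumed stand-in for the asker's data: the points (x, x) for x = 1..50
data = np.array([[x, x] for x in range(1, 51)], dtype=float)
tree = KDTree(data)

# query returns (distance, index), not the coordinates of the neighbour
dist, idx = tree.query([1, 1])

# Use the index to look up the neighbour in the array the tree was built from
nearest_point = data[idx]

print(dist, idx, nearest_point)
```

    Because [1,1] is itself in the data, the nearest neighbour found is the query point, at distance zero.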

    The scipy.spatial.KDTree.query function has several other options. For example, if you want the nearest neighbour that isn't the query point itself, try:

    tree.query([1,1], k=2)
    

    This returns the two nearest neighbours. You can then apply further logic: when the first distance returned is zero (i.e. the queried point is one of the data items used to build the tree), take the second nearest neighbour rather than the first.
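
    The logic just described could be sketched roughly as follows. The helper name nearest_other is hypothetical (it is not part of scipy), and the data is again an assumed reconstruction of the question's y = x points.

```python
import numpy as np
from scipy.spatial import KDTree

# Assumed stand-in for the asker's data: the points (x, x) for x = 1..50
data = np.array([[x, x] for x in range(1, 51)], dtype=float)
tree = KDTree(data)


def nearest_other(tree, point):
    """Hypothetical helper: nearest neighbour of `point`, skipping an exact self-match.

    Returns (distance, index into the array the tree was built from).
    """
    # With k=2 the results are arrays of length 2, sorted nearest first
    dists, idxs = tree.query(point, k=2)
    # A zero first distance means the query point is in the tree, so take
    # the second hit. Note: if the data contains duplicate points, the
    # second hit can also be at distance zero.
    pick = 1 if dists[0] == 0 else 0
    return dists[pick], idxs[pick]


# [1, 1] is in the data, so the helper skips it and returns [2, 2]
dist, idx = nearest_other(tree, [1, 1])

# [0.9, 0.9] is not in the data, so the first hit is used as-is
other_dist, other_idx = nearest_other(tree, [0.9, 0.9])

print(dist, idx, other_dist, other_idx)
```

    To find a neighbour for every point at once, the same idea should carry over to tree.query(data, k=2), where the second column of the returned index array holds each point's nearest other point, assuming the data has no duplicates.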
