Finding the correspondence of data from one data set in the other

Question

I have a catalogue of data that I want to use in my MCMC code. Speed is crucial, because the lookup must not slow down the Markov chain Monte Carlo sampling. The first and second columns of the catalogue hold two parameters, ra and dec, which are sky coordinates:

import numpy as np

data = np.loadtxt('Final.Cluster.Shear.NegligibleShotNoise.Redshift.cat')
ra = data[:, 0]   # first column: right ascension
dec = data[:, 1]  # second column: declination

The seventh and eighth columns hold the X and Y positions, i.e. the coordinates of points on a grid:

Xpos=data[:,6]
Ypos=data[:,7]

My function has to be called about a million times. Each call receives one pair of positions, Xcenter and Ycenter (for example Xcenter=200.6, Ycenter=310.9), and should return the corresponding ra and dec values. However, the inputs may not match any X and Y entry in the catalogue exactly. In that case I want to interpolate, obtaining the ra and dec coordinates from the real entries in the catalogue.

Answer 1:

This is a perfect case where the scipy.spatial.cKDTree() class can be used to query all the points at once:

from scipy.spatial import cKDTree

k = cKDTree(data[:, 6:8]) # creating the KDtree using the Xpos and Ypos

xyCenters = np.array([[200.6, 310.9],
                      [300, 300],
                      [400, 400]])
print(k.query(xyCenters))
# (array([ 1.59740195,  1.56033234,  0.56352196]),
#  array([ 2662, 22789,  5932]))

The first array holds the distance from each query point in xyCenters to its nearest catalogue point, and [2662, 22789, 5932] are the row indices of those nearest points. You can use these indices to get your ra and dec values very efficiently using np.take():

dists, indices = k.query(xyCenters)
myra = np.take(ra, indices)
mydec = np.take(dec, indices)
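
The lookup above returns the ra and dec of the single nearest catalogue point, which is not yet the interpolation the question asks for. One possible way to get there, offered as a minimal sketch and not as part of the original answer, is to query the k nearest neighbours and take an inverse-distance-weighted average of their ra and dec values. The function name interpolate_ra_dec, the parameters k and eps, and the small synthetic grid are all hypothetical and only serve the illustration:

```python
import numpy as np
from scipy.spatial import cKDTree


def interpolate_ra_dec(tree, ra, dec, centers, k=4, eps=1e-12):
    """Inverse-distance-weighted ra/dec at each (x, y) row of `centers`.

    Hypothetical helper: `tree` is a cKDTree built on the catalogue X/Y
    columns, `ra` and `dec` are the matching catalogue columns.
    """
    centers = np.atleast_2d(np.asarray(centers, dtype=float))
    dists, idx = tree.query(centers, k=k)
    if k == 1:
        # query() drops the neighbour axis for k=1; restore it.
        dists, idx = dists[:, None], idx[:, None]
    # Clamp distances so an exact hit gets a huge (but finite) weight.
    weights = 1.0 / np.maximum(dists, eps)
    weights /= weights.sum(axis=1, keepdims=True)
    ra_interp = (weights * ra[idx]).sum(axis=1)
    dec_interp = (weights * dec[idx]).sum(axis=1)
    return ra_interp, dec_interp


# Synthetic stand-in for the catalogue: a 10 x 10 grid on which ra and dec
# vary linearly with X and Y (an assumption made purely for this demo).
gx, gy = np.meshgrid(np.arange(10.0), np.arange(10.0))
xy = np.column_stack([gx.ravel(), gy.ravel()])
demo_ra = 10.0 + 0.01 * xy[:, 0]
demo_dec = -5.0 + 0.02 * xy[:, 1]

demo_tree = cKDTree(xy)  # build once, outside the MCMC loop
demo_centers = np.array([[2.5, 3.5],   # centre of a grid cell
                         [4.0, 7.0]])  # exactly on a catalogue point
out_ra, out_dec = interpolate_ra_dec(demo_tree, demo_ra, demo_dec, demo_centers)
```

As in the answer, the tree should be built once and reused for every call, since only the query runs inside the sampling loop. Note that a plain weighted average of ra is only meaningful if the catalogue does not straddle the 0/360 degree wrap in right ascension; this sketch assumes it does not.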

Source: https://stackoverflow.com/questions/25550813/finding-the-correspondence-of-data-from-one-data-set-in-the-other

Tags

python

arrays

numpy

scipy

kdtree