Optimizing Python distance calculation while accounting for periodic boundary conditions

前端 未结 4 2092
攒了一身酷
攒了一身酷 2020-12-30 07:24

I have written a Python script to calculate the distance between two points in 3D space while accounting for periodic boundary conditions. The problem is that I need to do t

相关标签:
4条回答
  • 2020-12-30 07:39
    import numpy as np
    
    bounds = np.array([10, 10, 10])
    a = np.array([[0, 3, 9], [1, 1, 1]])
    b = np.array([[2, 9, 1], [5, 6, 7]])
    
    min_dists = np.min(np.dstack(((a - b) % bounds, (b - a) % bounds)), axis = 2)
    dists = np.sqrt(np.sum(min_dists ** 2, axis = 1))
    

    Here a and b are lists of vectors you wish to calculate the distance between and bounds are the boundaries of the space (so here all three dimensions go from 0 to 10 and then wrap). It calculates the distances between a[0] and b[0], a[1] and b[1], and so on.

    I'm sure numpy experts could do better, but this will probably be an order of magnitude faster than what you're doing, since most of the work is now done in C.

    0 讨论(0)
  • 2020-12-30 07:41

    You should write your distance() function in a way that you can vectorise the loop over the 5711 points. The following implementation accepts an array of points as either the x0 or x1 parameter:

    def distance(x0, x1, dimensions):
        delta = numpy.abs(x0 - x1)
        delta = numpy.where(delta > 0.5 * dimensions, delta - dimensions, delta)
        return numpy.sqrt((delta ** 2).sum(axis=-1))
    

    Example:

    >>> dimensions = numpy.array([3.0, 4.0, 5.0])
    >>> points = numpy.array([[2.7, 1.5, 4.3], [1.2, 0.3, 4.2]])
    >>> distance(points, [1.5, 2.0, 2.5], dimensions)
    array([ 2.22036033,  2.42280829])
    

    The result is the array of distances between the points passed as second parameter to distance() and each point in points.

    0 讨论(0)
  • 2020-12-30 07:55

    Have a look at Ian Ozsvalds high performance python tutorial. It contains lots of suggestions on where you can go next.

    Including:

    • vectorization
    • cython
    • pypy
    • numexpr
    • pycuda
    • multiprocesing
    • parallel python
    0 讨论(0)
  • 2020-12-30 07:58

    I have found that meshgrid is very useful for generating distances. For example:

    import numpy as np
    row_diff, col_diff = np.meshgrid(range(7), range(8))
    radius_squared = (row_diff - x_coord)**2 + (col_diff - y_coord)**2
    

    I now have an array (radius_squared) where every entry specifies the square of the distance from the array position [x_coord, y_coord].

    To circularize the array, I can do the following:

    row_diff, col_diff = np.meshgrid(range(7), range(8))
    row_diff = np.abs(row_diff - x_coord)
    row_circ_idx = np.where(row_diff > row_diff.shape[1] / 2)
    row_diff[row_circ_idx] = (row_diff[row_circ_idx] - 
                             2 * (row_circ_idx + x_coord) + 
                             row_diff.shape[1])
    row_diff = np.abs(row_diff)
    col_diff = np.abs(col_diff - y_coord)
    col_circ_idx = np.where(col_diff > col_diff.shape[0] / 2)
    col_diff[row_circ_idx] = (row_diff[col_circ_idx] - 
                             2 * (col_circ_idx + y_coord) + 
                             col_diff.shape[0])
    col_diff = np.abs(row_diff)
    circular_radius_squared = (row_diff - x_coord)**2 + (col_diff - y_coord)**2
    

    I now have all the array distances circularized with vector math.

    0 讨论(0)
提交回复
热议问题