Question
Purpose
I have turned a double for loop into a single for loop using vectorization. I would now like to get rid of the last loop.
I want to slice an Nx3 array of coordinates and calculate distances between the sliced portion and the remaining portion without using a for loop.
Two cases:
(1) the slice is always 3x3.
(2) the slice is variable, i.e., Mx3, where M is always significantly smaller than N.
Vectorizing the interaction of one row of the slice with the remainder is straightforward. However, I am stuck using a for loop over the rows of the slice (3 iterations when the slice has 3 rows) to calculate all distances.
Context:
The Nx3 array is atom coordinates, the slice is all atoms in a specific molecule. I want to calculate the energy of a given molecule interacting with the rest of the system. The first step is calculating the distances between each atom in the molecule, with all other atoms. The second part is to use those distances in a function to calculate energy, and that is outside the scope of this question.
Here is what I have as a minimal working example. I have vectorized the inner loop, but need to (would really like to...) vectorize the outer loop. That loop won't always be of size 3, and Python is slow at for loops.
Minimal Working Example
import numpy as np

box = 10  # simulation box is size 10 for this example
r = np.random.rand(1000, 3) * box  # avoids huge numbers later by scaling coords
start = 0  # fixed starting index for example (first atom)
end = 2  # fixed end index for example (exclusive: np.arange(0, 2) selects atoms 0 and 1)
rj = np.delete(r, np.arange(start, end), 0)
ri = r[np.arange(start, end), :]
atoms_in_molecule, coords = np.shape(ri)
energy = 0
for a in range(atoms_in_molecule):
    rij = ri[a, :] - rj  # I want to get rid of this 'a' index dependence
    rij = rij - np.rint(rij / box) * box  # periodic boundary conditions - necessary
    rij_sq = np.sum(rij**2, axis=1)
    # perform energy calculation using rij_sq
    ener = 4 * ((1 / rij_sq)**12 - (1 / rij_sq)**6)  # dummy LJ, do not optimize
    energy += np.sum(ener)
print(energy)
This question is not about optimizing the vectorizing I already have. I have played around with pdist/cdist and others. All I want is to get rid of the pesky for loop over atoms. I will optimize the rest.
Answer 1:
Here is how you can do it:
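R = ri[:, None] - rj[None, :]  # (M, 1, 3) - (1, K, 3) broadcasts to (M, K, 3)
R = R - np.rint(R / box) * box  # periodic boundary conditions
R_sq = np.sum(np.square(R), axis=2)  # squared distances, shape (M, K)
energy = np.sum(4 * ((1 / R_sq)**12 - (1 / R_sq)**6))  # same dummy LJ as in the question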
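As a sanity check, the broadcast version can be compared against the loop from the question. The following is only a minimal sketch and not part of the original answer: the helper names `pair_sq_distances`, `molecule_energy` and `molecule_energy_loop`, the seeded random generator, and the chosen `start`/`end` values are assumptions made for illustration. It also covers the variable Mx3 case, since the slice length only changes the first dimension of the broadcast array.

```python
import numpy as np


def pair_sq_distances(ri, rj, box):
    """Squared minimum-image distances between all rows of ri and all rows of rj."""
    # ri[:, None, :] has shape (M, 1, 3) and rj[None, :, :] has shape (1, K, 3);
    # broadcasting the subtraction gives every displacement vector: (M, K, 3).
    rij = ri[:, None, :] - rj[None, :, :]
    rij = rij - np.rint(rij / box) * box  # periodic boundary conditions
    return np.sum(rij**2, axis=2)  # shape (M, K)


def molecule_energy(r, start, end, box):
    """Dummy LJ energy (formula from the question) of atoms start..end-1 with the rest."""
    ri = r[start:end]
    rj = np.delete(r, np.arange(start, end), axis=0)
    rij_sq = pair_sq_distances(ri, rj, box)
    return np.sum(4 * ((1 / rij_sq) ** 12 - (1 / rij_sq) ** 6))


def molecule_energy_loop(r, start, end, box):
    """Reference implementation: explicit loop over molecule atoms, as in the question."""
    ri = r[start:end]
    rj = np.delete(r, np.arange(start, end), axis=0)
    energy = 0.0
    for a in range(ri.shape[0]):
        rij = ri[a, :] - rj
        rij = rij - np.rint(rij / box) * box
        rij_sq = np.sum(rij**2, axis=1)
        energy += np.sum(4 * ((1 / rij_sq) ** 12 - (1 / rij_sq) ** 6))
    return energy


box = 10.0
rng = np.random.default_rng(0)  # seeded only so the comparison is reproducible
r = rng.random((1000, 3)) * box

energy_vec = molecule_energy(r, 0, 3, box)        # 3-atom molecule
energy_loop = molecule_energy_loop(r, 0, 3, box)  # same molecule, loop version
print(np.isclose(energy_vec, energy_loop, rtol=1e-6))
```

Note that the intermediate array has shape (M, N-M, 3), so memory use grows linearly with the slice length M. With M much smaller than N, as described in the question, this should stay modest.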
Source: https://stackoverflow.com/questions/60906051/broadcasting-vectorizing-inner-and-outer-for-loops-in-python-numpy