Question
I have a dataframe where the first row is the initial condition.
import numpy as np
import pandas as pd
df = pd.DataFrame({"Year": np.arange(4),
                   "Pop": [0.4] + [np.nan] * 3})
and a function f(x, r) = r*x*(1-x), where r = 2 is a constant and 0 <= x <= 1.
I want to produce the following dataframe by applying the function to column Pop row by row, iteratively, i.e. df.Pop[i] = f(df.Pop[i-1], r=2):
df = pd.DataFrame({"Year": np.arange(4),
                   "Pop": [0.4, 0.48, 0.4992, 0.49999872]})
Question: Is it possible to do this in a vectorized way?
I can achieve the desired result by using a loop to build lists for the x and y values, but this is not vectorized.
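For reference, the loop-based approach mentioned above might look like the following minimal sketch. The question does not show the actual loop, so the helper name f and the list name pops are assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def f(x, r=2):
    """One step of the map f(x, r) = r*x*(1-x)."""
    return r * x * (1 - x)


df = pd.DataFrame({"Year": np.arange(4),
                   "Pop": [0.4] + [np.nan] * 3})

# Build the values one at a time; each step needs the previous result.
pops = [df.at[0, "Pop"]]
for _ in range(len(df) - 1):
    pops.append(f(pops[-1]))

df["Pop"] = pops
print(df)
```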
I have also tried this, but all NaN places are filled with 0.48:
df.loc[1:, "Pop"] = R * df.Pop[:-1] * (1 - df.Pop[:-1])
Answer 1:
It is IMPOSSIBLE to do this in a vectorized way.
By definition, vectorization makes use of parallel processing to reduce execution time. But each desired value in your question depends on the one before it, so the values must be computed in sequential order, not in parallel. See this answer for a detailed explanation. Things like df.expanding(2).apply(f) and df.rolling(2).apply(f) won't work.
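The sequential dependence can be seen in a small NumPy sketch (the array name pop and the variables one_pass and filled are illustrative, not from the question). A single vectorized pass only sees the values that were known before the assignment, so it advances the recurrence by exactly one step.

```python
import numpy as np

R = 2
pop = np.array([0.4, np.nan, np.nan, np.nan])

# The right-hand side is evaluated from the array as it was *before*
# the assignment, so only index 1 has a known predecessor; indices 2
# and 3 are computed from NaN and stay NaN.
one_pass = pop.copy()
one_pass[1:] = R * pop[:-1] * (1 - pop[:-1])
print(one_pass)

# Repeating the pass len(pop) - 1 times does fill the array, but that
# is n full passes (O(n^2) work) rather than one sequential sweep.
filled = pop.copy()
for _ in range(len(filled) - 1):
    filled[1:] = R * filled[:-1] * (1 - filled[:-1])
print(filled)
```

Each pass is vectorized, yet the number of passes still grows with the length of the column, which is why a plain sequential loop is the better fit here.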
However, gaining more efficiency is possible. You can do the iteration using a generator. This is a very common construct for implementing iterative processes.
def gen(x_init, n, R=2):
    x = x_init
    for _ in range(n):
        x = R * x * (1 - x)
        yield x

# execute
df.loc[1:, "Pop"] = list(gen(df.at[0, "Pop"], len(df) - 1))
Result:
print(df)
        Pop
0  0.400000
1  0.480000
2  0.499200
3  0.499999
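As an aside, the same sequential iteration can also be written with itertools.accumulate from the standard library. This is an alternative to the generator, not part of the original answer, and it is still a Python-level loop under the hood. The sketch assumes Python 3.8 or later for the initial keyword.

```python
from itertools import accumulate

import numpy as np
import pandas as pd

R = 2
df = pd.DataFrame({"Year": np.arange(4),
                   "Pop": [0.4] + [np.nan] * 3})

# accumulate feeds each result back in as the first argument of the
# next call; the second argument (the step counter) is ignored.
# With initial=..., the output has len(steps) + 1 elements.
steps = range(len(df) - 1)
df["Pop"] = list(accumulate(steps,
                            lambda x, _: R * x * (1 - x),
                            initial=df.at[0, "Pop"]))
print(df)
```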
It is completely OK to stop here for small-sized data. If the function is going to be run many times or on long columns, however, you can consider optimizing the generator with numba.
- Run pip install numba or conda install numba in the console first
- import numba
- Add the decorator @numba.njit in front of the generator.
Change the number of np.nan values to 10^6 and check out the difference in execution time yourself. An improvement from 468ms to 217ms was achieved on my Core-i5 8250U 64-bit laptop.
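A rough way to reproduce the comparison is sketched below. The name gen_fast, the warm-up call, and the fallback to the undecorated generator when numba is not installed are assumptions added for illustration; timings will differ by machine and are not asserted here.

```python
import time

try:
    import numba
    njit = numba.njit
except ImportError:
    # Assumption: if numba is missing, fall back to the plain generator
    # so the script still runs (both timings are then plain Python).
    def njit(func):
        return func


def gen(x_init, n, R=2):
    x = x_init
    for _ in range(n):
        x = R * x * (1 - x)
        yield x


# Same effect as writing @numba.njit above a second copy of gen.
gen_fast = njit(gen)

n = 10**6
list(gen_fast(0.4, 10))  # warm-up so JIT compilation is not timed

for name, g in [("plain", gen), ("njit", gen_fast)]:
    start = time.perf_counter()
    values = list(g(0.4, n))
    elapsed = time.perf_counter() - start
    print(f"{name}: {elapsed * 1e3:.0f} ms")
```

Much of the remaining time goes into building the Python list from the generator, which may explain why the speed-up reported above is only about a factor of two.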
Source: https://stackoverflow.com/questions/64515499/vectorizing-an-iterative-function-on-pandas-dataframe