问题
Is there a way to create/generate a Pandas DataFrame from scratch, such that each record follows a specific mathematical function?
Background: In Financial Mathematics, very basic financial-derivatives (e.g. calls and puts) have closed-form pricing formulas (e.g. Black Scholes). These pricing formulas can be called stochastic functions (because they involve a random term)
I'm trying to create a Monte Carlo simulation of a stock price (and subseuqently an option payoff and price based on the stock price). I need, say, 1000 paths (rows) and 100 time-steps (columns). I want to "initiate" a dataframe that is 1000 by 100 and follows a stochastic equation.
# Psuedo-code
MonteCarloDF = DataFrame(rows=1000, columns=100, customFunc=TRUE,
appliedBy='by column',
FUNC={s0=321;
s_i=prev*exp(r-q*sqrt(sigma))*T +
(etc)*NormDist(rnd())*sqr(deltaT)}
)
Column 0 in every row would be 321, and each subsequent column would be figured out based on the FUNC
above.
This is an example of something similar done in VBA
Function MonteCarlo_Vanilla_call(S, K, r, q, vol, T, N)
sum = 0
payoff = 0
For i = 1 To N
S_T = S * Exp((r - q - 0.5 * vol ^ 2) * T + vol * Sqr(T) * Application.NormSInv(Rnd()))
payoff = Application.Max(S_T - K, 0)
sum = sum + payoff
Next i
MonteCarlo_Vanilla_call = Exp(-r * T) * sum / N
End Function
Every passed in variable is a constant.
In my case, I want each next column in the same row to be just like S_T
in the VBA code. That's really the only like that matters. I want to apply a function like S_T = S * Exp((r - q - 0.5 * vol ^ 2) * T + vol * Sqr(T) * Application.NormSInv(Rnd()))
. Each S_T
is the next column in the same row. There's N
columns making one simulation. I will have, for example, 1000 simulations.
321 | 322.125 | 323.277 | ... | column 100 value
321 | 320.704 | 319.839 | ... | column 100 value
321 | 321.471 | 318.456 | ... | column 100 value
...
row 1000| etc | etc | ... | value (1000,100)
回答1:
IIUC, you could create your own function to generate a DataFrame
.
Within the function iterate using .iloc[:, -1]
to use the last created column.
We'll also use numpy.random.randn to generate an array of normally distributed random values.
You may need to adjust the default values of your variables, but the idea would be something like:
Function
import pandas as pd
import numpy as np
from math import exp, sqrt
def monte_carlo_df(nrows,
ncols,
col_1_val,
r=0.03,
q=0.5,
sigma=0.002,
T=1.0002,
deltaT=0.002):
"""Returns stochastic monte carlo DataFrame"""
# Create first column
df = pd.DataFrame({'s0': [col_1_val] * nrows})
# Create subsequent columns
for i in range(1, ncols):
df[f's{i}'] = (df.iloc[:, -1] * exp(r - q * sqrt(sigma)) * T
+ (np.random.randn(nrows) * sqrt(deltaT)))
return df
Usage example
df = monte_carlo_df(nrows=1000, ncols=100, col_1_val=321)
回答2:
To me your problem is a specific version of the following one: Pandas calculations based on other rows. Since you can pivot it shouldn't matter if we are talking rows or columns.
There is also a question relating to calculations using columns: Pandas complex calculation based on other columns which has a good suggestion of using a rolling window (rolling function) or using shift function: Calculate the percentage increase or decrease based on the previous column value of the same row in pandas dataframe
Speed considerations of similar calculations (or numpy
vs pandas
discussion): Numpy, Pandas: what is the fastest way to calculate dataset row value basing on previous N values?
To sum it all up - it seems that your question is somewhat of a duplicate.
来源:https://stackoverflow.com/questions/56375434/generate-a-dataframe-that-follow-a-mathematical-function-for-each-column-row