Fast loan rate calculation for a big number of loans

Submitted by 旧巷老猫 on 2019-12-10 09:47:29

Question


I have a big data set (around 200k rows) where each row is a loan. I have the loan amount, the number of payments, and the payment amount, and I'm trying to recover the loan rate. R doesn't have a built-in function for this (at least base R doesn't, and I couldn't find one). It isn't that hard to write NPV and IRR functions:

Npv <- function(i, cf, t = seq(from = 0, by = 1, along.with = cf)) sum(cf / (1 + i)^t)
Irr <- function(cf) uniroot(Npv, c(0, 100000), cf = cf)$root

And you can just do

rate <- Irr(c(amt, rep(pmt, times = n)))
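As a quick sanity check (a sketch using the Irr function above): for a root to exist, the loan amount must carry the opposite sign to the payments, so here the amount disbursed enters as a negative cash flow at t = 0.

```r
# A 1000 loan repaid in 12 payments at a 1% per-period rate:
# the level payment is amt * r / (1 - (1 + r)^-n).
r0 <- 0.01; n0 <- 12L; amt0 <- 1000
pmt0 <- amt0 * r0 / (1 - (1 + r0)^-n0)

# Irr should recover the 1% per-period rate (up to uniroot's tolerance).
Irr(c(-amt0, rep(pmt0, times = n0)))  # ≈ 0.01
```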

The problem comes when you calculate the rate for many loans. Because uniroot is not vectorized, and because rep takes a surprising amount of time, you end up with a slow calculation. You can make it faster with a bit of algebra: for a level-payment loan, you are really looking for the root of

zerome <- function(r) amt/pmt-(1-1/(1+r)^n)/r

and then use that as the input to uniroot. On my PC, this takes around 20 seconds to run over my 200k database.
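For concreteness, a sketch of that per-loan loop (the roughly 20-second approach described above), with `amt`, `pmt`, and `n` as vectors over loans; the search interval is an assumption chosen to bracket typical per-period rates:

```r
# One uniroot call per loan on the zerome equation.
# f(r) = amt/pmt - (1 - (1+r)^-n)/r is increasing in r, so for a
# realistic loan (amt/pmt between 1 and n) the interval below brackets the root.
solve_rates <- function(amt, pmt, n) {
  vapply(seq_along(amt), function(i) {
    f <- function(r) amt[i] / pmt[i] - (1 - 1 / (1 + r)^n[i]) / r
    uniroot(f, c(1e-6, 1))$root
  }, numeric(1))
}
```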

The problem is that this calculation is one step inside an optimization, so I'm trying to speed it up even more.

I've tried vectorizing, but because uniroot is not vectorized, I can't get further that way. Is there any root-finding method that is vectorized?

Thanks


Answer 1:


Instead of using a root finder, you could use a linear interpolator. You will have to create one interpolator for each value of n (the number of remaining payments). Each interpolator will map (1-1/(1+r)^n)/r to r. Of course you will have to build a grid fine enough so it will return r to an acceptable precision level. The nice thing with this approach is that linear interpolators are fast and vectorized: you can find the rates for all loans with the same number of remaining payments (n) in a single call to the corresponding interpolator.

Now some code that proves it is a viable solution:

First, we create interpolators, one for each possible value of n:

n.max <- 360L  # 30 years

one.interpolator <- function(n) {
    r <- seq(from = 0.0001, to = 0.1500, by = 0.0001)
    y <- (1-1/(1+r)^n)/r
    approxfun(y, r)
}

interpolators <- lapply(seq_len(n.max), one.interpolator)

Note that I used a grid spacing of 1/100 of a percent (1 bp).

Then we create some fake data:

n.loans <- 200000L
n     <- sample(n.max, n.loans, replace = TRUE)
amt   <- 1000 * sample(100:500, n.loans, replace = TRUE)
pmt   <- amt / (n * (1 - runif(n.loans)))
loans <- data.frame(n, amt, pmt)

Finally, we solve for r:

library(plyr)
system.time(ddply(loans, "n", transform, r = interpolators[[n[1]]](amt / pmt)))
#    user  system elapsed 
#   2.684   0.423   3.084

It's fast. Note that some of the output rates are NA, but that is because my random inputs made no sense and would have produced rates outside the [0.01%, 15%] grid I selected. Your real data won't have that problem.
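If you'd rather avoid the plyr dependency, the same grouped lookup can be done in base R (a sketch reusing the `interpolators` list and `loans` data frame built above; approxfun's returned function is itself vectorized over its input):

```r
# Group loans by remaining payments n and call the matching interpolator
# once per group, exactly as the ddply call does.
loans$r <- NA_real_
for (k in unique(loans$n)) {
  idx <- loans$n == k
  loans$r[idx] <- interpolators[[k]](loans$amt[idx] / loans$pmt[idx])
}
```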



Source: https://stackoverflow.com/questions/13868990/fast-loan-rate-calculation-for-a-big-number-of-loans
