Question
I have some time-to-event data, and I need to generate around 200 shape/scale parameter pairs, one per subgroup, for a simulation model. I have analysed the data, and a Weibull distribution fits it best.
Normally I would use the fitdistrplus package and fitdist(x, "weibull") to do this. However, this data has been matched using kernel matching, and I have a variable of weighting values called km, so the fit needs to incorporate weights, which fitdist can't do as far as I can tell.
With my gamma-distributed data, instead of using fitdist I did the calculation manually using the wtd.mean and wtd.var functions from the Hmisc package, which worked well. However, a similar formula for the Weibull is eluding me.
I've been testing a few options and comparing them against the fitdist results:
library(fitdistrplus)
test_data <- rweibull(100, 0.676, 946)
fitweibull <- fitdist(test_data, "weibull", method = "mle", lower = c(0, 0))
fitweibull$estimate
#>       shape       scale
#>   0.6981165 935.0907482
I first tested the approach from "The Weibull distribution in R (ExtDist)":
library(bbmle)
m1 <- mle2(y ~ dweibull(shape = exp(lshape), scale = exp(lscale)),
           data = data.frame(y = test_data),
           start = list(lshape = 0, lscale = 0))
which gave me lshape = -0.3919991 and lscale = 6.852033.
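Note that the mle2 call is parameterised on the log scale (shape = exp(lshape), scale = exp(lscale)), so these two numbers have to be exponentiated before they can be compared with the fitdist output. A minimal sketch using the values reported here (the variable names log_est and est are just for illustration):

```r
# Estimates reported by mle2, on the log scale
log_est <- c(lshape = -0.3919991, lscale = 6.852033)

# Back-transform to the natural shape/scale parameters
est <- exp(log_est)
names(est) <- c("shape", "scale")
est
```

This works out to a shape of roughly 0.676 and a scale of roughly 946, which is on the same footing as the fitdist estimates. With the fitted object at hand, exp(coef(m1)) should give the same back-transformed values.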
The other thing I've tried is eweibull from the EnvStats package.
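library(EnvStats)
# Store the fit under a different name so it does not shadow the eweibull function
fit_env <- eweibull(test_data)
fit_env$parameters
#>      shape      scale
#>   0.698091 935.239277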
However, while these all give results, I still don't think any of them can take my weights.
Edit: I have also tried the similarly named eWeibull from the ExtDist package (I'm not 100% sure it still works, but it does have a Weibull function that takes a weight!). I get a lot of error messages about the inputs being non-computable (NA or infinite). If I run it through map, as map(test_data, test_km, eWeibull), I get NULL for all 100 values. If I try it with just test_data, I get a long string of errors associated with optimx.
I have also tried fitDistr from the propagate package, which gives errors that weights should be a specific length. For example, if the data and the weights both have length 100, I get an error that weights should be length 94. If I set it to 94, it tells me it has to be length 132.
I need either to pass pre-weighted summary statistics (mean, variance, sd, etc.) into the calculation, or a function that takes both the data and the weights and uses them in the fit.
Answer 1:
After much trial and error, I edited the eweibull function from the EnvStats package so that, instead of mean(x) and sd(x), it uses wtd.mean(x, w) and sqrt(wtd.var(x, w)). It now runs and outputs weighted values.
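For anyone who wants to try the same idea without patching EnvStats, here is a minimal base-R sketch. It is not the edited eweibull code, and several things in it are assumptions: the function names weighted_weibull_mom and weighted_weibull_mle are hypothetical, the weighted variance is normalised by sum(w) (which may differ slightly from the default of Hmisc's wtd.var), the weights are treated as case weights in the likelihood, and test_km is filled with made-up stand-in weights. The first function matches the weighted mean and coefficient of variation to the Weibull moments; the second maximises a weighted log-likelihood, starting from the moment estimates.

```r
# Weighted method-of-moments fit for a Weibull distribution.
# x: positive observations, w: non-negative weights (same length as x).
weighted_weibull_mom <- function(x, w) {
  stopifnot(length(x) == length(w), all(x > 0), all(w >= 0), sum(w) > 0)
  m   <- sum(w * x) / sum(w)            # weighted mean
  v   <- sum(w * (x - m)^2) / sum(w)    # weighted variance (assumed normalisation)
  cv2 <- v / m^2                        # squared coefficient of variation

  # For shape k: CV^2 = gamma(1 + 2/k) / gamma(1 + 1/k)^2 - 1.
  # lgamma avoids overflow for small k. The bracket below assumes the
  # shape lies between 0.05 and 50; uniroot errors out otherwise.
  f <- function(k) exp(lgamma(1 + 2 / k) - 2 * lgamma(1 + 1 / k)) - 1 - cv2
  shape <- uniroot(f, interval = c(0.05, 50), tol = 1e-8)$root
  scale <- m / gamma(1 + 1 / shape)     # mean = scale * gamma(1 + 1/shape)
  c(shape = shape, scale = scale)
}

# Weighted maximum likelihood, optimised on the log scale so that
# shape and scale stay positive.
weighted_weibull_mle <- function(x, w) {
  start <- log(weighted_weibull_mom(x, w))
  nll <- function(par) {
    -sum(w * dweibull(x, shape = exp(par[1]), scale = exp(par[2]), log = TRUE))
  }
  fit <- optim(start, nll, method = "BFGS")
  if (fit$convergence != 0) warning("optim did not report convergence")
  est <- exp(fit$par)
  names(est) <- c("shape", "scale")
  est
}

# Usage with simulated data and purely illustrative weights
set.seed(1)
test_data <- rweibull(100, shape = 0.676, scale = 946)
test_km   <- runif(100)
weighted_weibull_mom(test_data, test_km)
weighted_weibull_mle(test_data, test_km)
```

With integer weights, the likelihood version should agree with fitting the data after repeating each observation by its weight, which is a quick way to sanity-check it against fitdist. For 200 subgroups, either function can be applied per group, for example with split() and lapply().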
Source: https://stackoverflow.com/questions/51422331/weibull-distribution-with-weighted-data