How can I obtain segmented linear regressions with a priori breakpoints?

好久不见. 提交于 2019-12-04 18:02:19

The short answer is that I solved my problem using R to create a linear regression model, and then used the segmented package to generate the piecewise linear regression from the linear model. I was able to specify the expected number of breakpoints (or knots) n as shown below using psi=NA and K=n.

The long answer is:

R version 3.0.1 (2013-05-16)
Platform: x86_64-pc-linux-gnu (64-bit)

# example data:
bullard <- structure(list(Rt = c(5.1861, 10.5266, 11.6688, 19.2345, 59.2882, 
68.6889, 320.6442, 340.4545, 479.3034, 482.6092, 484.048, 485.7009, 
486.4204, 488.1337, 489.5725, 491.2254, 492.3676, 493.2297, 494.3719, 
495.2339, 496.3762, 499.6819, 500.253, 501.1151, 504.5417, 505.4038, 
507.6278, 508.4899, 509.6321, 522.1321, 524.4165, 527.0027, 529.2871, 
531.8733, 533.0155, 544.6534, 547.9592, 551.4075, 553.0604, 556.9397, 
558.5926, 561.1788, 562.321, 563.1831, 563.7542, 565.0473, 566.1895, 
572.801, 573.9432, 575.6674, 576.2385, 577.1006, 586.2382, 587.5313, 
589.2446, 590.1067, 593.4125, 594.5547, 595.8478, 596.99, 598.7141, 
599.8563, 600.2873, 603.1429, 604.0049, 604.576, 605.8691, 607.0113, 
610.0286, 614.0263, 617.3321, 624.7564, 626.4805, 628.1334, 630.9889, 
631.851, 636.4198, 638.0727, 638.5038, 639.646, 644.8184, 647.1028, 
647.9649, 649.1071, 649.5381, 650.6803, 651.5424, 652.6846, 654.3375, 
656.0508, 658.2059, 659.9193, 661.2124, 662.3546, 664.0787, 664.6498, 
665.9429, 682.4782, 731.3561, 734.6619, 778.1154, 787.2919, 803.9261, 
814.335, 848.1552, 898.2568, 912.6188, 924.6932, 940.9083), Tem = c(12.7813, 
12.9341, 12.9163, 14.6367, 15.6235, 15.9454, 27.7281, 28.4951, 
34.7237, 34.8028, 34.8841, 34.9175, 34.9618, 35.087, 35.1581, 
35.204, 35.2824, 35.3751, 35.4615, 35.5567, 35.6494, 35.7464, 
35.8007, 35.8951, 36.2097, 36.3225, 36.4435, 36.5458, 36.6758, 
38.5766, 38.8014, 39.1435, 39.3543, 39.6769, 39.786, 41.0773, 
41.155, 41.4648, 41.5047, 41.8333, 41.8819, 42.111, 42.1904, 
42.2751, 42.3316, 42.4573, 42.5571, 42.7591, 42.8758, 43.0994, 
43.1605, 43.2751, 44.3113, 44.502, 44.704, 44.8372, 44.9648, 
45.104, 45.3173, 45.4562, 45.7358, 45.8809, 45.9543, 46.3093, 
46.4571, 46.5263, 46.7352, 46.8716, 47.3605, 47.8788, 48.0124, 
48.9564, 49.2635, 49.3216, 49.6884, 49.8318, 50.3981, 50.4609, 
50.5309, 50.6636, 51.4257, 51.6715, 51.7854, 51.9082, 51.9701, 
52.0924, 52.2088, 52.3334, 52.3839, 52.5518, 52.844, 53.0192, 
53.1816, 53.2734, 53.5312, 53.5609, 53.6907, 55.2449, 57.8091, 
57.8523, 59.6843, 60.0675, 60.8166, 61.3004, 63.2003, 66.456, 
67.4, 68.2014, 69.3065)), .Names = c("Rt", "Tem"), class = "data.frame", row.names = c(NA, 
-109L))


library(segmented)  # Version: segmented_0.2-9.4

# create a linear model
out.lm <- lm(Tem ~ Rt, data = bullard)

# Set X breakpoints: Set psi=NA and K=n:
o <- segmented(out.lm, seg.Z=~Rt, psi=NA, control=seg.control(display=FALSE, K=3))
slope(o)  # defaults to confidence level of 0.95 (conf.level=0.95)

# Trickery for placing text labels
r <- o$rangeZ[, 1]
est.psi <- o$psi[, 2]
v <- sort(c(r, est.psi))
xCoord <- rowMeans(cbind(v[-length(v)], v[-1]))
Z <- o$model[, o$nameUV$Z]
id <- sapply(xCoord, function(x) which.min(abs(x - Z)))
yCoord <- broken.line(o)[id]

# create the segmented plot, add linear regression for comparison, and text labels
plot(o, lwd=2, col=2:6, main="Segmented regression", res=TRUE)
abline(out.lm, col="red", lwd=1, lty=2)  # dashed line for linear regression
text(xCoord, yCoord, 
    labels=formatC(slope(o)[[1]][, 1] * 1000, digits=1, format="f"), 
    pos = 4, cex = 1.3)

HerrKaputt

What you want is technically called spline interpolation, particularly order-1 spline interpolation (which would join straight line segments; order-2 joins parabolas, etc).

There is already a question here on Stack Overflow dealing with Spline Interpolation in Python, which will help you on your question. Here's the link. Post back if you have further questions after trying those tips.

A very simple method ( not iterative, without initial guess, no bound to specify) is provided pages 30-31 in the paper : https://fr.scribd.com/document/380941024/Regression-par-morceaux-Piecewise-Regression-pdf . The result is :

NOTE : The method is based on the fitting of an integral equation. The present exemple is not a favourable case because the distribution of the abscisses of the points is far to be regular (no points in large ranges). This makes the numerical integration less accurate. Nevertheless, the piecewise fitting is surprisingly not bad.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!