Trend lines (regression, curve fitting) Java library

忘了有多久 2021-01-31 12:09

I'm trying to develop an application that computes the same trend lines that Excel does, but for larger datasets.

3 answers
  •  再見小時候
    2021-01-31 12:09

    Since they're all based on linear fits, OLSMultipleLinearRegression is all you need for linear, polynomial, exponential, logarithmic, and power trend lines.
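
    To make the "all based on linear fits" point concrete, each trend-line type can be written so that the unknown coefficients enter linearly. The symbols a, b, c_k and d (the polynomial degree) are notation chosen here for illustration, not names from the library:

```latex
\begin{aligned}
\text{linear / polynomial:}\quad & y = \sum_{k=0}^{d} c_k x^k \\
\text{exponential:}\quad & y = a e^{b x} \;\Longrightarrow\; \ln y = \ln a + b x \\
\text{power:}\quad & y = a x^{b} \;\Longrightarrow\; \ln y = \ln a + b \ln x \\
\text{logarithmic:}\quad & y = a + b \ln x
\end{aligned}
```

    In the exponential and power cases the least-squares fit is done on ln y, so it minimizes squared error in log space rather than in y itself.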

    Your question gave me an excuse to download and play with the commons math regression tools, and I put together some trend line tools:

    An interface:

    public interface TrendLine {
        public void setValues(double[] y, double[] x); // y ~ f(x)
        public double predict(double x); // get a predicted y for a given x
    }
    

    An abstract class for regression-based trendlines:

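    // Package names below assume Apache Commons Math 3.x (commons-math3).
    import java.util.Arrays;
    import org.apache.commons.math3.linear.MatrixUtils;
    import org.apache.commons.math3.linear.RealMatrix;
    import org.apache.commons.math3.stat.regression.OLSMultipleLinearRegression;

    public abstract class OLSTrendLine implements TrendLine {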
    
        RealMatrix coef = null; // will hold prediction coefs once we get values
    
        protected abstract double[] xVector(double x); // create vector of values from x
        protected abstract boolean logY(); // set true to predict log of y (note: y must be positive)
    
        @Override
        public void setValues(double[] y, double[] x) {
            if (x.length != y.length) {
                throw new IllegalArgumentException(String.format("The numbers of y and x values must be equal (%d != %d)",y.length,x.length));
            }
            double[][] xData = new double[x.length][]; 
            for (int i = 0; i < x.length; i++) {
                // the implementation determines how to produce a vector of predictors from a single x
                xData[i] = xVector(x[i]);
            }
            if(logY()) { // in some models we are predicting ln y, so we replace each y with ln y
                y = Arrays.copyOf(y, y.length); // user might not be finished with the array we were given
                for (int i = 0; i < x.length; i++) {
                    y[i] = Math.log(y[i]);
                }
            }
            OLSMultipleLinearRegression ols = new OLSMultipleLinearRegression();
            ols.setNoIntercept(true); // let the implementation include a constant in xVector if desired
            ols.newSampleData(y, xData); // provide the data to the model
            coef = MatrixUtils.createColumnRealMatrix(ols.estimateRegressionParameters()); // get our coefs
        }
    
        @Override
        public double predict(double x) {
            double yhat = coef.preMultiply(xVector(x))[0]; // apply coefs to xVector
            if (logY()) yhat = (Math.exp(yhat)); // if we predicted ln y, we still need to get y
            return yhat;
        }
    }
    

    An implementation for polynomial or linear models:

    (For linear models, just set the degree to 1 when calling the constructor.)

    public class PolyTrendLine extends OLSTrendLine {
        final int degree;
        public PolyTrendLine(int degree) {
            if (degree < 0) throw new IllegalArgumentException("The degree of the polynomial must not be negative");
            this.degree = degree;
        }
        @Override
        protected double[] xVector(double x) { // {1, x, x*x, x*x*x, ...}
            double[] poly = new double[degree+1];
            double xi=1;
            for(int i=0; i<=degree; i++) {
                poly[i]=xi;
                xi*=x;
            }
            return poly;
        }
        @Override
        protected boolean logY() {return false;}
    }
    

    Exponential and power models are even easier:

    (Note: we're now predicting ln y, which is important. Both models are only suitable for positive y, and the power model also needs positive x because it takes ln x.)

    public class ExpTrendLine extends OLSTrendLine {
        @Override
        protected double[] xVector(double x) {
            return new double[]{1,x};
        }
    
        @Override
        protected boolean logY() {return true;}
    }
    

    and

    public class PowerTrendLine extends OLSTrendLine {
        @Override
        protected double[] xVector(double x) {
            return new double[]{1,Math.log(x)};
        }
    
        @Override
        protected boolean logY() {return true;}
    
    }
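
    As a way to sanity-check the log-transform idea without Commons Math, the two-coefficient exponential case has a closed-form solution. The sketch below is plain Java and is not part of the classes above; the class name ExpFitSketch and its fit method are hypothetical names used only for illustration. The power model works the same way with ln x in place of x.

```java
/** Hypothetical helper (not from Commons Math): closed-form fit of y = a * exp(b * x). */
final class ExpFitSketch {

    /** Returns {a, b}. Every y must be positive because the fit is done on ln y. */
    static double[] fit(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("x and y must have the same length");
        }
        int n = x.length;
        double sx = 0, sz = 0, sxx = 0, sxz = 0;
        for (int i = 0; i < n; i++) {
            double z = Math.log(y[i]); // regress z = ln y on x
            sx += x[i];
            sz += z;
            sxx += x[i] * x[i];
            sxz += x[i] * z;
        }
        // Ordinary least-squares slope and intercept for z = lnA + b * x
        double b = (n * sxz - sx * sz) / (n * sxx - sx * sx);
        double lnA = (sz - b * sx) / n;
        return new double[] {Math.exp(lnA), b};
    }

    public static void main(String[] args) {
        double[] x = {0, 1, 2, 3};
        double[] y = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            y[i] = 2.0 * Math.exp(0.5 * x[i]); // noise-free data with a = 2, b = 0.5
        }
        double[] ab = fit(x, y);
        System.out.println(ab[0] + " " + ab[1]); // approximately 2.0 and 0.5
    }
}
```

    On noise-free data this should agree with ExpTrendLine up to rounding; on noisy data both approaches share the same caveat that the error is minimized in ln y.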
    

    And a log model:

    (This takes the log of x, so x must be positive, but it predicts y rather than ln y.)

    public class LogTrendLine extends OLSTrendLine {
        @Override
        protected double[] xVector(double x) {
            return new double[]{1,Math.log(x)};
        }
    
        @Override
        protected boolean logY() {return false;}
    }
    

    And you can use it like this:

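    public static void main(String[] args) {
        TrendLine t = new PolyTrendLine(2);
        Random rand = new Random(); // java.util.Random
        double[] x = new double[1000*1000];
        double[] err = new double[x.length];
        double[] y = new double[x.length];
        for (int i=0; i<x.length; i++) { // the source listing is cut off at "i"; the rest is an assumed completion
            x[i] = 1000*rand.nextDouble();    // assumed: random x in [0, 1000)
            err[i] = 100*rand.nextGaussian(); // assumed: Gaussian noise
            y[i] = x[i]*x[i] + err[i];        // assumed: quadratic model, matching degree 2
        }
        t.setValues(y, x);
        System.out.println(t.predict(12)); // with this assumed data, expected to be close to 12*12 = 144
    }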

    Since you just wanted trend lines, I discarded the OLS model objects once I had the coefficients, but you might want to keep them around to report goodness of fit and similar statistics.
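
    If you only keep the predictions, a goodness-of-fit figure can still be computed afterwards. The following is a minimal plain-Java sketch of the coefficient of determination; GoodnessOfFitSketch and rSquared are hypothetical names, not part of Commons Math. OLSMultipleLinearRegression also offers calculateRSquared(), but its value may differ from the mean-centred definition used here when the intercept is disabled via setNoIntercept(true), so check the library documentation before comparing the two.

```java
/** Hypothetical helper: R^2 computed from observed values and predictions. */
final class GoodnessOfFitSketch {

    /** R^2 = 1 - SS_res / SS_tot, with SS_tot taken about the mean of y. */
    static double rSquared(double[] y, double[] yHat) {
        if (y.length != yHat.length) {
            throw new IllegalArgumentException("y and yHat must have the same length");
        }
        double mean = 0;
        for (double v : y) {
            mean += v;
        }
        mean /= y.length;

        double ssRes = 0; // sum of squared residuals
        double ssTot = 0; // total sum of squares about the mean
        for (int i = 0; i < y.length; i++) {
            double r = y[i] - yHat[i];
            double d = y[i] - mean;
            ssRes += r * r;
            ssTot += d * d;
        }
        return 1.0 - ssRes / ssTot;
    }

    public static void main(String[] args) {
        double[] y = {1, 2, 3, 4};
        System.out.println(rSquared(y, y)); // 1.0 for a perfect fit
    }
}
```

    With the classes above, yHat would be filled by calling predict(x[i]) for each observation. For the models that fit ln y, decide whether you want the statistic in y or in ln y and be consistent.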

    For trend lines based on a moving average, moving median, etc., it looks like you can stick with Commons Math. Try DescriptiveStatistics and specify a window size. You might also want to do some smoothing, using interpolation as suggested in another answer.
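
    To show what a windowed statistic looks like, here is a plain-Java sketch of a trailing moving average and moving median. It does not use Commons Math; MovingWindowSketch and its methods are hypothetical names. With the library, a DescriptiveStatistics created with a window size, fed through addValue and read with getMean() or getPercentile(50), should play a similar role.

```java
import java.util.Arrays;

/** Hypothetical helper: trailing-window statistics over an array. */
final class MovingWindowSketch {

    /** out[i] is the mean of the last `window` values ending at i (fewer at the start). */
    static double[] movingAverage(double[] v, int window) {
        if (window < 1) {
            throw new IllegalArgumentException("window must be at least 1");
        }
        double[] out = new double[v.length];
        double sum = 0;
        for (int i = 0; i < v.length; i++) {
            sum += v[i];
            if (i >= window) {
                sum -= v[i - window]; // drop the value that left the window
            }
            out[i] = sum / Math.min(i + 1, window);
        }
        return out;
    }

    /** out[i] is the median of the last `window` values ending at i (fewer at the start). */
    static double[] movingMedian(double[] v, int window) {
        if (window < 1) {
            throw new IllegalArgumentException("window must be at least 1");
        }
        double[] out = new double[v.length];
        for (int i = 0; i < v.length; i++) {
            int from = Math.max(0, i - window + 1);
            double[] w = Arrays.copyOfRange(v, from, i + 1);
            Arrays.sort(w); // simple but O(window log window) per point
            int m = w.length / 2;
            out[i] = (w.length % 2 == 1) ? w[m] : 0.5 * (w[m - 1] + w[m]);
        }
        return out;
    }

    public static void main(String[] args) {
        double[] v = {1, 2, 3, 4, 5};
        System.out.println(Arrays.toString(movingAverage(v, 3)));
        System.out.println(Arrays.toString(movingMedian(v, 3)));
    }
}
```

    The running-sum average can accumulate rounding error on very long series; recomputing the sum per window, or using the library class, avoids that at some cost in speed.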
