Question
I'm trying to implement a linear regression with gradient descent as explained in this article (https://towardsdatascience.com/linear-regression-using-gradient-descent-97a6c8700931). I've followed the implementation to the letter, yet my results overflow after a few iterations. The result I'm trying to get is approximately: y = -0.02x + 8499.6.
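For reference, the per-iteration update from the article, which computeThetas below implements for N data points, is:

dm = -(2/N) * sum(x_i * (y_i - yPred_i)),   dc = -(2/N) * sum(y_i - yPred_i)
m <- m - learningRate * dm,   c <- c - learningRate * dc

where yPred_i = m*x_i + c.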
The code:
package main

import (
    "encoding/csv"
    "fmt"
    "strconv"
    "strings"
)

const (
    iterations   = 1000
    learningRate = 0.0001
)

func computePrice(m, x, c float64) float64 {
    return m*x + c
}

func computeThetas(data [][]float64, m, c float64) (float64, float64) {
    N := float64(len(data))
    dm, dc := 0.0, 0.0
    for _, dataField := range data {
        x := dataField[0]
        y := dataField[1]
        yPred := computePrice(m, x, c)
        dm += (y - yPred) * x
        dc += y - yPred
    }
    dm *= -2 / N
    dc *= -2 / N
    return m - learningRate*dm, c - learningRate*dc
}

func main() {
    data := readXY()
    m, c := 0.0, 0.0
    for k := 0; k < iterations; k++ {
        m, c = computeThetas(data, m, c)
    }
    fmt.Printf("%.4fx + %.4f\n", m, c)
}

func readXY() [][]float64 {
    file := strings.NewReader(data)
    reader := csv.NewReader(file)
    records, err := reader.ReadAll()
    if err != nil {
        panic(err)
    }
    records = records[1:]
    size := len(records)
    data := make([][]float64, size)
    for i, v := range records {
        val1, err := strconv.ParseFloat(v[0], 64)
        if err != nil {
            panic(err)
        }
        val2, err := strconv.ParseFloat(v[1], 64)
        if err != nil {
            panic(err)
        }
        data[i] = []float64{val1, val2}
    }
    return data
}
var data = `km,price
240000,3650
139800,3800
150500,4400
185530,4450
176000,5250
114800,5350
166800,5800
89000,5990
144500,5999
84000,6200
82029,6390
63060,6390
74000,6600
97500,6800
67000,6800
76025,6900
48235,6900
93000,6990
60949,7490
65674,7555
54000,7990
68500,7990
22899,7990
61789,8290`
And here it can be worked on in the Go playground: https://play.golang.org/p/2CdNbk9_WeY
What do I need to fix to get the correct result?
Answer 1:
Why would a formula work on one data set and not another one?
In addition to sascha's remarks, here's another way to look at problems of this application of gradient descent: The algorithm offers no guarantee that an iteration yields a better result than the previous, so it doesn't necessarily converge to a result, because:
- The gradients dm and dc in the axes m and c are handled independently from each other: m is updated in the descending direction according to dm, and c at the same time is updated in the descending direction according to dc. But with certain curved surfaces z = f(m, c), the gradient in a direction between the axes m and c can have the opposite sign compared to m and c taken on their own, so while updating either m or c alone would converge, updating both moves away from the optimum.
- However, the more likely failure reason in this case of fitting a line to a point cloud is the entirely arbitrary magnitude of the update to m and c, determined by the product of an obscure learning rate and the gradient. Such an update can easily overstep a minimum of the target function, and can even do so with higher amplitude on each iteration (see the sketch after this list).
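To make the second point concrete for this data set: the km values are on the order of 10^5 and the prices are in the thousands, so with m = c = 0 the very first gradient dm is already of order 10^9. A single learningRate * dm step therefore moves m by roughly 10^5, the predictions m*x then reach about 10^10, the next gradient is larger still, and the magnitudes keep multiplying until they overflow. One common remedy (not part of the question's code, and only one of several possible fixes) is to rescale the feature, run the same descent with a learning rate suited to the rescaled values, and convert the coefficients back at the end. Below is a minimal sketch of that idea; it reuses readXY, computePrice and the data string from the question ("math" has to be added to the imports), while gradientStep and the numbers 0.01 and 5000 are illustrative choices, not anything from the original post.

// Replaces main from the question; readXY, computePrice and data are unchanged.

// gradientStep is computeThetas with the learning rate passed in,
// so a rate suited to the rescaled feature can be chosen.
func gradientStep(pts [][]float64, m, c, lr float64) (float64, float64) {
    n := float64(len(pts))
    dm, dc := 0.0, 0.0
    for _, p := range pts {
        x, y := p[0], p[1]
        yPred := computePrice(m, x, c)
        dm += (y - yPred) * x
        dc += y - yPred
    }
    dm *= -2 / n
    dc *= -2 / n
    return m - lr*dm, c - lr*dc
}

func main() {
    raw := readXY()

    // Standardize the km column: mean 0, standard deviation 1.
    mean, variance := 0.0, 0.0
    for _, p := range raw {
        mean += p[0]
    }
    mean /= float64(len(raw))
    for _, p := range raw {
        variance += (p[0] - mean) * (p[0] - mean)
    }
    std := math.Sqrt(variance / float64(len(raw)))

    scaled := make([][]float64, len(raw))
    for i, p := range raw {
        scaled[i] = []float64{(p[0] - mean) / std, p[1]}
    }

    // With |x| on the order of 1 the gradients stay small, so a much
    // larger learning rate is stable and the loop actually converges.
    m, c := 0.0, 0.0
    for k := 0; k < 5000; k++ {
        m, c = gradientStep(scaled, m, c, 0.01)
    }

    // Undo the scaling:
    // y = m*(x-mean)/std + c  =  (m/std)*x + (c - m*mean/std)
    fmt.Printf("%.4fx + %.4f\n", m/std, c-m*mean/std)
}

If the descent has converged, the printed line should be close to the y = -0.02x + 8499.6 the question is aiming for; dividing km by its maximum instead of standardizing works just as well, only the back-substitution at the end changes.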
Source: https://stackoverflow.com/questions/58996556/implementing-a-linear-regression-using-gradient-descent