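Introduction
Logistic regression (LR) is one of the most fundamental and widely used models in machine learning.
LR in one sentence:
Logistic regression assumes the data follow a Bernoulli distribution and estimates its parameters by maximizing the likelihood function with gradient descent, in order to classify the data into two classes.
For the full derivation of LR, see Li Hang's book Statistical Learning Methods (统计学习方法).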
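Written out, the one-sentence summary corresponds to the standard derivation sketched below. The notation is chosen here for illustration and is not taken from the original post: θ is the parameter vector, σ the sigmoid, (x^(i), y^(i)) for i = 1..m the training examples with y in {0, 1}, and α the learning rate.

```latex
\begin{aligned}
P(y=1 \mid x;\theta) &= h_\theta(x) = \sigma(\theta^\top x) = \frac{1}{1+e^{-\theta^\top x}} \\
P(y \mid x;\theta) &= h_\theta(x)^{y}\,\bigl(1-h_\theta(x)\bigr)^{1-y}, \quad y \in \{0,1\} \\
\ell(\theta) &= \sum_{i=1}^{m} \Bigl[ y^{(i)} \log h_\theta(x^{(i)}) + (1-y^{(i)}) \log\bigl(1-h_\theta(x^{(i)})\bigr) \Bigr] \\
J(\theta) &= -\frac{1}{m}\,\ell(\theta) \\
\frac{\partial J}{\partial \theta_j} &= \frac{1}{m} \sum_{i=1}^{m} \bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)\, x^{(i)}_j \\
\theta_j &\leftarrow \theta_j - \alpha\, \frac{\partial J}{\partial \theta_j}
\end{aligned}
```

Maximizing the log-likelihood ℓ(θ) is the same as minimizing J(θ), which is exactly the cross-entropy loss.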
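Why does the LR loss use maximum likelihood estimation (i.e. cross-entropy loss) rather than MSE?
- Reference: https://www.cnblogs.com/smartwhite/p/9109815.html
- For linear regression we choose MSE, because its J(θ) is a convex function.
- For logistic regression, however, the prediction passes through the nonlinear sigmoid mapping, so the MSE cost J(θ) becomes non-convex and gradient-based optimization can get stuck in a local optimum.
- So we take the log of the sigmoid output instead, which gives the negative log-likelihood (cross-entropy) loss. Its second derivative is non-negative, so it is convex, and gradient descent with a suitable learning rate can reach the global minimum.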
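The convexity argument above can be checked numerically. The following is a minimal sketch, not part of the original post: it looks at a single example with label y = 1, treats the loss as a function of the score z = θ·x, and estimates the second derivative with a central finite difference. The helper names (`mse_loss`, `log_loss`, `second_derivative`) and the grid of z values are assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mse_loss(z, y):
    # Squared error applied to the sigmoid output.
    return (sigmoid(z) - y) ** 2


def log_loss(z, y):
    # Negative log-likelihood (cross-entropy) of a Bernoulli label.
    p = sigmoid(z)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def second_derivative(f, z, y, h=1e-3):
    # Central finite-difference estimate of d^2 f / dz^2.
    return (f(z + h, y) - 2 * f(z, y) + f(z - h, y)) / (h * h)


# Scores z on a grid from -6 to 6, for a positive example (y = 1).
zs = [i / 10 for i in range(-60, 61)]
mse_curv = [second_derivative(mse_loss, z, 1) for z in zs]
log_curv = [second_derivative(log_loss, z, 1) for z in zs]

print("min curvature, MSE on sigmoid:", min(mse_curv))
print("min curvature, log loss:      ", min(log_curv))
```

The MSE curvature turns negative for sufficiently negative z, while the log-loss curvature stays positive on the whole grid. Because z is linear in θ, convexity in z carries over to convexity in θ.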
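TensorFlow implementation
We use the MNIST dataset to check that the model works. Because MNIST has 10 classes, the code below uses softmax regression, the multiclass generalization of LR.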
import os

import tensorflow as tf
# This script uses the TensorFlow 1.x API (tf.placeholder, tf.Session,
# tensorflow.examples.tutorials); it does not run unchanged on TensorFlow 2.x.
from tensorflow.examples.tutorials.mnist import input_data

os.environ['CUDA_VISIBLE_DEVICES'] = '0'

lr = 0.001
n_epoch = 25
batch_size = 64


def LR(mnist):
    x = tf.placeholder(tf.float32, [None, 784])
    # Labels must be float so they can be multiplied with log(pred) in the loss.
    y = tf.placeholder(tf.float32, [None, 10])
    w = tf.Variable(tf.zeros([784, 10]))
    # Pass shape by keyword: the second positional argument of tf.constant is dtype.
    b = tf.Variable(tf.constant(0.1, shape=[10]))
    pred = tf.nn.softmax(tf.matmul(x, w) + b)
    # Cross-entropy: sum over the class axis (axis=1) for each example,
    # then average over the batch.
    cost = tf.reduce_mean(-tf.reduce_sum(tf.multiply(y, tf.log(pred)), axis=1))
    pred_res = tf.argmax(pred, axis=-1)
    result = tf.equal(pred_res, tf.argmax(y, axis=-1))
    acc = tf.reduce_mean(tf.cast(result, tf.float32))
    opt = tf.train.AdamOptimizer(lr).minimize(cost)
    init = tf.global_variables_initializer()
    with tf.Session() as sess:
        sess.run(init)
        for epoch in range(n_epoch):
            loss = 0
            total_batch_num = int(mnist.train.num_examples / batch_size)
            for i in range(total_batch_num):
                batch_xs, batch_ys = mnist.train.next_batch(batch_size)
                _, cost_now = sess.run([opt, cost], feed_dict={x: batch_xs, y: batch_ys})
                loss += cost_now
            # Average training loss over the batches of this epoch.
            loss /= total_batch_num
            if epoch % 5 == 0:
                feed_train = {x: mnist.train.images, y: mnist.train.labels}
                feed_test = {x: mnist.test.images, y: mnist.test.labels}
                train_acc = sess.run(acc, feed_train)
                test_acc = sess.run(acc, feed_test)
                print("epoch: {}, cost = {}, train acc: {}, test acc: {}".format(epoch + 1, loss, train_acc, test_acc))


if __name__ == "__main__":
    mnist = input_data.read_data_sets("/home/syd/syz/my_try/minst_data", one_hot=True)
    train_img = mnist.train.images
    train_label = mnist.train.labels
    print("Training set type:", type(train_img))
    print("Training set shape:", train_img.shape)
    test_img = mnist.test.images
    test_label = mnist.test.labels
    print("Test set type:", type(test_img))
    print("Test set shape:", test_img.shape)
    print(test_label[0])
    LR(mnist)
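The cost in the TensorFlow code takes the log of a softmax output, which becomes log(0) if a predicted probability underflows to zero. The following is a minimal pure-Python sketch of the same per-example cross-entropy computed with the log-sum-exp trick instead. It is not part of the original post; the function names and the toy logits are assumptions made for illustration.

```python
import math


def log_softmax(logits):
    # Subtract the max before exponentiating so exp() cannot overflow.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_sum for v in logits]


def cross_entropy(batch_logits, batch_onehot):
    # Per example: -sum_k y_k * log p_k, then average over the batch.
    total = 0.0
    for logits, onehot in zip(batch_logits, batch_onehot):
        logp = log_softmax(logits)
        total += -sum(y * lp for y, lp in zip(onehot, logp))
    return total / len(batch_logits)


# Toy batch: the second row has extreme logits that would break a naive softmax.
logits = [[2.0, 0.5, -1.0], [1000.0, 0.0, -1000.0]]
labels = [[1, 0, 0], [0, 1, 0]]
loss = cross_entropy(logits, labels)
print("mean cross-entropy:", loss)
```

The sum runs over the classes of one example and the mean runs over the batch, which is the same reduction order a per-example cross-entropy needs in the TensorFlow version.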
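Implementing LR by hand, without libraries
During autumn campus recruiting I ran into this problem in NetEase's online written test: implement LR yourself, without calling any library.
The input data given: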
0.1 10 100 5 10 10
0.105 0.956 0.876 0.133 0.249 0
0.195 0.672 0.193 0.016 0.009 0
0.059 0.282 0.709 0.139 0.478 1
0.303 0.39 0.95 0.912 0.522 1
0.59 0.57 0.141 0.959 0.036 1
0.231 0.355 0.305 0.508 0.625 1
0.896 0.415 0.771 0.197 0.826 0
0.051 0.537 0.442 0.46 0.628 0
0.737 0.583 0.09 0.337 0.774 1
0.062 0.217 0.553 0.868 0.87 0
0.13 0.972 0.845 0.737 0.492
0.016 0.009 0.432 0.41 0.092
0.257 0.327 0.451 0.18 0.62
0.774 0.143 0.879 0.123 0.222
0.885 0.114 0.352 0.484 0.367
0.439 0.227 0.675 0.654 0.323
0.778 0.191 0.633 0.628 0.929
0.958 0.231 0.07 0.739 0.34
0.015 0.115 0.154 0.75 0.649
0.283 0.853 0.752 0.915 0.937
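The first line gives the learning rate, the regularization coefficient, the number of epochs, the input feature dimension, the number of training samples, and the number of test samples.
The training data follow; the last column of each training row is the label.
The test data come last, and we must output the predicted label for each test row.
The sample actually contains 10 training rows and 10 test rows, and the final evaluation runs on several sets of data. I wrote the code below, submitted it, and to my surprise it passed on the first try… but I am not sure whether the code has problems.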
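The problem statement only says "regularization constraint". Assuming it means an L2 penalty scaled by 1/m, which is what the code in this post computes, the objective and update rule can be sketched as follows. The symbols are chosen here for illustration: θ for the weights, h_θ(x) = σ(θᵀx), m for the number of training rows, α for the learning rate, and λ for the regularization coefficient.

```latex
\begin{aligned}
J(\theta) &= \frac{1}{m}\Bigl[ -\sum_{i=1}^{m} \Bigl( y^{(i)} \log h_\theta(x^{(i)}) + (1-y^{(i)}) \log\bigl(1-h_\theta(x^{(i)})\bigr) \Bigr) + \frac{\lambda}{2} \sum_{j} \theta_j^{2} \Bigr] \\
\frac{\partial J}{\partial \theta_j} &= \frac{1}{m}\Bigl[ \sum_{i=1}^{m} \bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)\, x^{(i)}_j + \lambda\,\theta_j \Bigr] \\
\theta_j &\leftarrow \theta_j - \frac{\alpha}{m}\Bigl[ \sum_{i=1}^{m} \bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)\, x^{(i)}_j + \lambda\,\theta_j \Bigr]
\end{aligned}
```

With the sample header (α = 0.1, λ = 10, m = 10), the penalty alone shrinks each weight by αλ/m = 0.1 of its value per step, before the data term is applied.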
Python implementation
import math


class lr:
    def __init__(self, alpha, lamda, epoch, dim):
        self.alpha = alpha  # learning rate
        self.lamda = lamda  # regularization coefficient
        self.epoch = epoch  # number of full-batch gradient steps
        self.dim = dim      # number of input features
        # All weights start at 1; the model has no bias term.
        self.weights = [1 for _ in range(self.dim)]

    def get_pred(self, a, b):
        # Return sigmoid(row . b) for every row of a.
        result = []
        for line in a:
            cur = 0
            for i in range(len(line)):
                cur += line[i] * b[i]
            cur = 1 / (1 + math.exp(-cur))
            result.append(cur)
        return result

    def grad_descent(self, train_data, pred, train_y):
        # One full-batch gradient step with a lamda * w penalty term.
        bs = len(train_y)
        dim = len(self.weights)
        hx_y = [pred[i] - train_y[i] for i in range(bs)]
        for j in range(dim):
            cur = 0
            for i in range(bs):
                cur += hx_y[i] * train_data[i][j]
            cur += self.lamda * self.weights[j]
            cur = cur * self.alpha / bs
            self.weights[j] -= cur
        return self.weights

    def train(self, train_data, train_y):
        for epc in range(self.epoch):
            pred = self.get_pred(train_data, self.weights)
            self.weights = self.grad_descent(train_data, pred, train_y)

    def test(self, test_data):
        pred_y = self.get_pred(test_data, self.weights)
        # print(pred_y)  # debug: predicted probabilities
        test_y = []
        for pred in pred_y:
            # Threshold the probability at 0.5 to get a 0/1 label.
            if pred > 0.5:
                test_y.append(1)
            else:
                test_y.append(0)
        return test_y


# Header: learning rate, regularization coefficient, epochs, feature dimension,
# number of training rows, number of test rows.
alpha, lamda, epoch, dim, train_bs, test_bs = input().strip().split()
alpha = float(alpha)
lamda = float(lamda)
epoch = int(epoch)
dim = int(dim)
train_bs = int(train_bs)
test_bs = int(test_bs)

train_data = []
train_y = []
test_data = []
for i in range(train_bs):
    line = list(map(float, input().strip().split()))
    train_data.append(line[:dim])  # features
    train_y.append(line[-1])       # label is the last column
for i in range(test_bs):
    test_data.append(list(map(float, input().strip().split())))

my_lr = lr(alpha, lamda, epoch, dim)
my_lr.train(train_data, train_y)
test_y = my_lr.test(test_data)
for pred in test_y:
    print(pred)
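One possible weak spot in the code above is `1/(1+math.exp(-cur))`: `math.exp` raises `OverflowError` once its argument exceeds roughly 709, so a strongly negative score would crash the prediction. The following is a minimal sketch of a numerically stable alternative, not part of the original submission; `stable_sigmoid` and `naive_sigmoid` are hypothetical names used for illustration.

```python
import math


def naive_sigmoid(z):
    # Same form as in get_pred; overflows when -z is large.
    return 1 / (1 + math.exp(-z))


def stable_sigmoid(z):
    # Only ever exponentiate a non-positive number, so exp() cannot overflow.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


for z in (-1000.0, -2.0, 0.0, 2.0, 1000.0):
    try:
        naive = naive_sigmoid(z)
    except OverflowError:
        naive = "OverflowError"
    print(z, naive, stable_sigmoid(z))
```

With the sample input, where every feature lies in [0, 1] and the weights stay small, the naive form should not overflow, which may be why the submission passed. The stable form only matters for inputs with much larger scores.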
Source: CSDN
Author: seeInfinite
Link: https://blog.csdn.net/qq_41058526/article/details/104109196