gini

Supervised Learning Methods

蓝咒 posted on 2020-04-11 12:49:33
Study materials: Statistical Learning Methods (2nd edition), Machine Learning in Action, and Andrew Ng's machine learning course.

1. Perceptron

The perceptron is a linear model that classifies an input instance into one of two classes based on its feature vector \(x\): \(f(x)=\operatorname{sign}(w \cdot x+b)\). The model corresponds to a separating hyperplane \(w \cdot x+b=0\) in the input (feature) space. The learning strategy is to minimize the loss function \(\min_{w, b} L(w, b)=-\sum_{x_{i} \in M} y_{i}\left(w \cdot x_{i}+b\right)\), where \(M\) is the set of misclassified points; this loss corresponds to the total distance from the misclassified points to the separating hyperplane. The learning algorithm is stochastic gradient descent on this loss: it considers only the misclassified points and updates the parameters to shrink their total distance to the hyperplane (a minimal sketch appears at the end of this post). When the training data are linearly separable, the perceptron algorithm admits infinitely many solutions, and which one it finds depends on the initial values and the order in which points are visited.

2. K-Nearest Neighbors

K-nearest neighbors is an algorithm with no explicit learning process, so how representative the data set is matters a great deal. The principle: map the training set and the input instance into the same space; for a given input instance, find its \(k\) nearest training instances, and predict the class held by the majority of those \(k\) neighbors. The core elements of K-NN are the value of \(k\), the distance metric (usually Euclidean distance), and the classification decision rule.
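As a concrete illustration of the perceptron update rule, here is a minimal sketch (my own, not code from the book; the toy data and learning rate are made up):

    import numpy as np

    def perceptron(X, y, lr=1.0, max_epochs=100):
        # Labels y are assumed to be in {-1, +1}.
        w = np.zeros(X.shape[1])
        b = 0.0
        for _ in range(max_epochs):
            mistakes = 0
            for xi, yi in zip(X, y):
                if yi * (np.dot(w, xi) + b) <= 0:  # misclassified point
                    w += lr * yi * xi              # SGD step on -y(w.x + b)
                    b += lr * yi
                    mistakes += 1
            if mistakes == 0:                      # converged (separable data)
                break
        return w, b

    # Toy example: two linearly separable points.
    X = np.array([[3.0, 3.0], [1.0, 1.0]])
    y = np.array([1, -1])
    w, b = perceptron(X, y)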

Weighted Gini coefficient in Python

删除回忆录丶 posted on 2019-12-11 20:04:38
Question: Here's a simple implementation of the Gini coefficient in Python, from https://stackoverflow.com/a/39513799/1840471:

    import numpy as np

    def gini(x):
        # Mean absolute difference over all pairs.
        mad = np.abs(np.subtract.outer(x, x)).mean()
        # Relative mean absolute difference.
        rmad = mad / np.mean(x)
        # Gini coefficient is half the relative mean absolute difference.
        return 0.5 * rmad

How can this be adjusted to take an array of weights as a second vector? It should accept non-integer weights, so simply blowing up the array by repeating entries according to the weights is not an option.
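A quick sanity check of this function (example values mine):

    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    print(gini(x))  # prints roughly 0.2667 for this array

A weighted variant appears in the last question on this page.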

Custom loss function in Keras, how to deal with placeholders

不羁的心 posted on 2019-12-03 17:28:17
Question: I am trying to write a custom loss function in TF/Keras. The loss function works when it is run in a session and passed constants; however, it stops working when compiled into a Keras model. The cost function (thanks to Lior for converting it to TF):

    import tensorflow as tf
    from keras import backend as K

    def ginicTF(actual, pred):
        n = int(actual.get_shape()[-1])
        # Indices that reorder entries by prediction rank.
        inds = K.reverse(tf.nn.top_k(pred, n)[1], axes=[0])
        a_s = K.gather(actual, inds)  # actuals reordered by prediction
        a_c = K.cumsum(a_s)
        giniSum = K.sum(a_c) / K.sum(a_s) - (n + 1) / 2.0
        return giniSum / n

    def gini_normalizedTF(a, p):
        return -ginicTF(a, p) / ginicTF(a, a)

    # Test the cost function
    sess = tf.InteractiveSession()
    p = [0.9, 0.3, 0
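One frequently suggested workaround, sketched below (an assumption on my part, not a confirmed answer from this thread): when Keras compiles the loss, actual is a placeholder whose dimensions may be None, so int(actual.get_shape()[-1]) can fail; fixing n in a closure avoids querying the placeholder's shape:

    import tensorflow as tf
    from keras import backend as K

    def make_gini_loss(n):
        # `n` is the fixed number of entries the loss sees per call,
        # supplied up front instead of read from a placeholder's shape.
        def ginic(actual, pred):
            inds = K.reverse(tf.nn.top_k(pred, n)[1], axes=[0])
            a_s = K.gather(actual, inds)
            a_c = K.cumsum(a_s)
            gini_sum = K.sum(a_c) / K.sum(a_s) - (n + 1) / 2.0
            return gini_sum / n
        def loss(actual, pred):
            # Negated normalized Gini, so minimizing the loss maximizes Gini.
            return -ginic(actual, pred) / ginic(actual, actual)
        return loss

    # Hypothetical usage, assuming every batch contains exactly 32 rows:
    # model.compile(optimizer='adam', loss=make_gini_loss(32))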

More efficient weighted Gini coefficient in Python

为君一笑 posted on 2019-11-28 03:49:32
Question: Per https://stackoverflow.com/a/48981834/1840471, this is an implementation of the weighted Gini coefficient in Python:

    import numpy as np

    def gini(x, weights=None):
        if weights is None:
            weights = np.ones_like(x)
        # Calculate the weighted mean absolute difference in two steps.
        count = np.multiply.outer(weights, weights)
        mad = np.abs(np.subtract.outer(x, x) * count).sum() / count.sum()
        rmad = mad / np.average(x, weights=weights)
        # Gini equals half the relative mean absolute difference.
        return 0.5 * rmad
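The title asks for something more efficient: the outer products above cost O(n^2) time and memory. A common alternative (a sketch of my own, not necessarily the accepted answer) sorts once and integrates the Lorenz curve, which agrees with the pairwise definition for nonnegative x:

    import numpy as np

    def gini_sorted(x, weights=None):
        # O(n log n) weighted Gini via the Lorenz curve.
        x = np.asarray(x, dtype=float)
        w = np.ones_like(x) if weights is None else np.asarray(weights, dtype=float)
        order = np.argsort(x)
        x, w = x[order], w[order]
        # Cumulative population share p and value share L, each starting at 0.
        cum_w = np.concatenate(([0.0], np.cumsum(w)))
        cum_xw = np.concatenate(([0.0], np.cumsum(x * w)))
        p = cum_w / cum_w[-1]
        L = cum_xw / cum_xw[-1]
        # Gini = 1 - 2 * (area under the Lorenz curve), via the trapezoid rule.
        return 1.0 - np.sum((L[1:] + L[:-1]) * np.diff(p))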