gini

Supervised Learning Methods

蓝咒 posted on 2020-04-11 12:49:33
Study materials: Statistical Learning Methods (2nd edition), Machine Learning in Action, and Andrew Ng's machine learning course.

1. Perceptron

The perceptron is a linear model that classifies an input instance into one of two classes based on its feature vector \(x\): \(f(x)=\operatorname{sign}(w \cdot x+b)\). The model corresponds to a separating hyperplane \(w \cdot x+b=0\) in the input (feature) space. The learning strategy is to minimize the loss function \(\min_{w, b} L(w, b)=-\sum_{x_{i} \in M} y_{i}\left(w \cdot x_{i}+b\right)\), where \(M\) is the set of misclassified points; this loss corresponds to the total distance from the misclassified points to the separating hyperplane. The learning algorithm is stochastic gradient descent on this loss: it considers only the misclassified points and updates the parameters to shrink their total distance to the hyperplane (a minimal sketch appears at the end of this post). When the training data are linearly separable, the perceptron algorithm admits infinitely many solutions, and which one it finds depends on the initial values and the order in which points are visited.

2. K-Nearest Neighbors

K-nearest neighbors is an algorithm with no explicit learning process, so how representative the data set is matters a great deal. The principle: map the training set and the input instance into the same space; for a given input instance, find its \(k\) nearest training instances, and predict the class held by the majority of those \(k\) neighbors. The core elements of K-NN are the value of \(k\), the distance metric (usually Euclidean distance), and the classification decision rule.
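As a concrete illustration of the perceptron update rule, here is a minimal sketch (my own, not code from the book; the toy data and learning rate are made up):

    import numpy as np

    def perceptron(X, y, lr=1.0, max_epochs=100):
        # Labels y are assumed to be in {-1, +1}.
        w = np.zeros(X.shape[1])
        b = 0.0
        for _ in range(max_epochs):
            mistakes = 0
            for xi, yi in zip(X, y):
                if yi * (np.dot(w, xi) + b) <= 0:  # misclassified point
                    w += lr * yi * xi              # SGD step on -y(w.x + b)
                    b += lr * yi
                    mistakes += 1
            if mistakes == 0:                      # converged (separable data)
                break
        return w, b

    # Toy example: two linearly separable points.
    X = np.array([[3.0, 3.0], [1.0, 1.0]])
    y = np.array([1, -1])
    w, b = perceptron(X, y)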

Weighted Gini coefficient in Python

删除回忆录丶 posted on 2019-12-11 20:04:38
Question: Here's a simple implementation of the Gini coefficient in Python, from https://stackoverflow.com/a/39513799/1840471:

    import numpy as np

    def gini(x):
        # Mean absolute difference over all pairs.
        mad = np.abs(np.subtract.outer(x, x)).mean()
        # Relative mean absolute difference.
        rmad = mad / np.mean(x)
        # Gini coefficient is half the relative mean absolute difference.
        return 0.5 * rmad

How can this be adjusted to take an array of weights as a second vector? It should accept non-integer weights, so simply blowing up the array by repeating entries according to the weights is not an option.
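A quick sanity check of this function (example values mine):

    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    print(gini(x))  # prints roughly 0.2667 for this array

A weighted variant appears in the last question on this page.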

Custom loss function in Keras, how to deal with placeholders

不羁的心 posted on 2019-12-03 17:28:17
Question: I am trying to write a custom loss function in TF/Keras. The loss function works when it is run in a session and passed constants; however, it stops working when compiled into a Keras model. The cost function (thanks to Lior for converting it to TF):

    import tensorflow as tf
    from keras import backend as K

    def ginicTF(actual, pred):
        n = int(actual.get_shape()[-1])
        # Indices that reorder entries by prediction rank.
        inds = K.reverse(tf.nn.top_k(pred, n)[1], axes=[0])
        a_s = K.gather(actual, inds)  # actuals reordered by prediction
        a_c = K.cumsum(a_s)
        giniSum = K.sum(a_c) / K.sum(a_s) - (n + 1) / 2.0
        return giniSum / n

    def gini_normalizedTF(a, p):
        return -ginicTF(a, p) / ginicTF(a, a)

    # Test the cost function
    sess = tf.InteractiveSession()
    p = [0.9, 0.3, 0
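One frequently suggested workaround, sketched below (an assumption on my part, not a confirmed answer from this thread): when Keras compiles the loss, actual is a placeholder whose dimensions may be None, so int(actual.get_shape()[-1]) can fail; fixing n in a closure avoids querying the placeholder's shape:

    import tensorflow as tf
    from keras import backend as K

    def make_gini_loss(n):
        # `n` is the fixed number of entries the loss sees per call,
        # supplied up front instead of read from a placeholder's shape.
        def ginic(actual, pred):
            inds = K.reverse(tf.nn.top_k(pred, n)[1], axes=[0])
            a_s = K.gather(actual, inds)
            a_c = K.cumsum(a_s)
            gini_sum = K.sum(a_c) / K.sum(a_s) - (n + 1) / 2.0
            return gini_sum / n
        def loss(actual, pred):
            # Negated normalized Gini, so minimizing the loss maximizes Gini.
            return -ginic(actual, pred) / ginic(actual, actual)
        return loss

    # Hypothetical usage, assuming every batch contains exactly 32 rows:
    # model.compile(optimizer='adam', loss=make_gini_loss(32))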

More efficient weighted Gini coefficient in Python

为君一笑 posted on 2019-11-28 03:49:32
Question: Per https://stackoverflow.com/a/48981834/1840471, this is an implementation of the weighted Gini coefficient in Python:

    import numpy as np

    def gini(x, weights=None):
        if weights is None:
            weights = np.ones_like(x)
        # Calculate the weighted mean absolute difference in two steps.
        count = np.multiply.outer(weights, weights)
        mad = np.abs(np.subtract.outer(x, x) * count).sum() / count.sum()
        rmad = mad / np.average(x, weights=weights)
        # Gini equals half the relative mean absolute difference.
        return 0.5 * rmad
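The title asks for something more efficient: the outer products above cost O(n^2) time and memory. A common alternative (a sketch of my own, not necessarily the accepted answer) sorts once and integrates the Lorenz curve, which agrees with the pairwise definition for nonnegative x:

    import numpy as np

    def gini_sorted(x, weights=None):
        # O(n log n) weighted Gini via the Lorenz curve.
        x = np.asarray(x, dtype=float)
        w = np.ones_like(x) if weights is None else np.asarray(weights, dtype=float)
        order = np.argsort(x)
        x, w = x[order], w[order]
        # Cumulative population share p and value share L, each starting at 0.
        cum_w = np.concatenate(([0.0], np.cumsum(w)))
        cum_xw = np.concatenate(([0.0], np.cumsum(x * w)))
        p = cum_w / cum_w[-1]
        L = cum_xw / cum_xw[-1]
        # Gini = 1 - 2 * (area under the Lorenz curve), via the trapezoid rule.
        return 1.0 - np.sum((L[1:] + L[:-1]) * np.diff(p))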