Problem
The table below shows what eight students did on the day before an exam, together with their exam results:
| Failed | Drank | Went shopping | Studied |
| :---: | :---: | :---: | :---: |
| 1 | 1 | 1 | 0 |
| 0 | 0 | 0 | 1 |
| 0 | 1 | 0 | 1 |
| 1 | 1 | 0 | 0 |
| 1 | 0 | 1 | 0 |
| 0 | 0 | 1 | 1 |
| 0 | 0 | 1 | 0 |
| 0 | 1 | 0 | 1 |
Using the data above and the naive Bayes principle, determine whether a student who did not drink, did not go shopping, and did study will fail the exam.
Algorithm steps
The main goal of this naive Bayes classification problem is to compute $P(y=1 \mid x_1,x_2,x_3)$ and $P(y=0 \mid x_1,x_2,x_3)$, and to make the decision by comparing the two.
Here $y$ indicates whether the student fails, and $x_1$, $x_2$, $x_3$ indicate whether the student drank, went shopping, and studied, respectively.
For conditional probabilities we have
$$P(A \mid B)=\frac{P(AB)}{P(B)}, \qquad P(B \mid A)=\frac{P(AB)}{P(A)},$$
from which Bayes' theorem follows:
$$P(A \mid B)=\frac{P(B \mid A)\,P(A)}{P(B)}.$$
Applying this to our problem:
$$P(y \mid x_1,x_2,x_3)=\frac{P(x_1,x_2,x_3 \mid y)\,P(y)}{P(x_1,x_2,x_3)}.$$
Under the naive Bayes assumption that the features are conditionally independent given the class,
$$P(x_1,x_2,x_3 \mid y)=P(x_1 \mid y)\times P(x_2 \mid y)\times P(x_3 \mid y).$$
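Each factor on the right-hand side is obtained by simple counting over the training table. As a rough illustration of that counting, here is a minimal sketch (the helper names `estimate_bernoulli_nb_params` and `naive_bayes_score` are made up for this note, and binary 0/1 features are assumed):

```python
import numpy as np

def estimate_bernoulli_nb_params(X, y):
    """Estimate P(y=c) and P(x_j=1 | y=c) by frequency counting."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]               # training rows belonging to class c
        prior = len(Xc) / len(X)     # P(y = c)
        cond = Xc.mean(axis=0)       # P(x_j = 1 | y = c), one value per feature
        params[c] = (prior, cond)
    return params

def naive_bayes_score(x, prior, cond):
    """Unnormalised posterior P(y=c) * prod_j P(x_j | y=c) for a 0/1 query x."""
    per_feature = np.where(x == 1, cond, 1.0 - cond)
    return prior * per_feature.prod()

# Usage: params = estimate_bernoulli_nb_params(X, y)
#        score0 = naive_bayes_score(np.array([0, 0, 1]), *params[0])
```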
Plugging in the query $x_1=0$, $x_2=0$, $x_3=1$, the two posteriors are
$$P(y=0 \mid x_1=0,x_2=0,x_3=1)=\frac{P(x_1=0,x_2=0,x_3=1 \mid y=0)\,P(y=0)}{P(x_1=0,x_2=0,x_3=1)},$$
$$P(y=1 \mid x_1=0,x_2=0,x_3=1)=\frac{P(x_1=0,x_2=0,x_3=1 \mid y=1)\,P(y=1)}{P(x_1=0,x_2=0,x_3=1)}.$$
Since
$$P(x_1=0,x_2=0,x_3=1 \mid y=0)=P(x_1=0 \mid y=0)\times P(x_2=0 \mid y=0)\times P(x_3=1 \mid y=0)=\frac{18}{64},$$
$$P(x_1=0,x_2=0,x_3=1 \mid y=1)=P(x_1=0 \mid y=1)\times P(x_2=0 \mid y=1)\times P(x_3=1 \mid y=1)=\frac{4}{64},$$
we have
$$P(y=0 \mid x_1=0,x_2=0,x_3=1)=\frac{\frac{18}{64}\,P(y=0)}{P(x_1=0,x_2=0,x_3=1)}, \qquad P(y=1 \mid x_1=0,x_2=0,x_3=1)=\frac{\frac{4}{64}\,P(y=1)}{P(x_1=0,x_2=0,x_3=1)}.$$
With $P(y=0)=P(y=1)=\frac{1}{2}$, this gives
$$P(y=0 \mid x_1=0,x_2=0,x_3=1)=\frac{18}{128\,P(x_1=0,x_2=0,x_3=1)}, \qquad P(y=1 \mid x_1=0,x_2=0,x_3=1)=\frac{4}{128\,P(x_1=0,x_2=0,x_3=1)}.$$
The posterior probability of not failing is the larger of the two, so the student is classified as $y=0$, i.e. predicted not to fail the exam.
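As an optional cross-check of this kind of hand calculation, the same eight rows can be fed to scikit-learn's `BernoulliNB`. This is only a sketch, assuming the table is encoded row by row in the order shown above; note that `BernoulliNB` applies Laplace smoothing (`alpha=1.0` by default), so its probabilities are smoothed estimates rather than the raw frequency ratios used above.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB

# Features: [drank, went_shopping, studied]; label: failed (1) or not (0),
# taken row by row from the table above.
X = np.array([
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 1],
    [0, 1, 0],
    [1, 0, 1],
])
y = np.array([1, 0, 0, 1, 1, 0, 0, 0])

clf = BernoulliNB()                 # Laplace smoothing with alpha=1.0 by default
clf.fit(X, y)

query = np.array([[0, 0, 1]])       # did not drink, did not shop, studied
print(clf.predict(query))           # predicted class (0 = pass, 1 = fail)
print(clf.predict_proba(query))     # smoothed posterior probabilities
```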
Python implementation
1. Import the required libraries
import numpy as np
from sklearn.naive_bayes import GaussianNB             # Gaussian naive Bayes classifier
from sklearn.datasets import load_digits               # handwritten-digit dataset
from sklearn.model_selection import train_test_split   # train/test splitting
from sklearn.metrics import confusion_matrix as CM     # confusion matrix
2. Load the data and split it into training and test sets
digits = load_digits()
X, y = digits.data, digits.target

# Hold out 30% of the samples as a test set
Xtrain, Xtest, Ytrain, Ytest = train_test_split(X, y, test_size=0.3, random_state=420)
print(Xtrain.shape)
print(Xtest.shape)
print(Ytrain.shape)
print(Ytest.shape)
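Before fitting, it helps to look at what these features are: each digit is an 8×8 grey-scale image flattened into 64 pixel intensities, which is why the continuous-feature `GaussianNB` is used in the next step instead of the counting-based reasoning from the toy example. A small optional inspection sketch (not part of the original steps):

```python
# Optional: inspect the data before fitting
print(digits.images.shape)           # (1797, 8, 8): each sample is an 8x8 image
print(Xtrain.min(), Xtrain.max())    # pixel intensities lie in the range 0..16
print(np.unique(Ytrain))             # the ten digit classes 0..9
```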
3. Naive Bayes classification
gnb = GaussianNB().fit(Xtrain, Ytrain)   # fit a Gaussian naive Bayes model

acc_score = gnb.score(Xtest, Ytest)      # mean accuracy on the test set
print(acc_score)

Y_pred = gnb.predict(Xtest)              # predicted digit for every test sample
print(Y_pred)

prob = gnb.predict_proba(Xtest)          # per-class posterior probabilities
print(prob.shape)
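The probability matrix returned by `predict_proba` is consistent with the hard predictions: each row is a distribution over the ten digit classes, and the class with the largest probability is the one returned by `predict`. A quick check, reusing the variables above:

```python
# Each row of `prob` sums to (approximately) 1
print(prob.sum(axis=1))
# The most probable class per row matches gnb.predict
print((gnb.classes_[prob.argmax(axis=1)] == Y_pred).all())
```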
4. Inspect the classification results with a confusion matrix
print(CM(Ytest, Y_pred))   # rows: true digits, columns: predicted digits
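The raw count matrix can be hard to scan for ten classes. One optional follow-up (not part of the original code) is scikit-learn's `classification_report`, which summarises per-class precision, recall and F1 for the same predictions:

```python
from sklearn.metrics import classification_report

print(classification_report(Ytest, Y_pred))
```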