Entropy and Information Gain

Posted by 感情迁移 on 2019-12-14 02:17:35

Question


Simple question I hope.

If I have a set of data like this:

Classification  attribute-1  attribute-2

Correct         dog          dog 
Correct         dog          dog
Wrong           dog          cat 
Correct         cat          cat
Wrong           cat          dog
Wrong           cat          dog

Then what is the information gain of attribute-2 relative to attribute-1?

I've computed the entropy of the whole data set: -(3/6)log2(3/6) - (3/6)log2(3/6) = 1

Then I'm stuck! I think I also need to calculate the entropy of attribute-1 and attribute-2, and then use those three values in an information gain calculation?

Any help would be great,

Thank you :).


Answer 1:


Well, first you have to calculate the entropy of the class labels within each value of the attribute. Then you take the weighted average of those entropies and subtract it from the entropy of the whole data set; that difference is the information gain. Here is how it works out for attribute-1.

For attribute-1:

attr-1 = dog:
info([2c,1w]) = entropy(2/3, 1/3) ≈ 0.918

attr-1 = cat:
info([1c,2w]) = entropy(1/3, 2/3) ≈ 0.918

Expected information (the weighted average) for attribute-1:

info([2c,1w],[1c,2w]) = (3/6)*info([2c,1w]) + (3/6)*info([1c,2w]) ≈ 0.918

Gain for attribute-1:

gain("attr-1") = info([3c,3w]) - info([2c,1w],[1c,2w]) = 1 - 0.918 ≈ 0.082

And you have to do the same for the next attribute.
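To double-check the arithmetic, here is a minimal Python sketch (mine, not part of the original answer; the helper names entropy and info_gain are made up for illustration) that computes the class entropy and the information gain of both attributes from the example table:

from collections import Counter
from math import log2

def entropy(labels):
    # Shannon entropy (in bits) of a sequence of class labels.
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def info_gain(labels, attribute):
    # Entropy of the whole set minus the expected entropy after
    # splitting on the attribute (the weighted-average "info" above).
    n = len(labels)
    groups = {}
    for label, value in zip(labels, attribute):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(subset) / n * entropy(subset)
                    for subset in groups.values())
    return entropy(labels) - remainder

# The six rows of the example table.
labels = ["Correct", "Correct", "Wrong", "Correct", "Wrong", "Wrong"]
attr1  = ["dog", "dog", "dog", "cat", "cat", "cat"]
attr2  = ["dog", "dog", "cat", "cat", "dog", "dog"]

print(entropy(labels))           # 1.0
print(info_gain(labels, attr1))  # ~0.082
print(info_gain(labels, attr2))  # 0.0

Running it prints 1.0, roughly 0.082, and 0.0. Note that attribute-2 has zero gain: within each of its values the classes are split exactly 50/50, so knowing attribute-2 tells you nothing about the classification.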



Source: https://stackoverflow.com/questions/5465447/entropy-and-information-gain
