Information Gain Notes | EduRev

1
Information Gain
Which test is more informative?
[Figure: two candidate splits on the loan-applicant data: one on whether Balance exceeds 50K (branches: Over 50K / Less or equal 50K), one on whether the applicant is employed (branches: Employed / Unemployed)]
2
Information Gain
Impurity/Entropy (informal)
– Measures the level of impurity in a group of examples
3
Impurity
[Figure: three example groups illustrating impurity levels: a very impure group, a less impure group, and a group with minimum impurity]
4
Entropy: a common way to measure impurity
• Entropy = -Σᵢ pᵢ log₂(pᵢ)
  where pᵢ is the probability of class i, computed as the proportion of class i in the set.
• Entropy comes from information theory. The higher the entropy, the more the information content.
What does that mean for learning from examples?
16/30 are green circles; 14/30 are pink crosses
log₂(16/30) ≈ -0.9;  log₂(14/30) ≈ -1.1
Entropy = -(16/30)(-0.9) - (14/30)(-1.1) ≈ 0.99
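The worked example above can be checked with a small Python sketch (the function name `entropy` and the use of class counts as input are illustrative choices, not from the slides):

```python
import math

def entropy(counts):
    """Shannon entropy (base 2) of a class distribution given as counts."""
    total = sum(counts)
    h = 0.0
    for c in counts:
        if c > 0:  # the 0 * log2(0) term is taken as 0
            p = c / total
            h -= p * math.log2(p)
    return h

# 16 green circles and 14 pink crosses out of 30 examples
print(round(entropy([16, 14]), 3))  # prints 0.997; the slide's .99 comes from rounding the logs first
```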
5
2-Class Cases:
• What is the entropy of a group in which all examples belong to the same class?
  – entropy = -1 log₂(1) = 0
  (minimum impurity: not a good training set for learning)
• What is the entropy of a group with 50% in either class?
  – entropy = -0.5 log₂(0.5) - 0.5 log₂(0.5) = 1
  (maximum impurity: a good training set for learning)
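The two boundary cases can be verified numerically; here is a minimal sketch specialized to two classes (the function name and the proportion-based signature are assumptions for illustration):

```python
import math

def two_class_entropy(p):
    """Entropy of a 2-class group where one class has proportion p."""
    if p in (0.0, 1.0):
        return 0.0  # a pure group: the 0 * log2(0) term is taken as 0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

print(two_class_entropy(1.0))  # 0.0 -> minimum impurity (all one class)
print(two_class_entropy(0.5))  # 1.0 -> maximum impurity (50/50 split)
```

Any proportion strictly between the two extremes gives an entropy strictly between 0 and 1.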