1 Information Gain

Which test is more informative?
• Split over whether Balance exceeds 50K (branches: Over 50K / Less or equal 50K)
• Split over whether the applicant is employed (branches: Employed / Unemployed)

2 Information Gain

Impurity/Entropy (informal) – Measures the level of impurity in a group of examples.

3 Impurity

Very impure group → Less impure → Minimum impurity

4 Entropy: a common way to measure impurity

• Entropy = -Σ_i p_i log2(p_i), where p_i is the probability of class i. Compute it as the proportion of class i in the set.
• Entropy comes from information theory: the higher the entropy, the higher the information content.

What does that mean for learning from examples?
16/30 are green circles; 14/30 are pink crosses.
log2(16/30) ≈ -0.9; log2(14/30) ≈ -1.1
Entropy = -(16/30)(-0.9) - (14/30)(-1.1) ≈ 0.99

5 2-Class Cases

• What is the entropy of a group in which all examples belong to the same class?
  – Entropy = -1 log2(1) = 0 (minimum impurity; not a good training set for learning)
• What is the entropy of a group with 50% in either class?
  – Entropy = -0.5 log2(0.5) - 0.5 log2(0.5) = 1 (maximum impurity; a good training set for learning)
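The entropy formula and the worked examples above can be checked with a short sketch (the function name `entropy` and its counts-based interface are mine, not from the slides):

```python
import math

def entropy(counts):
    """Entropy of a group, given the number of examples in each class."""
    total = sum(counts)
    h = 0.0
    for c in counts:
        if c > 0:  # a class with zero examples contributes 0 (p * log2(p) -> 0)
            p = c / total
            h -= p * math.log2(p)
    return h

# Worked example: 16 green circles and 14 pink crosses out of 30
print(entropy([16, 14]))  # ~0.997 (the slide gets 0.99 by rounding intermediates)

# The two extreme 2-class cases
print(entropy([30, 0]))   # all one class: 0.0, minimum impurity
print(entropy([15, 15]))  # 50/50 split:  1.0, maximum impurity
```

Note that the exact value for the 16/14 group is about 0.997; the slide's 0.99 comes from first rounding the logs to -0.9 and -1.1.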

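The "which test is more informative?" question can be answered numerically with information gain: parent entropy minus the size-weighted entropy of the child groups. The class counts below are hypothetical (the slides give no counts for the Balance and Employed splits); they only illustrate the computation:

```python
import math

def entropy(counts):
    """Entropy of a group, given the number of examples in each class."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child groups.

    parent: class counts before the split, e.g. [defaulted, repaid]
    children: one class-count list per branch of the test
    """
    n = sum(parent)
    weighted = sum(sum(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted

# Hypothetical counts [defaulted, repaid] -- NOT taken from the slides:
parent = [14, 16]

# Test A: Balance > 50K vs. Balance <= 50K
gain_balance = information_gain(parent, [[12, 4], [2, 12]])
# Test B: Employed vs. Unemployed
gain_employed = information_gain(parent, [[3, 9], [11, 7]])

print(f"gain(Balance)  = {gain_balance:.3f}")
print(f"gain(Employed) = {gain_employed:.3f}")
# Whichever test yields the larger gain is the more informative split.
```

A perfectly separating split recovers the entire parent entropy (gain = entropy of the parent), while a split whose branches mirror the parent's class mix gains nothing.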