Courses

Still More Stream Mining - PowerPoint Presentation Notes | EduRev

: Still More Stream Mining - PowerPoint Presentation Notes | EduRev

``` Page 1

1
Still More Stream-Mining
Frequent Itemsets
Elephants and Troops
Exponentially Decaying Windows
Page 2

1
Still More Stream-Mining
Frequent Itemsets
Elephants and Troops
Exponentially Decaying Windows
2
Counting Items
Problem: given a stream, which items
appear more than s times in the
window?
Possible solution: think of the stream of
baskets as one binary stream per item.
 1 = item present; 0 = not present.
 Use DGIM to estimate counts of 1’s for all
items.
Page 3

1
Still More Stream-Mining
Frequent Itemsets
Elephants and Troops
Exponentially Decaying Windows
2
Counting Items
Problem: given a stream, which items
appear more than s times in the
window?
Possible solution: think of the stream of
baskets as one binary stream per item.
 1 = item present; 0 = not present.
 Use DGIM to estimate counts of 1’s for all
items.
3
Extensions
 In principle, you could count frequent
pairs or even larger sets the same way.
 One stream per itemset.
 Drawbacks:
1. Only approximate.
2. Number of itemsets is way too big.
Page 4

1
Still More Stream-Mining
Frequent Itemsets
Elephants and Troops
Exponentially Decaying Windows
2
Counting Items
Problem: given a stream, which items
appear more than s times in the
window?
Possible solution: think of the stream of
baskets as one binary stream per item.
 1 = item present; 0 = not present.
 Use DGIM to estimate counts of 1’s for all
items.
3
Extensions
 In principle, you could count frequent
pairs or even larger sets the same way.
 One stream per itemset.
 Drawbacks:
1. Only approximate.
2. Number of itemsets is way too big.
4
Approaches
1. “Elephants and troops”: a heuristic
way to converge on unusually strongly
connected itemsets.
2. Exponentially decaying windows: a
heuristic for selecting likely frequent
itemsets.
Page 5

1
Still More Stream-Mining
Frequent Itemsets
Elephants and Troops
Exponentially Decaying Windows
2
Counting Items
Problem: given a stream, which items
appear more than s times in the
window?
Possible solution: think of the stream of
baskets as one binary stream per item.
 1 = item present; 0 = not present.
 Use DGIM to estimate counts of 1’s for all
items.
3
Extensions
 In principle, you could count frequent
pairs or even larger sets the same way.
 One stream per itemset.
 Drawbacks:
1. Only approximate.
2. Number of itemsets is way too big.
4
Approaches
1. “Elephants and troops”: a heuristic
way to converge on unusually strongly
connected itemsets.
2. Exponentially decaying windows: a
heuristic for selecting likely frequent
itemsets.
5
Elephants and Troops
When Sergey Brin wasn’t worrying