Page 1 1 CSEP 590 Data Compression Autumn 2007 Adaptive Huffman Coding CSEP 590 - Lecture 2 - Autumn 2007 2 Adaptive Huffman Coding • One pass • During the pass calculate the frequencies • Update the Huffman tree accordingly – Coder – new Huffman tree computed after transmitting the symbol – Decoder – new Huffman tree computed after receiving the symbol • Symbol set and their initial codes must be known ahead of time. • Need NYT (not yet transmitted symbol) to indicate a new leaf is needed in the tree. CSEP 590 - Lecture 2 - Autumn 2007 3 Optimal Tree Numbering • a : 5, b: 2, c : 1, d : 3 a c b d CSEP 590 - Lecture 2 - Autumn 2007 4 Weight the Nodes • a : 5, b: 2, c : 1, d : 3 11 6 3 1 2 5 3 a c b d CSEP 590 - Lecture 2 - Autumn 2007 5 Number the Nodes • a : 5, b: 2, c : 1, d : 3 11 6 3 1 2 5 3 a c b d Number the nodes as they are removed from the priority queue. 2 7 5 6 3 4 1 CSEP 590 - Lecture 2 - Autumn 2007 6 Adaptive Huffman Principle • In an optimal tree for n symbols there is a numbering of the nodes y 1 <y 2 <... <y 2n-1 such that their corresponding weights x 1 ,x 2 , ... , x 2n-1 satisfy: – x 1 < x 2 < ... < x 2n-1 – siblings are numbered consecutively • And vice versa – That is, if there is such a numbering then the tree is optimal. We call this the node number invariant. Page 2 1 CSEP 590 Data Compression Autumn 2007 Adaptive Huffman Coding CSEP 590 - Lecture 2 - Autumn 2007 2 Adaptive Huffman Coding • One pass • During the pass calculate the frequencies • Update the Huffman tree accordingly – Coder – new Huffman tree computed after transmitting the symbol – Decoder – new Huffman tree computed after receiving the symbol • Symbol set and their initial codes must be known ahead of time. • Need NYT (not yet transmitted symbol) to indicate a new leaf is needed in the tree. CSEP 590 - Lecture 2 - Autumn 2007 3 Optimal Tree Numbering • a : 5, b: 2, c : 1, d : 3 a c b d CSEP 590 - Lecture 2 - Autumn 2007 4 Weight the Nodes • a : 5, b: 2, c : 1, d : 3 11 6 3 1 2 5 3 a c b d CSEP 590 - Lecture 2 - Autumn 2007 5 Number the Nodes • a : 5, b: 2, c : 1, d : 3 11 6 3 1 2 5 3 a c b d Number the nodes as they are removed from the priority queue. 2 7 5 6 3 4 1 CSEP 590 - Lecture 2 - Autumn 2007 6 Adaptive Huffman Principle • In an optimal tree for n symbols there is a numbering of the nodes y 1 <y 2 <... <y 2n-1 such that their corresponding weights x 1 ,x 2 , ... , x 2n-1 satisfy: – x 1 < x 2 < ... < x 2n-1 – siblings are numbered consecutively • And vice versa – That is, if there is such a numbering then the tree is optimal. We call this the node number invariant. 2 CSEP 590 - Lecture 2 - Autumn 2007 7 Initialization • Symbols a 1 , a 2 , ... ,a m have a basic prefix code, used when symbols are first encountered. • Example: a, b ,c, d, e, f, g, h, i, j 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 a h g f e d c b j i CSEP 590 - Lecture 2 - Autumn 2007 8 Initialization • The tree will encode up to m + 1 symbols including NYT. • We reserve numbers 1 to 2m + 1 for node numbering. • The initial Huffman tree consists of a single node 0 2m + 1 weight node number NYT CSEP 590 - Lecture 2 - Autumn 2007 9 Coding Algorithm 1. If a new symbol is encountered then output the code for NYT followed by the fixed code for the symbol. Add the new symbol to the tree. 2. If an old symbol is encountered then output its code. 3. Update the tree to preserve the node number invariant. CSEP 590 - Lecture 2 - Autumn 2007 10 Decoding Algorithm 1. Decode the symbol using the current tree. 2. If NYT is encountered then use the fixed code to decode the symbol. Add the new symbol to the tree. 3. Update the tree to preserve the node number invariant. CSEP 590 - Lecture 2 - Autumn 2007 11 Updating the Tree 1. Let y be leaf (symbol) with current weight x.* 2. If y the root update x by 1, otherwise, 3. Exchange y with the largest numbered node with the same weight (unless it is the parent).** 4. Update x by 1 5. Let y be the parent with its weight x and go to 2. *We never update the weight of NYT ** This exchange will preserve the node number invariant CSEP 590 - Lecture 2 - Autumn 2007 12 Example • aabcdad in alphabet {a,b,..., j} 0 21 output = 000 fixed code NYT fixed code for a Page 3 1 CSEP 590 Data Compression Autumn 2007 Adaptive Huffman Coding CSEP 590 - Lecture 2 - Autumn 2007 2 Adaptive Huffman Coding • One pass • During the pass calculate the frequencies • Update the Huffman tree accordingly – Coder – new Huffman tree computed after transmitting the symbol – Decoder – new Huffman tree computed after receiving the symbol • Symbol set and their initial codes must be known ahead of time. • Need NYT (not yet transmitted symbol) to indicate a new leaf is needed in the tree. CSEP 590 - Lecture 2 - Autumn 2007 3 Optimal Tree Numbering • a : 5, b: 2, c : 1, d : 3 a c b d CSEP 590 - Lecture 2 - Autumn 2007 4 Weight the Nodes • a : 5, b: 2, c : 1, d : 3 11 6 3 1 2 5 3 a c b d CSEP 590 - Lecture 2 - Autumn 2007 5 Number the Nodes • a : 5, b: 2, c : 1, d : 3 11 6 3 1 2 5 3 a c b d Number the nodes as they are removed from the priority queue. 2 7 5 6 3 4 1 CSEP 590 - Lecture 2 - Autumn 2007 6 Adaptive Huffman Principle • In an optimal tree for n symbols there is a numbering of the nodes y 1 <y 2 <... <y 2n-1 such that their corresponding weights x 1 ,x 2 , ... , x 2n-1 satisfy: – x 1 < x 2 < ... < x 2n-1 – siblings are numbered consecutively • And vice versa – That is, if there is such a numbering then the tree is optimal. We call this the node number invariant. 2 CSEP 590 - Lecture 2 - Autumn 2007 7 Initialization • Symbols a 1 , a 2 , ... ,a m have a basic prefix code, used when symbols are first encountered. • Example: a, b ,c, d, e, f, g, h, i, j 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 a h g f e d c b j i CSEP 590 - Lecture 2 - Autumn 2007 8 Initialization • The tree will encode up to m + 1 symbols including NYT. • We reserve numbers 1 to 2m + 1 for node numbering. • The initial Huffman tree consists of a single node 0 2m + 1 weight node number NYT CSEP 590 - Lecture 2 - Autumn 2007 9 Coding Algorithm 1. If a new symbol is encountered then output the code for NYT followed by the fixed code for the symbol. Add the new symbol to the tree. 2. If an old symbol is encountered then output its code. 3. Update the tree to preserve the node number invariant. CSEP 590 - Lecture 2 - Autumn 2007 10 Decoding Algorithm 1. Decode the symbol using the current tree. 2. If NYT is encountered then use the fixed code to decode the symbol. Add the new symbol to the tree. 3. Update the tree to preserve the node number invariant. CSEP 590 - Lecture 2 - Autumn 2007 11 Updating the Tree 1. Let y be leaf (symbol) with current weight x.* 2. If y the root update x by 1, otherwise, 3. Exchange y with the largest numbered node with the same weight (unless it is the parent).** 4. Update x by 1 5. Let y be the parent with its weight x and go to 2. *We never update the weight of NYT ** This exchange will preserve the node number invariant CSEP 590 - Lecture 2 - Autumn 2007 12 Example • aabcdad in alphabet {a,b,..., j} 0 21 output = 000 fixed code NYT fixed code for a 3 CSEP 590 - Lecture 2 - Autumn 2007 13 Example • aabcdad 0 19 output = 000 1 1 20 0 1 21 a NYT CSEP 590 - Lecture 2 - Autumn 2007 14 Example • aabcdad 0 output = 0001 1 1 0 1 21 a NYT 19 20 CSEP 590 - Lecture 2 - Autumn 2007 15 Example • aabcdad 0 output = 0001 2 1 0 1 21 a NYT 19 20 CSEP 590 - Lecture 2 - Autumn 2007 16 Example • aabcdad 0 output = 00010001 2 2 0 1 21 NYT fixed code for b a NYT 19 20 CSEP 590 - Lecture 2 - Autumn 2007 17 Example • aabcdad output = 00010001 2 2 0 1 21 a 20 0 19 a 0 17 0 18 0 1 b NYT CSEP 590 - Lecture 2 - Autumn 2007 18 Example • aabcdad output = 00010001 2 2 0 1 21 a 20 0 19 a 0 17 1 18 0 1 b NYT Page 4 1 CSEP 590 Data Compression Autumn 2007 Adaptive Huffman Coding CSEP 590 - Lecture 2 - Autumn 2007 2 Adaptive Huffman Coding • One pass • During the pass calculate the frequencies • Update the Huffman tree accordingly – Coder – new Huffman tree computed after transmitting the symbol – Decoder – new Huffman tree computed after receiving the symbol • Symbol set and their initial codes must be known ahead of time. • Need NYT (not yet transmitted symbol) to indicate a new leaf is needed in the tree. CSEP 590 - Lecture 2 - Autumn 2007 3 Optimal Tree Numbering • a : 5, b: 2, c : 1, d : 3 a c b d CSEP 590 - Lecture 2 - Autumn 2007 4 Weight the Nodes • a : 5, b: 2, c : 1, d : 3 11 6 3 1 2 5 3 a c b d CSEP 590 - Lecture 2 - Autumn 2007 5 Number the Nodes • a : 5, b: 2, c : 1, d : 3 11 6 3 1 2 5 3 a c b d Number the nodes as they are removed from the priority queue. 2 7 5 6 3 4 1 CSEP 590 - Lecture 2 - Autumn 2007 6 Adaptive Huffman Principle • In an optimal tree for n symbols there is a numbering of the nodes y 1 <y 2 <... <y 2n-1 such that their corresponding weights x 1 ,x 2 , ... , x 2n-1 satisfy: – x 1 < x 2 < ... < x 2n-1 – siblings are numbered consecutively • And vice versa – That is, if there is such a numbering then the tree is optimal. We call this the node number invariant. 2 CSEP 590 - Lecture 2 - Autumn 2007 7 Initialization • Symbols a 1 , a 2 , ... ,a m have a basic prefix code, used when symbols are first encountered. • Example: a, b ,c, d, e, f, g, h, i, j 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 a h g f e d c b j i CSEP 590 - Lecture 2 - Autumn 2007 8 Initialization • The tree will encode up to m + 1 symbols including NYT. • We reserve numbers 1 to 2m + 1 for node numbering. • The initial Huffman tree consists of a single node 0 2m + 1 weight node number NYT CSEP 590 - Lecture 2 - Autumn 2007 9 Coding Algorithm 1. If a new symbol is encountered then output the code for NYT followed by the fixed code for the symbol. Add the new symbol to the tree. 2. If an old symbol is encountered then output its code. 3. Update the tree to preserve the node number invariant. CSEP 590 - Lecture 2 - Autumn 2007 10 Decoding Algorithm 1. Decode the symbol using the current tree. 2. If NYT is encountered then use the fixed code to decode the symbol. Add the new symbol to the tree. 3. Update the tree to preserve the node number invariant. CSEP 590 - Lecture 2 - Autumn 2007 11 Updating the Tree 1. Let y be leaf (symbol) with current weight x.* 2. If y the root update x by 1, otherwise, 3. Exchange y with the largest numbered node with the same weight (unless it is the parent).** 4. Update x by 1 5. Let y be the parent with its weight x and go to 2. *We never update the weight of NYT ** This exchange will preserve the node number invariant CSEP 590 - Lecture 2 - Autumn 2007 12 Example • aabcdad in alphabet {a,b,..., j} 0 21 output = 000 fixed code NYT fixed code for a 3 CSEP 590 - Lecture 2 - Autumn 2007 13 Example • aabcdad 0 19 output = 000 1 1 20 0 1 21 a NYT CSEP 590 - Lecture 2 - Autumn 2007 14 Example • aabcdad 0 output = 0001 1 1 0 1 21 a NYT 19 20 CSEP 590 - Lecture 2 - Autumn 2007 15 Example • aabcdad 0 output = 0001 2 1 0 1 21 a NYT 19 20 CSEP 590 - Lecture 2 - Autumn 2007 16 Example • aabcdad 0 output = 00010001 2 2 0 1 21 NYT fixed code for b a NYT 19 20 CSEP 590 - Lecture 2 - Autumn 2007 17 Example • aabcdad output = 00010001 2 2 0 1 21 a 20 0 19 a 0 17 0 18 0 1 b NYT CSEP 590 - Lecture 2 - Autumn 2007 18 Example • aabcdad output = 00010001 2 2 0 1 21 a 20 0 19 a 0 17 1 18 0 1 b NYT 4 CSEP 590 - Lecture 2 - Autumn 2007 19 Example • aabcdad output = 00010001 2 2 0 1 21 a 20 0 19 a 1 17 1 18 0 1 b NYT CSEP 590 - Lecture 2 - Autumn 2007 20 Example • aabcdad 2 3 0 1 21 a 20 0 19 a 1 17 1 18 0 1 b output = 0001000100010 NYT fixed code for c NYT CSEP 590 - Lecture 2 - Autumn 2007 21 Example • aabcdad 19 output = 0001000100010 2 3 20 0 1 21 a 1 17 1 18 0 1 0 a 0 15 0 16 0 1 NYT a b c CSEP 590 - Lecture 2 - Autumn 2007 22 Example • aabcdad 19 output = 0001000100010 2 3 20 0 1 21 a 1 17 1 18 0 1 0 a 0 15 1 16 0 1 NYT a b c CSEP 590 - Lecture 2 - Autumn 2007 23 Example • aabcdad 19 output = 0001000100010 2 3 20 0 1 21 a 1 17 1 18 0 1 0 a 1 15 1 16 0 1 NYT a b c CSEP 590 - Lecture 2 - Autumn 2007 24 Example • aabcdad 19 output = 0001000100010 2 3 20 0 1 21 a 2 17 1 18 0 1 0 a 1 15 1 16 0 1 NYT a b c Page 5 1 CSEP 590 Data Compression Autumn 2007 Adaptive Huffman Coding CSEP 590 - Lecture 2 - Autumn 2007 2 Adaptive Huffman Coding • One pass • During the pass calculate the frequencies • Update the Huffman tree accordingly – Coder – new Huffman tree computed after transmitting the symbol – Decoder – new Huffman tree computed after receiving the symbol • Symbol set and their initial codes must be known ahead of time. • Need NYT (not yet transmitted symbol) to indicate a new leaf is needed in the tree. CSEP 590 - Lecture 2 - Autumn 2007 3 Optimal Tree Numbering • a : 5, b: 2, c : 1, d : 3 a c b d CSEP 590 - Lecture 2 - Autumn 2007 4 Weight the Nodes • a : 5, b: 2, c : 1, d : 3 11 6 3 1 2 5 3 a c b d CSEP 590 - Lecture 2 - Autumn 2007 5 Number the Nodes • a : 5, b: 2, c : 1, d : 3 11 6 3 1 2 5 3 a c b d Number the nodes as they are removed from the priority queue. 2 7 5 6 3 4 1 CSEP 590 - Lecture 2 - Autumn 2007 6 Adaptive Huffman Principle • In an optimal tree for n symbols there is a numbering of the nodes y 1 <y 2 <... <y 2n-1 such that their corresponding weights x 1 ,x 2 , ... , x 2n-1 satisfy: – x 1 < x 2 < ... < x 2n-1 – siblings are numbered consecutively • And vice versa – That is, if there is such a numbering then the tree is optimal. We call this the node number invariant. 2 CSEP 590 - Lecture 2 - Autumn 2007 7 Initialization • Symbols a 1 , a 2 , ... ,a m have a basic prefix code, used when symbols are first encountered. • Example: a, b ,c, d, e, f, g, h, i, j 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 a h g f e d c b j i CSEP 590 - Lecture 2 - Autumn 2007 8 Initialization • The tree will encode up to m + 1 symbols including NYT. • We reserve numbers 1 to 2m + 1 for node numbering. • The initial Huffman tree consists of a single node 0 2m + 1 weight node number NYT CSEP 590 - Lecture 2 - Autumn 2007 9 Coding Algorithm 1. If a new symbol is encountered then output the code for NYT followed by the fixed code for the symbol. Add the new symbol to the tree. 2. If an old symbol is encountered then output its code. 3. Update the tree to preserve the node number invariant. CSEP 590 - Lecture 2 - Autumn 2007 10 Decoding Algorithm 1. Decode the symbol using the current tree. 2. If NYT is encountered then use the fixed code to decode the symbol. Add the new symbol to the tree. 3. Update the tree to preserve the node number invariant. CSEP 590 - Lecture 2 - Autumn 2007 11 Updating the Tree 1. Let y be leaf (symbol) with current weight x.* 2. If y the root update x by 1, otherwise, 3. Exchange y with the largest numbered node with the same weight (unless it is the parent).** 4. Update x by 1 5. Let y be the parent with its weight x and go to 2. *We never update the weight of NYT ** This exchange will preserve the node number invariant CSEP 590 - Lecture 2 - Autumn 2007 12 Example • aabcdad in alphabet {a,b,..., j} 0 21 output = 000 fixed code NYT fixed code for a 3 CSEP 590 - Lecture 2 - Autumn 2007 13 Example • aabcdad 0 19 output = 000 1 1 20 0 1 21 a NYT CSEP 590 - Lecture 2 - Autumn 2007 14 Example • aabcdad 0 output = 0001 1 1 0 1 21 a NYT 19 20 CSEP 590 - Lecture 2 - Autumn 2007 15 Example • aabcdad 0 output = 0001 2 1 0 1 21 a NYT 19 20 CSEP 590 - Lecture 2 - Autumn 2007 16 Example • aabcdad 0 output = 00010001 2 2 0 1 21 NYT fixed code for b a NYT 19 20 CSEP 590 - Lecture 2 - Autumn 2007 17 Example • aabcdad output = 00010001 2 2 0 1 21 a 20 0 19 a 0 17 0 18 0 1 b NYT CSEP 590 - Lecture 2 - Autumn 2007 18 Example • aabcdad output = 00010001 2 2 0 1 21 a 20 0 19 a 0 17 1 18 0 1 b NYT 4 CSEP 590 - Lecture 2 - Autumn 2007 19 Example • aabcdad output = 00010001 2 2 0 1 21 a 20 0 19 a 1 17 1 18 0 1 b NYT CSEP 590 - Lecture 2 - Autumn 2007 20 Example • aabcdad 2 3 0 1 21 a 20 0 19 a 1 17 1 18 0 1 b output = 0001000100010 NYT fixed code for c NYT CSEP 590 - Lecture 2 - Autumn 2007 21 Example • aabcdad 19 output = 0001000100010 2 3 20 0 1 21 a 1 17 1 18 0 1 0 a 0 15 0 16 0 1 NYT a b c CSEP 590 - Lecture 2 - Autumn 2007 22 Example • aabcdad 19 output = 0001000100010 2 3 20 0 1 21 a 1 17 1 18 0 1 0 a 0 15 1 16 0 1 NYT a b c CSEP 590 - Lecture 2 - Autumn 2007 23 Example • aabcdad 19 output = 0001000100010 2 3 20 0 1 21 a 1 17 1 18 0 1 0 a 1 15 1 16 0 1 NYT a b c CSEP 590 - Lecture 2 - Autumn 2007 24 Example • aabcdad 19 output = 0001000100010 2 3 20 0 1 21 a 2 17 1 18 0 1 0 a 1 15 1 16 0 1 NYT a b c 5 CSEP 590 - Lecture 2 - Autumn 2007 25 Example • aabcdad 19 2 4 20 0 1 21 a 2 17 1 18 0 1 0 a 1 15 1 16 0 1 NYT a b c output = 0001000100010000011 NYT fixed code for d CSEP 590 - Lecture 2 - Autumn 2007 26 Example • aabcdad 19 2 4 20 0 1 21 a 2 17 1 18 0 1 a 1 15 1 16 0 1 NYT a b c output = 0001000100010000011 0 a 13 0 14 0 1 d 0 CSEP 590 - Lecture 2 - Autumn 2007 27 Example • aabcdad 19 2 4 20 0 1 21 a 2 17 1 18 0 1 a 1 15 1 16 0 1 NYT a b c output = 0001000100010000011 0 a 13 1 14 0 1 d 0 CSEP 590 - Lecture 2 - Autumn 2007 28 Example • aabcdad 19 2 4 20 0 1 21 a 2 17 1 18 0 1 a 1 15 1 16 0 1 NYT a b c output = 0001000100010000011 0 a 13 1 14 0 1 d 1 exchange! CSEP 590 - Lecture 2 - Autumn 2007 29 Example • aabcdad 19 2 4 20 0 1 21 2 17 1 18 0 1 a 1 15 1 16 0 1 NYT a b c output = 0001000100010000011 0 a 13 1 14 0 1 d 1 CSEP 590 - Lecture 2 - Autumn 2007 30 Example • aabcdad 19 2 4 20 0 1 21 2 17 1 18 0 1 a 2 15 1 16 0 1 NYT a b c output = 0001000100010000011 0 a 13 1 14 0 1 d 1 exchange!Read More

Offer running on EduRev: __Apply code STAYHOME200__ to get INR 200 off on our premium plan EduRev Infinity!