CSEP 590
Data Compression
Autumn 2007
Predictive Coding
Burrows-Wheeler Transform 
CSEP 590 - Lecture 6 - Autumn 2007 2
Predictive Coding
• The next symbol can be statistically predicted 
from the past.
– Code with context
– Code the difference
– Move to front, then code
• Goal of prediction
– The prediction should make the distribution of 
probabilities of the next symbol as skewed as 
possible
– After a good prediction there is nothing more to 
predict, so we are effectively left with a first-order 
entropy model
CSEP 590 - Lecture 6 - Autumn 2007 3
Bad and Good Prediction
• From information theory: the lower the 
information content of a symbol, the fewer bits 
are needed to code it.
• Examples:
– P(a) = 1023/1024, inf(a) ≈ .0014
– P(a) = 1/2, inf(a) = 1
– P(a) = 1/1024, inf(a) = 10
inf(a) = log2(1 / P(a))
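The examples above follow directly from this formula. As a quick check (not part of the original slides), here is a minimal Python sketch that reproduces them:

```python
import math

def information(p: float) -> float:
    """Self-information in bits: inf(a) = log2(1 / P(a))."""
    return math.log2(1.0 / p)

print(information(1023 / 1024))  # ~0.0014 bits: a near-certain symbol costs almost nothing
print(information(1 / 2))        # 1.0 bit
print(information(1 / 1024))     # 10.0 bits: a rare symbol is expensive to code
```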
CSEP 590 - Lecture 6 - Autumn 2007 4
Entropy
• Entropy is the expected number of bits to 
code a symbol in the model, with a_i having 
probability P(a_i).
• Good coders should be close to this bound.
– Arithmetic
– Huffman
– Golomb
– Tunstall
H = Σ_{i=1}^{m} P(a_i) log2(1 / P(a_i))
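As an illustration (not from the lecture), a small Python sketch of this definition; the probability lists are made up to show how a skewed distribution lowers the entropy:

```python
import math

def entropy(probs) -> float:
    """First-order entropy H = sum_i P(a_i) * log2(1 / P(a_i)), in bits per symbol."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

print(entropy([1023 / 1024, 1 / 1024]))   # ~0.011 bits/symbol: a highly skewed model
print(entropy([0.25, 0.25, 0.25, 0.25]))  # 2.0 bits/symbol: a uniform model
```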
CSEP 590 - Lecture 6 - Autumn 2007 5
PPM
• Prediction with Partial Matching
– Cleary and Witten (1984)
– Tries to find a good context to code the next 
symbol
• Uses adaptive arithmetic coding for each 
context
context     a    e    i    r    s    y
the         0    0    5    7    4    7
he         10    1    7   10    9    7
e          12    2   10   15   10   10
<nil>      50   70   30   35   40   13
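To make the context-selection idea concrete, here is a toy, heavily simplified sketch in the spirit of PPM. It is not the Cleary-Witten algorithm: it omits the adaptive arithmetic coder and escape probabilities, and the class name `PPMModel` and the printed trace are purely illustrative. It tries the longest context first and escapes to shorter contexts when the symbol has not been seen there:

```python
from collections import defaultdict, Counter

class PPMModel:
    """Toy PPM-style model: count symbols per context and fall back to
    shorter contexts (an 'escape') when the symbol is unseen. Real PPM
    feeds these counts to an adaptive arithmetic coder; here we only
    print which context would be used."""

    def __init__(self, max_order: int = 2):
        self.max_order = max_order
        self.counts = defaultdict(Counter)  # context string -> symbol counts

    def code_symbol(self, history: str, sym: str) -> None:
        for order in range(min(self.max_order, len(history)), -1, -1):
            ctx = history[len(history) - order:]
            if self.counts[ctx][sym] > 0:
                print(f"code {sym!r} in context {ctx!r} (order {order})")
                break
            print(f"escape from context {ctx!r}")
        else:
            print(f"code {sym!r} as a literal (never seen before)")
        # Update counts for every context length after coding the symbol.
        for order in range(min(self.max_order, len(history)) + 1):
            self.counts[history[len(history) - order:]][sym] += 1

model = PPMModel(max_order=2)
text = "the theme"
for i, ch in enumerate(text):
    model.code_symbol(text[:i], ch)
```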
CSEP 590 - Lecture 6 - Autumn 2007 6
JBIG
• Coder for binary images
– documents
– graphics
• Codes in scan line order using context from 
the same and previous scan lines.
• Uses adaptive arithmetic coding with context
[Diagram: a template of previously coded pixels from the current and 
previous scan lines forms the context for the next bit to be coded.]
CSEP 590 - Lecture 6 - Autumn 2007 7
JBIG Example
Context of all-zero neighbors:
0 0 0
0 0 0
0 0
0 0
next bit         0     1
frequency      100    10

Context with mixed neighbors:
0 1 1
0 1 1
0 1
1 0
next bit         0     1
frequency       15    50
H = (10/110) log2(110/10) + (100/110) log2(110/100) ≈ .44
H = (15/65) log2(65/15) + (50/65) log2(65/50) ≈ .78
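A small Python check of these two calculations (not part of the slides):

```python
import math

def context_entropy(n0: int, n1: int) -> float:
    """Empirical entropy, in bits per coded bit, of a context with counts n0 and n1."""
    total = n0 + n1
    return sum((n / total) * math.log2(total / n) for n in (n0, n1) if n > 0)

print(round(context_entropy(100, 10), 2))  # 0.44 -- the all-zero context strongly predicts 0
print(round(context_entropy(15, 50), 2))   # 0.78 -- the mixed context is less skewed
```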
CSEP 590 - Lecture 6 - Autumn 2007 8
Issues with Context
• Context dilution
– If there are too many contexts then too few 
symbols are coded in each context, making them 
ineffective because of the zero-frequency 
problem.
• Context saturation
– If there are too few contexts then each context 
mixes situations that more specific contexts would 
predict better.
• Wrong context
– A poorly chosen context is again a poor predictor.
CSEP 590 - Lecture 6 - Autumn 2007 9
Prediction by Differencing
• Used for Numerical Data
• Example: 2 3 4 5 6 7 8 7 6 5 4 3 2
• Transform to 2 1 1 1 1 1 1 –1 –1 –1 –1 –1 –1
– much lower first-order entropy
CSEP 590 - Lecture 6 - Autumn 2007 10
General Differencing
• Let x_1, x_2, ..., x_n be some numerical data that is 
correlated, that is, x_i is near x_{i+1}.
• Better compression can result from coding
x_1, x_2 - x_1, x_3 - x_2, ..., x_n - x_{n-1}
(sketched in code below).
• This idea is used in
– signal coding
– audio coding
– video coding
• There are fancier prediction methods based on linear 
combinations of previous data, but these may require 
training.
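A minimal sketch of plain differencing (the simplest predictor, not the fancier trained predictors mentioned above); the function names are illustrative:

```python
def difference_encode(xs):
    """Code x_1, then the differences x_i - x_(i-1)."""
    return [xs[0]] + [xs[i] - xs[i - 1] for i in range(1, len(xs))]

def difference_decode(ds):
    """Invert the transform with a running (prefix) sum."""
    xs = [ds[0]]
    for d in ds[1:]:
        xs.append(xs[-1] + d)
    return xs

data = [2, 3, 4, 5, 6, 7, 8, 7, 6, 5, 4, 3, 2]
print(difference_encode(data))  # [2, 1, 1, 1, 1, 1, 1, -1, -1, -1, -1, -1, -1]
print(difference_decode(difference_encode(data)) == data)  # True
```

The residuals take only a few distinct values, so their first-order entropy is much lower than that of the raw data.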
CSEP 590 - Lecture 6 - Autumn 2007 11
Move to Front Coding
• Non-numerical data
• The data have a relatively small working set 
that changes over the sequence.
• Example: a b a b a a b c c b b c c c c b d b c c
• Move to Front algorithm
– Symbols are kept in a list indexed 0 to m-1
– To code a symbol, output its index and move the 
symbol to the front of the list (see the sketch below)
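A short Python sketch of the algorithm as described above (the encoder/decoder names are illustrative), run on the example sequence:

```python
def mtf_encode(symbols, alphabet):
    """Output each symbol's current index, then move that symbol to the front."""
    table = list(alphabet)
    out = []
    for s in symbols:
        i = table.index(s)
        out.append(i)
        table.insert(0, table.pop(i))
    return out

def mtf_decode(indices, alphabet):
    """Mirror the encoder: look up the symbol at each index, then move it to the front."""
    table = list(alphabet)
    out = []
    for i in indices:
        s = table[i]
        out.append(s)
        table.insert(0, table.pop(i))
    return out

seq = "ababaabccbbccccbdbcc"   # a b a b a a b c c b b c c c c b d b c c
codes = mtf_encode(seq, "abcd")
print(codes)                    # [0, 1, 1, 1, 1, 0, 1, 2, 0, 1, 0, 1, 0, 0, 0, 1, 3, 1, 2, 0]
print("".join(mtf_decode(codes, "abcd")) == seq)  # True
```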
CSEP 590 - Lecture 6 - Autumn 2007 12
Example
• Example: a b a b a a b c c b b c c c c b d b c c
0
0   1    2    3
a   b    c    d
CSEP 590 - Lecture 6 - Autumn 2007 13
Example
• Example: a b a b a a b c c b b c c c c b d b c c
0 1
0   1    2    3
b   a    c    d
0   1    2    3
a   b    c    d
CSEP 590 - Lecture 6 - Autumn 2007 14
Example
• Example: a b a b a a b c c b b c c c c b d b c c
0 1 1
0   1    2    3
a   b    c    d
0   1    2    3
b   a    c    d
CSEP 590 - Lecture 6 - Autumn 2007 15
Example
• Example: a b a b a a b c c b b c c c c b d b c c
0 1 1 1
0   1    2    3
b   a    c    d
0   1    2    3
a   b    c    d
CSEP 590 - Lecture 6 - Autumn 2007 16
Example
• Example: a b a b a a b c c b b c c c c b d b c c
0 1 1 1 1
0   1    2    3
a   b    c    d
0   1    2    3
b   a    c    d
CSEP 590 - Lecture 6 - Autumn 2007 17
Example
• Example: a b a b a a b c c b b c c c c b d b c c
0 1 1 1 1 0
0   1    2    3
a   b    c    d
CSEP 590 - Lecture 6 - Autumn 2007 18
Example
• Example: a b a b a a b c c b b c c c c b d b c c
0 1 1 1 1 0 1
0   1    2    3
b   a    c    d
0   1    2    3
a   b    c    d
CSEP 590 - Lecture 6 - Autumn 2007 19
Example
• Example: a b a b a a b c c b b c c c c b d b c c
0 1 1 1 1 0 1 2
0   1    2    3
c    b   a    d
0   1    2    3
b   a    c    d
CSEP 590 - Lecture 6 - Autumn 2007 20
Example
• Example: a b a b a a b c c b b c c c c b d b c c
0 1 1 1 1 0 1 2 0 1 0 1 0 0 0 1 3 1 2 0
0   1    2    3
c    b   d    a
CSEP 590 - Lecture 6 - Autumn 2007 21
Example
• Example: a b a b a a b c c b b c c c c b d b c c
0 1 1 1 1 0 1 2 0 1 0 1 0 0 0 1 3 1 2 0
Frequencies of {a, b, c, d}
a  b  c  d       
4  7  8  1
Frequencies of {0, 1, 2, 3}
0  1  2  3
8  9  2  1
CSEP 590 - Lecture 6 - Autumn 2007 22
Extreme Example
Input:
aaaaaaaaaabbbbbbbbbbccccccccccdddddddddd
Output:
0000000000100000000020000000003000000000
Frequencies of {a, b, c, d}
a   b   c   d       
10 10 10 10
Frequencies of {0, 1, 2, 3}
0   1  2  3
37  1  1  1
CSEP 590 - Lecture 6 - Autumn 2007 23
Burrows-Wheeler Transform
• Burrows and Wheeler (1994)
• BW Transform creates a representation of the 
data which has a small working set.
• The transformed data is compressed with 
move-to-front coding.
• The decoder is quite different from the 
encoder.
• The algorithm requires processing the entire 
string at once (it is not on-line).
• It is a remarkably good compression method.
CSEP 590 - Lecture 6 - Autumn 2007 24
Encoding Example
• abracadabra
1. Create all cyclic shifts of the string.
0   abracadabra
1   bracadabraa
2   racadabraab
3   acadabraabr
4   cadabraabra
5   adabraabrac
6   dabraabraca
7   abraabracad
8   braabracada
9   raabracadab
10 aabracadabr
CSEP 590 - Lecture 6 - Autumn 2007 25
Encoding Example
2. Sort the cyclic shifts alphabetically into array A.
A:
0  aabracadabr
1  abraabracad
2  abracadabra
3  acadabraabr
4  adabraabrac
5  braabracada
6  bracadabraa
7  cadabraabra
8  dabraabraca
9  raabracadab
10 racadabraab
CSEP 590 - Lecture 6 - Autumn 2007 26
Encoding Example
3. L = the last column of A.
A:
0  aabracadabr
1  abraabracad
2  abracadabra
3  acadabraabr
4  adabraabrac
5  braabracada
6  bracadabraa
7  cadabraabra
8  dabraabraca
9  raabracadab
10 racadabraab
L = rdarcaaaabb
CSEP 590 - Lecture 6 - Autumn 2007 27
Encoding Example
4. Transmit X, the index of the input string in A, and L 
(using move to front coding).
A:
0  aabracadabr
1  abraabracad
2  abracadabra
3  acadabraabr
4  adabraabrac
5  braabracada
6  bracadabraa
7  cadabraabra
8  dabraabraca
9  raabracadab
10 racadabraab
L = rdarcaaaabb
X = 2
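As a standalone illustration (not the lecture's code), a naive Python sketch of the forward transform exactly as in steps 1-4; it is quadratic in space, whereas practical implementations sort suffixes instead of materializing all rotations:

```python
def bwt_encode(s: str):
    """Sort all cyclic shifts of s; return the last column L and the
    index X of the original string in the sorted array A."""
    a = sorted(s[i:] + s[:i] for i in range(len(s)))
    last_column = "".join(row[-1] for row in a)
    return last_column, a.index(s)

print(bwt_encode("abracadabra"))  # ('rdarcaaaabb', 2)
```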
CSEP 590 - Lecture 6 - Autumn 2007 28
Why BW Works
• Ignore decoding for the moment.
• The prefix of each shifted string is a context 
for the last symbol.  
– The last symbol appears just before the prefix in 
the original.
• By sorting, similar contexts are adjacent.  
– This means that the predicted last symbols are 
similar.
CSEP 590 - Lecture 6 - Autumn 2007 29
Decoding Example
• We first decode assuming some extra information, and 
then show how that information is computed.
• Let A_s be A with each string cyclically shifted by 1 
(last character moved to the front).
A:
0  aabracadabr
1  abraabracad
2  abracadabra
3  acadabraabr
4  adabraabrac
5  braabracada
6  bracadabraa
7  cadabraabra
8  dabraabraca
9  raabracadab
10 racadabraab
A_s:
0  raabracadab
1  dabraabraca
2  aabracadabr
3  racadabraab
4  cadabraabra
5  abraabracad
6  abracadabra
7  acadabraabr
8  adabraabrac
9  braabracada
10 bracadabraa
CSEP 590 - Lecture 6 - Autumn 2007 30
Decoding Example
• Assume we know the mapping T, where T[i] is the index 
in A_s of string i in A.
• T = [2 5 6 7 8 9 10 4 1 0 3]
A:
0  aabracadabr
1  abraabracad
2  abracadabra
3  acadabraabr
4  adabraabrac
5  braabracada
6  bracadabraa
7  cadabraabra
8  dabraabraca
9  raabracadab
10 racadabraab
A_s:
0  raabracadab
1  dabraabraca
2  aabracadabr
3  racadabraab
4  cadabraabra
5  abraabracad
6  abracadabra
7  acadabraabr
8  adabraabrac
9  braabracada
10 bracadabraa
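The rest of the decoding is cut off in this excerpt, so as a standalone check here is one standard way to finish it in Python: T can be recovered from L alone by a stable sort (this reproduces T = [2 5 6 7 8 9 10 4 1 0 3] above), and following T starting from X reads the original string back out.

```python
def bwt_decode(L: str, X: int) -> str:
    """Invert the transform from the last column L and row index X.
    T[i] is the position in L of the character that lands in row i of
    the sorted first column; it matches the A -> A_s mapping above."""
    T = [i for (_, i) in sorted((c, i) for i, c in enumerate(L))]
    out, j = [], X
    for _ in range(len(L)):
        j = T[j]
        out.append(L[j])
    return "".join(out)

L, X = "rdarcaaaabb", 2
print(bwt_decode(L, X))  # abracadabra
```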