
# Data Compression - PowerPoint Presentation, Computer Science Engineering Notes | EduRev


```
CSEP 590
Data Compression
Autumn 2007
Predictive Coding
Burrows-Wheeler Transform
CSEP 590 - Lecture 6 - Autumn 2007 2
Predictive Coding
• The next symbol can be statistically predicted
from the past.
– Code with context
– Code the difference
– Move to front, then code
• Goal of prediction
– The prediction should make the distribution of
probabilities of the next symbol as skewed as
possible
– After prediction there should be nothing left to
predict, so we are in a first-order entropy model
CSEP 590 - Lecture 6 - Autumn 2007 3
Bad and Good Prediction
• From information theory – the lower the
information content of a symbol, the fewer bits
are needed to code it:

inf(a) = log2(1 / P(a))

• Examples:
– P(a) = 1023/1024, inf(a) ≈ .0014
– P(a) = 1/2, inf(a) = 1
– P(a) = 1/1024, inf(a) = 10
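
# A quick numeric check of the information-content formula above —
# an illustrative sketch, not part of the original slides:
import math

def inf(p):
    # information content in bits of a symbol with probability p
    return math.log2(1 / p)

print(round(inf(1023 / 1024), 4))   # 0.0014 — near-certain symbols cost almost nothing
print(inf(1 / 2))                   # 1.0
print(inf(1 / 1024))                # 10.0 — rare symbols are expensive
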
CSEP 590 - Lecture 6 - Autumn 2007 4
Entropy
• Entropy is the expected number of bits to
code a symbol in the model, with a_i having
probability P(a_i):

H = Σ_{i=1..m} P(a_i) log2(1 / P(a_i))

• Good coders should be close to this bound.
– Arithmetic
– Huffman
– Golomb
– Tunstall
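
# The entropy sum above, sketched as a small helper (illustrative,
# not from the slides); skewed distributions give lower H, which is
# why good prediction pays off:
import math

def entropy(probs):
    # H = sum_i P(a_i) * log2(1 / P(a_i)), in bits per symbol
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

print(entropy([0.25, 0.25, 0.25, 0.25]))           # 2.0 — uniform over 4 symbols
print(round(entropy([0.9, 0.05, 0.03, 0.02]), 2))  # 0.62 — skewed is much cheaper
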
CSEP 590 - Lecture 6 - Autumn 2007 5
PPM
• Prediction with Partial Matching
– Cleary and Witten (1984)
– Tries to find a good context to code the next
symbol
• Uses adaptive arithmetic coding for each
context
context a...e...i...r...s...y
the     0   0   5   7   4   7
he      10  1   7   10  9   7
e       12  2   10  15  10  10
<nil>   50  70  30  35  40  13
CSEP 590 - Lecture 6 - Autumn 2007 6
JBIG
• Coder for binary images
– documents
– graphics
• Codes in scan line order using context from
the same and previous scan lines.
• Uses adaptive arithmetic coding with context
[diagram: the context is a template of already-coded
pixels in the same and previous scan lines, used to
predict the next bit to be coded]
CSEP 590 - Lecture 6 - Autumn 2007 7
JBIG Example
0 0 0
0 0 0
0 0
0 0
next bit         0     1
frequency   100  10
0 1 1
0 1 1
0 1
1 0
next bit         0     1
frequency    15    50
H = (100/110) log2(110/100) + (10/110) log2(110/10) ≈ .44   (first context)
H = (50/65) log2(65/50) + (15/65) log2(65/15) ≈ .78   (second context)
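
# Verifying the two context entropies above from their 0/1 counts —
# a sketch (the helper name is mine, not from the slides):
import math

def context_entropy(n0, n1):
    # empirical bits per coded bit for a context with counts n0 and n1
    total = n0 + n1
    return sum((n / total) * math.log2(total / n) for n in (n0, n1) if n > 0)

print(round(context_entropy(100, 10), 2))  # 0.44 — heavily skewed context
print(round(context_entropy(15, 50), 2))   # 0.78 — less skewed, more bits
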
CSEP 590 - Lecture 6 - Autumn 2007 8
Issues with Context
• Context dilution
– If there are too many contexts then too few
symbols are coded in each context, making them
ineffective because of the zero-frequency
problem.
• Context saturation
– If there are too few contexts then each context
lumps together symbols that behave differently,
so prediction is weaker than it would be with
more contexts.
• Wrong context
– A badly chosen context is again a poor predictor.
CSEP 590 - Lecture 6 - Autumn 2007 9
Prediction by Differencing
• Used for Numerical Data
• Example: 2 3 4 5 6 7 8 7 6 5 4 3 2
• Transform to 2 1 1 1 1 1 1 –1 –1 –1 –1 –1 –1
– much lower first-order entropy
CSEP 590 - Lecture 6 - Autumn 2007 10
General Differencing
• Let x_1, x_2, ..., x_n be some numerical data that is
correlated, that is, x_i is near x_{i+1}
• Better compression can result from coding
x_1, x_2 - x_1, x_3 - x_2, ..., x_n - x_{n-1}
• This idea is used in
– signal coding
– audio coding
– video coding
• There are fancier prediction methods based on linear
combinations of previous data, but these may require
training.
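
# Differencing and its inverse (the prefix sum a decoder would apply) —
# a sketch using the sequence from the earlier example:
def difference(xs):
    # x_1, x_2 - x_1, x_3 - x_2, ..., x_n - x_{n-1}
    return [xs[0]] + [xs[i] - xs[i - 1] for i in range(1, len(xs))]

def undifference(ds):
    out, total = [], 0
    for d in ds:
        total += d
        out.append(total)
    return out

data = [2, 3, 4, 5, 6, 7, 8, 7, 6, 5, 4, 3, 2]
diffs = difference(data)
print(diffs)                        # [2, 1, 1, 1, 1, 1, 1, -1, -1, -1, -1, -1, -1]
assert undifference(diffs) == data  # lossless round trip
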
CSEP 590 - Lecture 6 - Autumn 2007 11
Move to Front Coding
• Non-numerical data
• The data have a relatively small working set
that changes over the sequence.
• Example: a b a b a a b c c b b c c c c b d b c c
• Move to Front algorithm
– Symbols are kept in a list indexed 0 to m-1
– To code a symbol output its index and move the
symbol to the front of the list
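
# The move-to-front algorithm above in a few lines — a sketch whose
# output matches the worked example on the following slides:
def mtf_encode(data, alphabet):
    table = list(alphabet)             # list indexed 0..m-1
    out = []
    for sym in data:
        i = table.index(sym)           # output the symbol's current index...
        out.append(i)
        table.insert(0, table.pop(i))  # ...and move it to the front
    return out

print(mtf_encode("ababaabccbbccccbdbcc", "abcd"))
# [0, 1, 1, 1, 1, 0, 1, 2, 0, 1, 0, 1, 0, 0, 0, 1, 3, 1, 2, 0]
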
CSEP 590 - Lecture 6 - Autumn 2007 12
Example
• Example: a b a b a a b c c b b c c c c b d b c c
0
0   1    2    3
a   b    c    d
CSEP 590 - Lecture 6 - Autumn 2007 13
Example
• Example: a b a b a a b c c b b c c c c b d b c c
0 1
0   1    2    3
b   a    c    d
0   1    2    3
a   b    c    d
CSEP 590 - Lecture 6 - Autumn 2007 14
Example
• Example: a b a b a a b c c b b c c c c b d b c c
0 1 1
0   1    2    3
a   b    c    d
0   1    2    3
b   a    c    d
CSEP 590 - Lecture 6 - Autumn 2007 15
Example
• Example: a b a b a a b c c b b c c c c b d b c c
0 1 1 1
0   1    2    3
b   a    c    d
0   1    2    3
a   b    c    d
CSEP 590 - Lecture 6 - Autumn 2007 16
Example
• Example: a b a b a a b c c b b c c c c b d b c c
0 1 1 1 1
0   1    2    3
a   b    c    d
0   1    2    3
b   a    c    d
CSEP 590 - Lecture 6 - Autumn 2007 17
Example
• Example: a b a b a a b c c b b c c c c b d b c c
0 1 1 1 1 0
0   1    2    3
a   b    c    d
CSEP 590 - Lecture 6 - Autumn 2007 18
Example
• Example: a b a b a a b c c b b c c c c b d b c c
0 1 1 1 1 0 1
0   1    2    3
b   a    c    d
0   1    2    3
a   b    c    d
CSEP 590 - Lecture 6 - Autumn 2007 19
Example
• Example: a b a b a a b c c b b c c c c b d b c c
0 1 1 1 1 0 1 2
0   1    2    3
c    b   a    d
0   1    2    3
b   a    c    d
CSEP 590 - Lecture 6 - Autumn 2007 20
Example
• Example: a b a b a a b c c b b c c c c b d b c c
0 1 1 1 1 0 1 2 0 1 0 1 0 0 0 1 3 1 2 0
0   1    2    3
c    b   d    a
CSEP 590 - Lecture 6 - Autumn 2007 21
Example
• Example: a b a b a a b c c b b c c c c b d b c c
0 1 1 1 1 0 1 2 0 1 0 1 0 0 0 1 3 1 2 0
Frequencies of {a, b, c, d}
a  b  c  d
4  7  8  1
Frequencies of {0, 1, 2, 3}
0  1  2  3
8  9  2  1
CSEP 590 - Lecture 6 - Autumn 2007 22
Extreme Example
Input:
aaaaaaaaaabbbbbbbbbbccccccccccdddddddddd
Output
0000000000100000000020000000003000000000
Frequencies of a b c d
a   b   c   d
10 10 10 10
Frequencies of 0 1 2 3
0   1  2  3
37  1  1  1
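
# Move to front is invertible: the decoder keeps the same table and uses
# each received index to recover the symbol — a sketch, checked against
# the extreme example above:
def mtf_decode(codes, alphabet):
    table = list(alphabet)
    out = []
    for i in codes:
        sym = table[i]                 # the index selects the symbol...
        out.append(sym)
        table.insert(0, table.pop(i))  # ...and the same move keeps tables in sync
    return "".join(out)

codes = [int(c) for c in "0000000000100000000020000000003000000000"]
print(mtf_decode(codes, "abcd"))
# aaaaaaaaaabbbbbbbbbbccccccccccdddddddddd
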
CSEP 590 - Lecture 6 - Autumn 2007 23
Burrows-Wheeler Transform
• Burrows-Wheeler, 1994
• BW Transform creates a representation of the
data which has a small working set.
• The transformed data is compressed with
move to front compression.
• The decoder is quite different from the
encoder.
• The algorithm requires processing the entire
string at once (it is not on-line).
• It is a remarkably good compression method.
CSEP 590 - Lecture 6 - Autumn 2007 24
Encoding Example
• abracadabra
1. Create all cyclic shifts of the string.
0   abracadabra
1   bracadabraa
2   racadabraab
3   acadabraabr
4   cadabraabra
5   adabraabrac
6   dabraabraca
7   abraabracad
8   braabracada
9   raabracadab
10 aabracadabr
CSEP 590 - Lecture 6 - Autumn 2007 25
Encoding Example
2. Sort the strings alphabetically into array A
0  aabracadabr
1  abraabracad
2  abracadabra
3  acadabraabr
4  adabraabrac
5  braabracada
6  bracadabraa
7  cadabraabra
8  dabraabraca
9  raabracadab
10 racadabraab
A
CSEP 590 - Lecture 6 - Autumn 2007 26
Encoding Example
3. L = the last column
0  aabracadabr
1  abraabracad
2  abracadabra
3  acadabraabr
4  adabraabrac
5  braabracada
6  bracadabraa
7  cadabraabra
8  dabraabraca
9  raabracadab
10 racadabraab
A
L = rdarcaaaabb
CSEP 590 - Lecture 6 - Autumn 2007 27
Encoding Example
4. Transmit X, the index of the input in A, and L
(coding L with move to front).
0  aabracadabr
1  abraabracad
2  abracadabra
3  acadabraabr
4  adabraabrac
5  braabracada
6  bracadabraa
7  cadabraabra
8  dabraabraca
9  raabracadab
10 racadabraab
A
L = rdarcaaaabb
X = 2
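
# Steps 1-4 above in code: build the cyclic shifts, sort them, take the
# last column L, and record the input's index X. A sketch that is fine
# for short strings; production coders use suffix sorting instead:
def bwt_encode(s):
    shifts = sorted(s[i:] + s[:i] for i in range(len(s)))
    L = "".join(row[-1] for row in shifts)
    X = shifts.index(s)
    return L, X

print(bwt_encode("abracadabra"))  # ('rdarcaaaabb', 2)
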
CSEP 590 - Lecture 6 - Autumn 2007 28
Why BW Works
• Ignore decoding for the moment.
• The prefix of each shifted string is a context
for the last symbol.
– The last symbol appears just before the prefix in
the original.
• By sorting, similar contexts are adjacent.
– This means that the predicted last symbols are
similar.
CSEP 590 - Lecture 6 - Autumn 2007 29
Decoding Example
• We first decode assuming some information.  We
then show how to compute the information.
• Let As be A with each string cyclically shifted by 1
0  aabracadabr
1  abraabracad
2  abracadabra
3  acadabraabr
4  adabraabrac
5  braabracada
6  bracadabraa
7  cadabraabra
8  dabraabraca
9  raabracadab
10 racadabraab
A
0  raabracadab
1  dabraabraca
2  aabracadabr
3  racadabraab
4  cadabraabra
5  abraabracad
6  abracadabra
7  acadabraabr
8  adabraabrac
9  braabracada
10 bracadabraa
As
CSEP 590 - Lecture 6 - Autumn 2007 30
Decoding Example
• Assume we know the mapping T, where T[i] is the
index in As of string i of A.
• T = [2 5 6 7 8 9 10 4 1 0 3]
0  aabracadabr
1  abraabracad
2  abracadabra
3  acadabraabr
4  adabraabrac
5  braabracada
6  bracadabraa
7  cadabraabra
8  dabraabraca
9  raabracadab
10 racadabraab
A
0  raabracadab
1  dabraabraca
2  aabracadabr
3  racadabraab
4  cadabraabra
5  abraabracad
6  abracadabra
7  acadabraabr
8  adabraabrac
9  braabracada
10 bracadabraa
As
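
# The decoder does not need As explicitly: stably sorting the positions
# of L reproduces the T mapping above (it yields [2 5 6 7 8 9 10 4 1 0 3]
# for this example), and following T from X recovers the input — a sketch:
def bwt_decode(L, X):
    # T[i] = index of row i's left-rotation; Python's sort is stable,
    # so ties between equal symbols keep their order in L
    T = sorted(range(len(L)), key=lambda i: L[i])
    out = []
    i = X
    for _ in range(len(L)):
        i = T[i]
        out.append(L[i])
    return "".join(out)

print(bwt_decode("rdarcaaaabb", 2))  # abracadabra
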
```