In a zero-memory source, pixels are independently drawn from a probability distribution.
H then depends only on the image histogram: randomising the pixel locations leaves the entropy unchanged,
i.e. the context of the image is ignored. H derived from a histogram takes no account of structure.
The context can be defined as the neighbourhood of a pixel.
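As a quick illustration of this point, the sketch below (plain Python; the small flattened 'image' and the name histogram_entropy are purely illustrative) computes the histogram-based entropy of a pixel list before and after shuffling the pixel locations. The two values are identical because H depends only on the histogram.

```python
import math
import random

def histogram_entropy(pixels):
    """Zero-memory entropy in bits per pixel, computed from the histogram alone."""
    counts = {}
    for p in pixels:
        counts[p] = counts.get(p, 0) + 1
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Illustrative 'image' flattened to a 1-D list of grey levels.
image = [10, 10, 10, 10, 64, 64, 64, 64, 120, 120, 120, 120, 253, 253, 253, 253]

shuffled = image[:]
random.shuffle(shuffled)           # destroys all spatial structure

print(histogram_entropy(image))    # 2.0 bits/pixel
print(histogram_entropy(shuffled)) # 2.0 bits/pixel -- unchanged by the shuffle
```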
Using a simple non-image example, consider the stream abcaaabcaabc.
- It contains 3 distinct symbols, occurring with probabilities 0.5, 0.25 and 0.25
- H = 1.5 bits/symbol: Shannon's source coding theorem says the 12 symbols can be encoded losslessly in 18 bits
- b and c always occur together as the pair bc, so a more refined model effectively has only 2 symbols, a and bc, with probabilities 2/3 and 1/3
- H = 0.918 bits/symbol, so the 9 regrouped symbols can be stored in about 8.26 bits
In order to make this reduction we have to recognise that there are correlations and modify the probabilistic model appropriately.
Note: this could not be derived from an image histogram
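A minimal sketch (plain Python; the helper entropy is illustrative) that reproduces the two figures above, first under the zero-memory model over {a, b, c} and then under the model that treats bc as a single symbol:

```python
import math
from collections import Counter

def entropy(symbols):
    """Shannon entropy in bits per symbol of a symbol stream."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

stream = "abcaaabcaabc"

# Zero-memory model: 12 symbols drawn from {a, b, c}.
h1 = entropy(stream)
print(h1, len(stream) * h1)            # 1.5 bits/symbol, 18 bits in total

# Model that exploits the fact that b and c always occur as the pair bc.
regrouped = stream.replace("bc", "X")  # 9 symbols drawn from {a, bc}
h2 = entropy(regrouped)
print(h2, len(regrouped) * h2)         # ~0.918 bits/symbol, ~8.26 bits in total
```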
There is a problem when there are correlations within a signal: the entropy computed assuming statistical independence is an overestimate, and the 'optimal' codes calculated from this probability distribution do not achieve the best possible compression.
Note: this is NOT a failure of Shannon's theorem; it is a failure of our method of computing the probability distribution, and hence the entropy.
However, these problems can be overcome by schemes that exploit the context, such as the following.
Stream codes such as arithmetic codes lend themselves to context-dependent probability models: they define probabilities not just for each symbol, but for each symbol given the previous one.
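The sketch below is not an arithmetic coder, only an illustration of the kind of context-dependent model one would use: it estimates P(symbol | previous symbol) from the earlier stream and computes the resulting conditional entropy, which comes out below the 1.5 bits/symbol of the zero-memory model.

```python
import math
from collections import Counter, defaultdict

stream = "abcaaabcaabc"

# Count each symbol given the symbol that precedes it (the 'context').
pair_counts = defaultdict(Counter)
for prev, cur in zip(stream, stream[1:]):
    pair_counts[prev][cur] += 1

# Conditional entropy H(X | previous X), weighted by how often each context occurs.
total_pairs = len(stream) - 1
h_cond = 0.0
for prev, counts in pair_counts.items():
    n_ctx = sum(counts.values())
    h_ctx = -sum((c / n_ctx) * math.log2(c / n_ctx) for c in counts.values())
    h_cond += (n_ctx / total_pairs) * h_ctx

print(h_cond)  # lower than the 1.5 bits/symbol of the zero-memory model
```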
Run-length encoding (RLE) is one of the simplest context-based compression algorithms, and it is useful when an image contains runs of adjacent pixels with the same value: it replaces each run of N adjacent pixels with value V by the pair {V,N}. For example:
| Before Encoding | After Encoding |
|---|---|
| aaabbaaabbaacccaaa | {a,3}{b,2}{a,3}{b,2}{a,2}{c,3}{a,3} |
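A minimal RLE encoder along these lines (the function name rle_encode is illustrative) reproduces the pairs in the table above:

```python
def rle_encode(data):
    """Replace each run of identical values with a (value, run_length) pair."""
    pairs = []
    for value in data:
        if pairs and pairs[-1][0] == value:
            pairs[-1] = (value, pairs[-1][1] + 1)  # extend the current run
        else:
            pairs.append((value, 1))               # start a new run
    return pairs

print(rle_encode("aaabbaaabbaacccaaa"))
# [('a', 3), ('b', 2), ('a', 3), ('b', 2), ('a', 2), ('c', 3), ('a', 3)]
```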
This is particularly useful for images quantised with a low number of levels (i.e. pixels can take only a few different values), since the chance of long runs is high. For example:
10 | 10 | 10 | 10 |
64 | 64 | 64 | 64 |
120 | 120 | 120 | 120 |
253 | 253 | 253 | 253 |
The above ‘4x4 image’ can be written out row-wise and then compressed with RLE:
| Before Encoding | After Encoding |
|---|---|
| 10-10-10-10-64-64-64-64-120-120-120-120-253-253-253-253 | 10-4-64-4-120-4-253-4 |

(assuming the image dimensions are known)
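The same run-length idea applied to the 4x4 image, scanned row-wise (a self-contained sketch; the loop mirrors the rle_encode function above):

```python
rows = [[10, 10, 10, 10],
        [64, 64, 64, 64],
        [120, 120, 120, 120],
        [253, 253, 253, 253]]

flat = [p for row in rows for p in row]   # row-wise scan of the 4x4 image

pairs = []
for v in flat:
    if pairs and pairs[-1][0] == v:
        pairs[-1] = (v, pairs[-1][1] + 1)
    else:
        pairs.append((v, 1))

print(pairs)  # [(10, 4), (64, 4), (120, 4), (253, 4)]
```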
RLE is able to halve the number of bits in this case, from 32 to 16. After RLE, Huffman or arithmetic coding can compress the result further. However, this greatly depends on the image content: if there are many levels and large variations between adjacent pixels, RLE can expand the data rather than compress it.
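For instance, on hypothetical data in which no two adjacent values match, every run has length 1 and each pixel becomes a {V,N} pair, doubling the number of stored values:

```python
# Worst case for RLE (hypothetical data): no adjacent pixels are equal.
noisy = [10, 64, 120, 253, 10, 64, 120, 253]
pairs = [(v, 1) for v in noisy]  # what rle_encode above would produce here
print(len(noisy), "values ->", 2 * len(pairs), "stored numbers")  # 8 -> 16
```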
Lempel-Ziv-Welch (LZW) coding finds commonly repeated sequences of characters and extends the dictionary to include them as new symbols. It reads the input sequentially and buffers characters into a sequence S until S + the next character is not in the dictionary. It then outputs the code for S, adds S + the next character to the dictionary, and restarts buffering with the next character.
The advantages of LZW:
For example, if the following 4x4 block is compressed with LZW:
39 | 39 | 126 | 126 |
39 | 39 | 126 | 126 |
39 | 39 | 126 | 126 |
39 | 39 | 126 | 126 |
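A hedged sketch of the procedure described above, applied to the row-wise scan of this block (the 256-entry initial dictionary for 8-bit pixel values and the name lzw_encode are assumptions for illustration):

```python
def lzw_encode(pixels):
    """LZW-encode a sequence of 8-bit values, growing the dictionary as it goes."""
    dictionary = {(v,): v for v in range(256)}  # assume 8-bit values: codes 0..255
    next_code = 256
    codes = []
    s = ()                                      # currently buffered sequence S
    for c in pixels:
        if s + (c,) in dictionary:
            s = s + (c,)                        # keep buffering while S + c is known
        else:
            codes.append(dictionary[s])         # output the code for S
            dictionary[s + (c,)] = next_code    # add S + c as a new dictionary entry
            next_code += 1
            s = (c,)                            # restart buffering with c
    if s:
        codes.append(dictionary[s])             # flush the final buffered sequence
    return codes

block = [39, 39, 126, 126] * 4                  # the 4x4 block above, scanned row-wise
print(lzw_encode(block))
# [39, 39, 126, 126, 256, 258, 260, 259, 257, 126] -- 10 codes for 16 pixels
```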