
Digital Video Compression Part 6

General description

DCT (Discrete Cosine Transform) is a transform widely used in image compression. The JPEG still-image compression standard, the H.263 video conferencing standard, and the MPEG digital video standards (MPEG-1, MPEG-2, and MPEG-4) use DCT. In particular, these standards use a two-dimensional DCT applied in turn to 8 x 8 pixel image blocks. The DCT calculates 64 (8 x 8 = 64) coefficients, which are then quantized; the quantization is what actually provides the compression. In most images, most DCT coefficients are small enough to become zero after quantization. This property is the basis for many DCT-based compression algorithms.
Furthermore, the human eye is known to be much less sensitive to high-frequency image components, which are represented by the higher-order DCT coefficients. A larger quantization factor can therefore be (and usually is) applied to these coefficients. In particular, the matrix of 64 quantization factors used in the JPEG algorithm, one for each of the 64 DCT coefficients, has larger factors for the higher-frequency coefficients. Once quantized, the coefficients are run-length encoded (RLE). The result is then entropy coded: short code words are used for frequent combinations and relatively long ones for rarer combinations.
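The quantization and run-length steps described above can be sketched in a few lines of Python. This is only an illustration: the coefficient values and the quantization steps below are made up for the example (they are not the JPEG table), and the run-length scheme is a simplified one that stores (value, run length) pairs.

```python
def quantize(coeffs, steps):
    """Divide each coefficient by its quantization step and round."""
    return [round(c / q) for c, q in zip(coeffs, steps)]


def run_length_encode(values):
    """Collapse runs of equal values into (value, run_length) pairs."""
    pairs = []
    for v in values:
        if pairs and pairs[-1][0] == v:
            pairs[-1] = (v, pairs[-1][1] + 1)
        else:
            pairs.append((v, 1))
    return pairs


# Hypothetical DCT coefficients, ordered from low to high frequency.
coeffs = [312.0, -47.0, 20.0, 7.0, -3.0, 2.0, 1.0, -1.0]
# Hypothetical quantization steps that grow with frequency.
steps = [8, 10, 12, 16, 24, 32, 48, 64]

quantized = quantize(coeffs, steps)     # small high-frequency values become 0
encoded = run_length_encode(quantized)  # the run of zeros becomes one pair
print(quantized)
print(encoded)
```

With these assumed numbers, the five high-frequency coefficients all quantize to zero, so the run-length step stores them as a single pair.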
DCT is best explained using the one-dimensional case. A two-dimensional DCT is a one-dimensional DCT applied first to each row of a block of pixels and then to each column of the block obtained from the row transforms. A one-dimensional DCT is applied to N samples (pixels in an image or samples in an audio file). It can be written as an N x N matrix whose rows are sampled cosine functions:
DCT (m, n) = sqrt ((2 - delta (m, 1)) / N) * cos ((pi / N) * (n - 1/2) * (m - 1))
where
DCT (m, n) is a one-dimensional DCT matrix
m, n = 1, …, N
pi = 3.14159265 …
N = number of samples per block
delta (m, 1) = 1 if m = 1 and 0 otherwise
cos (x) = cosine of x, measured in radians.
Computed directly, applying the DCT to a block of N samples requires on the order of N * N multiplications and additions. However, because of the recursive structure of the DCT matrix, fast algorithms need far fewer operations, on the order of N log (N). This property makes DCT practical on the processors of modern personal computers.
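As a rough illustration of the matrix form, the sketch below builds the N x N DCT matrix and applies it to the rows and then the columns of a block. It assumes the orthonormal scaling sqrt ((2 - delta (m, 1)) / N), and it uses the direct N * N computation, not a fast N log (N) algorithm. The function names dct_matrix, dct_1d and dct_2d are chosen for this example only.

```python
import math


def dct_matrix(N):
    """Build the N x N DCT matrix; m and n are 1-based as in the text."""
    rows = []
    for m in range(1, N + 1):
        # Assumed orthonormal scaling: sqrt(1/N) for m = 1, sqrt(2/N) otherwise.
        scale = math.sqrt((2 - (1 if m == 1 else 0)) / N)
        rows.append([scale * math.cos((math.pi / N) * (n - 0.5) * (m - 1))
                     for n in range(1, N + 1)])
    return rows


def dct_1d(samples, C):
    """Direct (N * N operations) one-dimensional DCT: matrix times vector."""
    return [sum(c * x for c, x in zip(row, samples)) for row in C]


def dct_2d(block, C):
    """One-dimensional DCT of every row, then of every resulting column."""
    row_pass = [dct_1d(row, C) for row in block]
    col_pass = [dct_1d(list(col), C) for col in zip(*row_pass)]
    # col_pass holds transformed columns; transpose back to row-major order.
    return [list(r) for r in zip(*col_pass)]


C = dct_matrix(8)
flat = [[100] * 8 for _ in range(8)]  # a uniform 8 x 8 block
coeffs = dct_2d(flat, C)
print(round(coeffs[0][0], 6))  # only the first (constant) coefficient is non-zero
```

For a uniform block, all of the energy lands in the first coefficient and the other 63 are zero up to rounding error, which is the extreme case of the "most coefficients are small" property.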
Discrete Wavelet Transform (DWT)
Compressors using DWT (Discrete Wavelet Transform): Intel Indeo 5.x; Intel Indeo 4.x
Advantages and disadvantages
Most still and moving images compressed using the DWT algorithm do not show the block structure characteristic of the DCT algorithm.
At the same compression ratio, the relative quality of DWT-compressed images is higher than that of DCT-compressed images.
DWT slightly smears and rounds off sharp outlines in the image. This is the so-called edge noise, or Gibbs effect.
General description
The DWT algorithm is based on passing a signal, such as an image, through a pair of filters: low pass and high pass. The low-pass filter produces a coarse approximation of the original signal. The high-pass filter produces a difference signal, that is, the additional detail.
In turn, the result at the output of the low-pass filter (the coarse approximation) can be subjected to the same procedure, and so on.
A simple example of DWT is the Haar DWT:
The input signal x [n] is a sequence of samples with index n. Haar’s low-pass filter is the arithmetic average of two successive samples:
g [n] = 1/2 * (x [n] + x [n + 1])
Haar’s high-pass filter is half the difference of two successive samples:
h [n] = 1/2 * (x [n + 1] – x [n])
Note that:
x [n] = g [n] – h [n]
x [n + 1] = g [n] + h [n]
The output sequences g [n] and h [n] contain redundant information. To reproduce the original signal x [n], it is therefore sufficient to keep only the odd or only the even samples of each sequence. As a rule, the even samples are kept, so the original signal x [n] is recovered from g [0], g [2], g [4], … and h [0], h [2], h [4], … alone:
x [0] = g [0] – h [0]
x [1] = g [0] + h [0]
x [2] = g [2] – h [2]
x [3] = g [2] + h [2]
and so on …
The low-pass filter output, as noted, is a rough approximation of the original signal. If the original signal is an image, the low-pass filter output is a blurry, low-resolution version of it. The high-pass filter output adds the detail. Combining it with the low-pass filter output reproduces the original image.
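The Haar formulas above can be checked with a short Python sketch. It assumes an input with an even number of samples and keeps only the even-indexed outputs g [0], g [2], … and h [0], h [2], …; the function names haar_analyze and haar_synthesize and the sample values are chosen for this example only.

```python
def haar_analyze(x):
    """Return the even-indexed low-pass (g) and high-pass (h) outputs."""
    g = [(x[n] + x[n + 1]) / 2 for n in range(0, len(x), 2)]
    h = [(x[n + 1] - x[n]) / 2 for n in range(0, len(x), 2)]
    return g, h


def haar_synthesize(g, h):
    """Rebuild x from g and h: x[n] = g - h and x[n + 1] = g + h."""
    x = []
    for gk, hk in zip(g, h):
        x.append(gk - hk)
        x.append(gk + hk)
    return x


signal = [4, 6, 10, 12, 8, 8, 2, 0]  # hypothetical input samples
g, h = haar_analyze(signal)
restored = haar_synthesize(g, h)
print(g)         # coarse approximation, half as many samples
print(h)         # detail signal, half as many samples
print(restored)  # matches the input
```

The two outputs together hold exactly as many values as the input, and the reconstruction is exact. Compression only comes later, when small detail values in h are quantized or discarded.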















