Discrete cosine transform. Algorithms, advantages, applications.

*(English)*Zbl 0726.65162
Boston, MA: Academic Press, Inc. xviii, 490 p. £43.95; $ 69.95 (1990).

The discrete cosine transform (DCT) belongs to the family of transforms like the discrete Fourier transform (DFT) or the discrete Hartley transform (DHT) which are not only closely connected by definition but also widely used in various fields of digital signal processing. Although various DCT-based applications exhibit a dramatic growth, no book on DCT has emerged so far. This book is aimed at filling this gap. It is organized into seven chapters with main stress laid on computational aspects of the DCT (fast algorithms both in one and two dimensions) and with the last extensive chapter which is unique in giving an exhaustive overview of various DCT applications.

The exposition starts with some historical and introductory remarks in Chapter 1 (6 pages), giving a guideline through the chapters as well. The basic concepts of DCT are presented in Chapter 2 (20 pages) and related to the continuous Fourier cosine transform. Four DCT definitions (DCT-I, DCT-II, DCT-III and DCT-IV) are presented which are due to Z. Wang [IEEE Trans. Acoust. Speech Signal Process. ASSP-32, 803-816 (1984; Zbl 0577.65134)], and their basic mathematical properties analogical to those of the DFT or DHT are briefly discussed (unitarity, inversion, scaling in time, shift in time, difference and convolution property).

Chapter 3 (21 pages) deals with the Karhunen-Loeve transform (KLT) which is a statistically optimal series representation of a given random signal. The KLT matrix is formed from the eigenvectors diagonalizing the associated auto-covariance matrix. It is shown in detail that each of the DCT types is in a certain sense asymptotically equivalent to KLT of a stationary Markov-I signal. The problem is handled also from a more general point of view: a general procedure for generating discrete unitary transforms that asymptotically diagonalize a given class of signal covariance matrices is described - an important consequence being that DCT-II diagonalizes the Toeplitz class of matrices, and thus actually proving that DCT-II is asymptotic to the KLT for the Markov signal of all orders.

Chapter 4 (40 pages) examines various approaches to the construction of fast computational algorithms for the one-dimensional (1-D) DCT-II which are seen to be a parallel of those of the DFT. They may be classified as DCT via FFT, sparse matrix factorization, DIT and DIF algorithms, prime factor algorithms (PFAs), algorithms based on different radices, index mapping, a fast recursive algorithm, and DCT via other transforms. A table is appended briefly summarizing the computational complexity of the algorithms and commenting their advantages and disadvantages.

In Chapter 5 (34 pages) several fast algorithms are detailed which are designed to implement the two-dimensional (2-D) DCT. The methods may be characterized as 2-D DCT via Kronecker product decomposition into 1-D DCTs, 2-D DCT via 2-D DFT, and 2-D DCT via Walsh-Hadamard transform (WHT). As in Chapter 4 a table summarizes the computational efficiency of the different algorithms. Also some hardware implementations of 2-D DCT are discussed.

The object of Chapter 6 (14 pages) is to compare DCT-II with other well- known discrete transforms on the basis of some standard performance criteria such as variance distribution, residual correlation, scalar Wiener filtering, and rate distortion. This comparison shows that DCT, in general, is superior to all the other suboptimal discrete transforms. These favourable properties evoke a wide spectrum of DCT applications which are covered by Chapter 7 (213 pages). In most cases, these applications are described conceptually rather than in detail, and supported by extensive references to the relevant literature. The discussed applications involve, among others, filtering, image coding, low bit-rate coding, data compression, pattern recognition, etc., and cover diverse disciplines such as speech, video, printed image, medical imaging, transmultiplexers, surface texture and many others.

The book concludes with a collection of well-documented computer programs (appendices A.1 through A.11, 90 pages) written in FORTRAN or C. Additional appendices B.1 through B.3 (11 pages) provide lists of manufacturers of DCT chips, image compression boards, and motion estimation chips. Also lists of acronyms and symbols as well as an index of cross-references are included. At the end of the book an extensive list of references (33 pages) categorized according to disciplines is added allowing the reader a detailed exploration of the literature. Each of chapters 2 through 7 is followed by unsolved problems which not only practise but also extend the material of the relevant chapter. Many illustrations such as flowgraphs, block diagrams and (even coloured) pictures significantly support the text which is written in a clear and intelligible style.

Although this excellent book is primarily aimed at the graduate student, it may appeal also to a wide audience in the engineering, scientific, and technical community, thus both stimulating further research in the field and giving rise to new DCT applications.

The exposition starts with some historical and introductory remarks in Chapter 1 (6 pages), giving a guideline through the chapters as well. The basic concepts of DCT are presented in Chapter 2 (20 pages) and related to the continuous Fourier cosine transform. Four DCT definitions (DCT-I, DCT-II, DCT-III and DCT-IV) are presented which are due to Z. Wang [IEEE Trans. Acoust. Speech Signal Process. ASSP-32, 803-816 (1984; Zbl 0577.65134)], and their basic mathematical properties analogical to those of the DFT or DHT are briefly discussed (unitarity, inversion, scaling in time, shift in time, difference and convolution property).

Chapter 3 (21 pages) deals with the Karhunen-Loeve transform (KLT) which is a statistically optimal series representation of a given random signal. The KLT matrix is formed from the eigenvectors diagonalizing the associated auto-covariance matrix. It is shown in detail that each of the DCT types is in a certain sense asymptotically equivalent to KLT of a stationary Markov-I signal. The problem is handled also from a more general point of view: a general procedure for generating discrete unitary transforms that asymptotically diagonalize a given class of signal covariance matrices is described - an important consequence being that DCT-II diagonalizes the Toeplitz class of matrices, and thus actually proving that DCT-II is asymptotic to the KLT for the Markov signal of all orders.

Chapter 4 (40 pages) examines various approaches to the construction of fast computational algorithms for the one-dimensional (1-D) DCT-II which are seen to be a parallel of those of the DFT. They may be classified as DCT via FFT, sparse matrix factorization, DIT and DIF algorithms, prime factor algorithms (PFAs), algorithms based on different radices, index mapping, a fast recursive algorithm, and DCT via other transforms. A table is appended briefly summarizing the computational complexity of the algorithms and commenting their advantages and disadvantages.

In Chapter 5 (34 pages) several fast algorithms are detailed which are designed to implement the two-dimensional (2-D) DCT. The methods may be characterized as 2-D DCT via Kronecker product decomposition into 1-D DCTs, 2-D DCT via 2-D DFT, and 2-D DCT via Walsh-Hadamard transform (WHT). As in Chapter 4 a table summarizes the computational efficiency of the different algorithms. Also some hardware implementations of 2-D DCT are discussed.

The object of Chapter 6 (14 pages) is to compare DCT-II with other well- known discrete transforms on the basis of some standard performance criteria such as variance distribution, residual correlation, scalar Wiener filtering, and rate distortion. This comparison shows that DCT, in general, is superior to all the other suboptimal discrete transforms. These favourable properties evoke a wide spectrum of DCT applications which are covered by Chapter 7 (213 pages). In most cases, these applications are described conceptually rather than in detail, and supported by extensive references to the relevant literature. The discussed applications involve, among others, filtering, image coding, low bit-rate coding, data compression, pattern recognition, etc., and cover diverse disciplines such as speech, video, printed image, medical imaging, transmultiplexers, surface texture and many others.

The book concludes with a collection of well-documented computer programs (appendices A.1 through A.11, 90 pages) written in FORTRAN or C. Additional appendices B.1 through B.3 (11 pages) provide lists of manufacturers of DCT chips, image compression boards, and motion estimation chips. Also lists of acronyms and symbols as well as an index of cross-references are included. At the end of the book an extensive list of references (33 pages) categorized according to disciplines is added allowing the reader a detailed exploration of the literature. Each of chapters 2 through 7 is followed by unsolved problems which not only practise but also extend the material of the relevant chapter. Many illustrations such as flowgraphs, block diagrams and (even coloured) pictures significantly support the text which is written in a clear and intelligible style.

Although this excellent book is primarily aimed at the graduate student, it may appeal also to a wide audience in the engineering, scientific, and technical community, thus both stimulating further research in the field and giving rise to new DCT applications.

Reviewer: V.VeselĂ˝ (Brno)

##### MSC:

65T50 | Numerical methods for discrete and fast Fourier transforms |

65-02 | Research exposition (monographs, survey articles) pertaining to numerical analysis |

42-04 | Software, source code, etc. for problems pertaining to harmonic analysis on Euclidean spaces |

42A38 | Fourier and Fourier-Stieltjes transforms and other transforms of Fourier type |

94-04 | Software, source code, etc. for problems pertaining to information and communication theory |

68T10 | Pattern recognition, speech recognition |

68U10 | Computing methodologies for image processing |

68U99 | Computing methodologies and applications |

94A12 | Signal theory (characterization, reconstruction, filtering, etc.) |