Press "Enter" to skip to content

From seductive theory to concrete applications

Lossy commpressed Lena - This image file is in JPEG!

Many mathematicians believe that in concrete numerical applications such as image, sound, and video compression, wavelets have superseded the more classical Fourier transform. This is not really the case, unfortunately, for various reasons, ranging from non competitive implementations to poor standardizations. A symptomatic example is given by the JPEG image format for lossy compression. Our .jpg or .jpeg files are based on the following algorithm: the image is partitioned into square blocs, and on each bloc, a spatial discrete Fourier transform is computed (actually a Discrete Cosine Transform, DCT), a frequency filter is then applied to kill hardly visible frequencies (this is the lossy part), and finally a lossless algorithm (Huffman) is used to compress the result. Regarding sound compression, our .mp3 music files (actually MPEG-1 MPEG-2 audio layer 3) are based on similar ideas. At the end of the past century, a new JPEG format called JPEG2000 was designed, involving an algorithm in which the discrete Fourier transform is replaced by a discrete wavelet transform and in which the lossless final step is performed with more efficient entropic coding. JPEG2000 is not the sole image format based on wavelets, and one can mention the more specialized ECW format for instance. Sadly, JPEG2000 has not reached the expected success, and very few people are using .jp2|.jpx files.

Regarding wavelets for video compression, the situation is not really better. One may read the blog post dated 2010-02-26 by Jason Garrett-Glaser, an x264 developer. Recall that x264 is a free software library used in the VideoLAN projet (VLC media player!) to encode H.264 video streams (MPEG-4 AVC). In particular, he ends his post by saying:

« JPEG2000 is a classic example of wavelet failure: despite having more advanced entropy coding, being designed much later than JPEG, being much more computationally intensive, and having much better PSNR, comparisons have consistently shown it to be visually worse than JPEG at sane filesizes.  By comparison, H.264′s intra coding, when used for still image compression, can beat JPEG by a factor of 2 or more (I’ll make a post on this later).  With the various advancements in DCT intra coding since H.264, I suspect that a state-of-the-art DCT compressor could win by an even larger factor. Despite the promised benefits of wavelets, a wavelet encoder even close to competitive with x264 has yet to be created.  With some tests even showing Dirac losing to Theora in visual comparisons, it’s clear that many problems remain to be solved before wavelets can eliminate the ugliness of block-based transforms once and for all. »

This shows the long path from seductive theory to concrete applications. We will see in the forthcoming decades if the wavelets paradigm will beat optimized Fourier transforms. Data compression is a rapidly evolving and quite active domain lying between applied mathematics, computer science, and information technology. Finally, the described phenomenon is quite common in applied mathematics, and is not specific to wavelets. Other instances are for example Model Selection in Statistics and Compressive Sensing in Information Technology.

Note: Every practitioner knows that comparing algorithms on datasets is a tricky task. You may also read How to cheat on video encoder comparisons, by J. Garrett-Glaser, dated 2010-06-21.

2 Comments

  1. Djalil Chafaï 2012-03-01

    Thanks, Igor. It’s a long path for many things indeed. In pure technology for instance: transistor versus vacuum tube, flash memory versus hard drive, etc. As a probabilist, I’m also surprised to see how many colleagues are still teaching the Box-Muller algorithm and diagonalization for simulating gaussian distributions, instead of Marsaglia’s algorithm and Choleski’s factorization.

Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Syntax · Style · .