LEA
Department of Computer Science at the Technische Universität München
Chair for Efficient Algorithms
Postal address: 80290 München; Premises: Arcisstr.21, 80333 München
deutsch

Data Compression (WS 99/00)


* Lecturer:
Prof. Dr. Sami Khuri

* Area:
3 lectures per week in area III (Theoretical Computer Science)
course admitted for examination , topic algorithms

* Time and Place:
Tuesdays 13:00 - 15:00, lecture hall S1128
Thursdays 13:00 - 14:00, lecture hall S1128

* No tutorials.

* Audience:
graduate students of computer science

* Prerequisites:
1st and 2nd year courses
The lectures will be in English.

* Recommended for:
In-depth knowledge in topic Algorithms

* Contents:
The course will cover the main data compression algorithms and their applications. Regardless of whether the data to be compressed is text, image (black and white, or color; still or moving), or audio, it is first divided into small constituents, such as characters for text data and pixels for images. In essence, data compression exploits the inherent redundancy present in the original data to be compressed, so as to reduce (compress) it to a form which requires less space than the original data. There are two kinds of compressions: lossless, where the exact, original data has to appear once uncompressed, and lossy, where some of the data might be lost. The course will cover lossless compression in more details than lossy compression.
Under lossless compression, we will study statistical-based techniques, such as Huffman, adaptive Huffman, and arithmetic, dictionary-based techniques, such as LZ77, LZ78, and LZW, and run-length encoding, such as facsimile encoding. Applications include the ``compress'' command in UNIX, Compuserve' GIF, fax transmission, and a number of compression packages that use variations of Lempel and Ziv's pioneering work. We shall also study several algorithms for compressing bitmap images, such as quadtree compression and space-filling algorithms. The latter, such as Hilbert curves, sometimes achieve better compression ratios than quadtree and run length encoding.
Under lossy compression, the course will basically cover the image compression algorithm behind JPEG. Following some basic concepts on color spaces, such as RGB, CMY and HSI, and the reasons for which traditional methods do not work well with either grayscale or continuous-tone color images, the course will introduce transform coding (mainly the discrete cosine transform) and quantization. The various steps of JPEG will then be studied.
  1. Introduction to Data Compression
    • What is Data Compression?
    • Why is it important?
  2. Lossless Data Compression
    • Statistical Methods
    • Huffman Coding
    • Shannon-Fano Coding
    • First Elias Code
    • Second Elias Code
    • Fibonacci Codes
    • Adaptive Huffman Coding
    • Arithmetic Coding
    • Dictionary-Based Codes
    • LZ77, LZ78 and LZW
    • Burrows-Wheeler Algorithm
    • Run-Length Encoding
    • Facsimile Encoding
    • Quadtree Compression
    • Space-Filling Curves
    • Delta Encoding
    • Linear Predictive Coding
  3. Lossy Data Compression
    • Dynamic Window-Based RLE
    • Block Truncation Coding
    • Scalar Quantization
    • Vector Quantization
    • Differential Encoding
    • Transform Coding
    • Introduction to Color Spaces
    • JPEG

* Related and Advanced Lectures:
Efficient Algorithms and Datastructures I
Efficient Algorithms and Datastruktures II

* Some Examples:
Burrows-Wheeler Algorithm
Burrows-Wheeler Encoding of "mississippi"

* Problem Sets:
Problem Set 1 (Postscript).
Problem Set 2 (Postscript).

* References:
David Salomon:
Data Compression. The Complete Reference
Springer Verlag, 1998
Khalid Sayood:
Introduction to Data Compression
Morgan Kaufmann Publishers, Inc. San Francisco, California, 1996.
Jerry Gibson, Toby Berger, Tom Lookabaugh, Dave Lindbergh, Richard Baker:
Digital Compression for Multimedia. Principles and Standards
Morgan Kaufmann Publishers, Inc., San Francisco, California, 1998.

* Office Hours:
see here


khuri@in.tum.de