Video, Image, and Audio Processing

by Dr. Kenneth K.M. Lam
Department of Electronic and Information Engineering
The Hong Kong Polytechnic University


Subject Code: EIE 425

Objective:

To provide a broad treatment of the fundamentals of speech, image, audio and video processing.

 

 

Syllabus:
  1. Speech processing
    1.1 Physiology of speech generation: characteristics of speech sounds; glottal excitation; speech production models: discrete-time speech production model; discrete-time filter model for speech production; source excitation model.
    1.2 Linear prediction analysis: All-pole models; least-squares estimation; spectral matching; spectral envelopes; applications of LP analysis.
    1.3 Speech coding: Coder's attributes; waveform coding; vocoders; analysis-by-synthesis coding; code-excited linear predictive vocoder; regular pulse-excited LPC.
  2. Image processing
    2.1 Fundamentals of digital image: Digital image representation and visual perception, image sampling and quantization.
    2.2 Image enhancement: Histogram processing; Median filtering; Low-pass filtering; High-pass filtering; Spatial filtering; Linear interpolation; Zooming.
    2.3 Image coding and compression techniques: Scalar and vector quantizations; Codeword assignment; Entropy coding; Transform image coding; Wavelet coding; Codec examples.
    2.4 Image analysis and segmentation: Feature extraction; Histogram; Edge detection; Thresholding.
    2.5 Image representation and description: Boundary descriptor; Chain code; Fourier descriptor; Skeletonizing; Texture descriptor; Moments.
  3. Audio processing
    3.1 Fundamentals of digital audio: Sampling; Dithering; Quantization; psychoacoustic model.
    3.2 Basic digital audio processing techniques: Anti-aliasing filtering; Oversampling; Analog-to-digital conversion; Dithering; Noise shaping; Digital-to-analog conversion; Equalisation.
    3.3 Digital audio compression: Critical bands; Threshold of hearing; Amplitude masking; Temporal masking; Waveform coding; Perceptual coding; Coding techniques: Subband coding and Transform coding; Codec examples.
  4. Video processing
    4.1 Fundamentals of digital video: Basics of digital video; Digital video formats.
    4.2 Basic digital video processing techniques: Motion estimation; Interframe filtering; Motion-compensated filtering; Error concealment.
    4.3 Video coding techniques: Temporal redundancy; Spatial redundancy; Block-based motion estimation and compensation; Coding techniques: Model-based coding, Motion-compensated waveform coding; Codec examples.

Please click here to download the syllabus.

 
Notes:
  1. Introduction to Digital Image Processing
  2. Image Enhancement (Supplement)
  3. Image Coding and Compression Techniques (Supplement)
  4. Image Analysis and Segmentation
  5. Image Representation and Description
  6. Video Fundamentals
  7. MPEG Video Coding

Papers:
1. Vector Quantization (LBG algorithm for codebook training)
2. JPEG Standard
3. MPEG Digital Video Coding Standards, IEEE Signal Processing Magazine, 1997
4. Digital Video Coding Standards, Proceedings of the IEEE, 1995

 
Tutorials:

Tutorial 1 : Image Fundamentals and Enhancement
Tutorial 2 : Image Compression
Tutorial 3 : Image Segmentation (solution) *New*
Tutorial 4 : Video Processing (solution) *New*

 
Flash Animations:

Dot Product or Correlation
Discrete Cosine Transform
Codebook Training
Motion Estimation

 
Assignment:

Assignment 1: To be submitted by 3 October 2006 (solution) *New*

 
Laboratory:

Laboratory 1: Digital Image Processing Using MATLAB
You are required to attend one of the following two sessions:
1. 23 September (Saturday), 2:00pm - 5:00pm
2. 26 September (Tuesday), 9:30am - 12:30pm
Additional hours: (1) 3 October 2006 (Tuesday), 6:30pm - 9:30pm (2) 5 October 2006 (Thursday), 6:30pm - 9:30pm
Room: CF105 and CF105a

The submission deadline has been extended to 20 October 2006 (Friday).

Laboratory 2: Analysis of MPEG video (Definition of the MB)
Date: 24 and 28 October 2006 (2:00pm - 5:00pm)
To be submitted on or before 20 November 2006

 
Announcement:

Consultation Hours:
Every Monday: 7:00pm - 8:30pm
Tutors: Tse Siu-Hong Thomas, Koo Hei-Sheung, and Choi Wing-Pong (DE503)

Test 1:
Date: 22 September 2006 (This Friday)

Our next two lectures will be held on:
1. 20 October 2006 (Friday)
2. 21 October 2006 (Saturday) Time: 2:00pm - 5:00pm, Room HJ305

Supplementary Test 1: (solution) *New*
If your score for Test 1 is lower than 60, or if you were absent from the test, you may attend this supplementary test.
Date: 28 October (Saturday)
Time: 1:00pm - 2:00pm
Place: CF105

Test 2: You may choose either one of the following two sessions (solution) *New*
1. 4 November 2006 (Saturday), Time: 2:30pm - 4:30pm, Place: TU101
2. 7 November 2006 (Tuesday), Time: 10:00am - 12:00 noon, Place: CF105

Topics to be covered: From "Lossy Compression" to the end of "Video Coding"

Revision and Additional Tutorial: (Please click here to download the revision list and here for the Supplementary Test 2)
We will hold a revision session followed by a tutorial; both are open to all students. You are expected to attend the revision part, which will last for 30 minutes. If you want to sit for Supplementary Test 2, you must also attend the tutorial. After the tutorial, you will be given a take-home test paper, which you must hand in on or before 4 December (Monday).
Date: 2 December (Saturday)
Time: 3:00pm
Place: TU107
Please click here to download the solution of Supplementary Test 2 *New*
{N.B. Don't spend too much time on this solution. Take a brief look only!}

If you are not available for the revision on 2 December (Saturday), you may attend another revision session to be held as follows:
Date: 5 December 2006 (Tuesday)
Time: 10:30am
Place: CF105

Additional Consultation Hours:
1. 2 November (Thursday), Time: 7:00pm - 8:30pm, Place: CD634
2. 6 November (Monday), Time: 7:00pm - 8:30pm, Place: CD634
3. 5 December (Tuesday), Time: 7:30pm - 9:00pm, Place: CD634
4. 7 December (Thursday), Time: 7:30pm - 9:00pm, Place: CD634

Useful links:

A gateway to

Comparison of Image file formats: PNG, JPG, TIF & GIF

Image Databases:

Benchmark for data compression (not just images)

 
Teaching schedule: Please click here to obtain the teaching schedule.