Selected Projects of My Undergraduate Students

BEng(Hons) Projects

  1. Artificial Reverberator
  2. Speech Signal Analyzer
  3. 3-D Sound Based on Head-Related Transfer Function (works in Microsoft IE only)
  4. Online Ticketing

BScIT(Hons) Projects

    1. Restaurant Management System I (video demo, presentation)
    2. Restaurant Management System II (video demo 1, video demo 2, presentation)
    3. Spoken Dialog System for Facility Booking (demo)
    4. Internet Phone
    5. Voice over IP Gateway
    6. Voice Mixing for Internet Telephony

 

Proposed Undergraduate Projects for 2003/04

Project 1: An Interactive Voice Response System Based on VoiceXML and Microsoft .NET Speech SDK

Large-vocabulary continuous speech recognition (LVCSR) has a wide range of applications. With recent advances in speech and computer technologies, building an LVCSR system on a personal computer has become possible. In this project, you will develop a three-tier interactive voice response system (IVRS) based on VoiceXML and the Microsoft .NET Speech SDK. The software should allow its users to reserve movie tickets over the telephone.

Pre-requisite: Strong programming skills

What will you learn: XML, VoiceXML, ASP.NET, .NET, speech recognition

 

Project 2: Speaker Adaptation for Continuous Speech Recognition using HTK Toolkit

Large-vocabulary continuous speech recognition (LVCSR) has a wide range of applications. With recent advances in speech and computer technologies, building an LVCSR system on personal computers has become possible. In this project, you will develop a continuous speech recognition system based on Cambridge University's HTK Toolkit (http://htk.eng.cam.ac.uk). The software will run on Linux platforms and should be able to transcribe continuous speech into text. It should also be able to adapt the speech models to accommodate variation in speaker characteristics.

Pre-requisite: C/C++, Unix, and signal processing

What will you learn: Speech recognition, speaker adaptation, language model, HMM

 

Project 3: On-Line Ticket Ordering System Based on J2ME and Java Servlets

An on-line ticket ordering system (http://158.132.151.129:8080/cinema/home) has been developed for the subject EIE420 “Software Engineering for Web Applications”. The system is based on Java Servlet technology and allows users to order movie tickets via Web browsers running on client PCs. This project will extend the system so that PDAs and mobile phones with an Internet connection can also access it. To this end, MIDlets will be developed and downloaded to PDAs or mobile phones, and the existing servlets will be modified to accommodate the limited computing power of handheld devices.

Pre-requisite: Java, servlets, Unix, Web programming

What will you learn: J2ME, servlets, Web programming

 

Project 4: An Interactive Software Package for Learning Speech Processing

Multimedia teaching tools are widely used in universities to help students understand abstract concepts and theories. This project aims to add new features to a Windows-based multimedia learning tool (http://www.eie.polyu.edu.hk/~mwmak/Download.htm) that helps students learn the concepts of audio and speech processing. The new features include displaying formant tracks and pitch envelopes. Visual C++ and the Microsoft Foundation Class (MFC) libraries will be used in this project. The current version of the software can be found at http://www.en.polyu.edu.hk/~mwmak/Download.htm.

Pre-requisite: C/C++, Visual C++, and DSP

What will you learn: Speech processing techniques, Microsoft MFC, multi-threaded programming, and audio programming

 

 

Project 5: A Multimedia Software Tool for Speech Coder Design

Multimedia teaching tools are widely used in universities to help students understand abstract concepts and theories. This project aims to add new features to a Windows-based multimedia learning tool that helps students learn the concepts of speech coding. The software tool should allow users to change the parameters of speech coders through user interface controls. Visual C++ and the Microsoft Foundation Class (MFC) libraries will be used in this project. The current version of the software can be found at http://www.en.polyu.edu.hk/~mwmak/Download.htm.

Pre-requisite: C/C++, Visual C++, and DSP

What will you learn: Speech coding techniques, Microsoft MFC, multi-threaded programming, and audio programming

 

Group Project 6&7: Client/Server Architecture for Distributed Speaker Verification

The European Telecommunications Standards Institute (ETSI) has recently published a front-end processing standard for distributed speech recognition. The standard allows speech features to be extracted on handheld devices and transmitted to remote servers for recognition. This project is divided into two parts. In the first part, you will apply the ETSI standard to implement the front-end of a distributed speaker verification system through which users’ identities can be authenticated over IP and wireless networks. In the second part, you will develop an on-line speaker verification system based on Gaussian mixture speaker models and support vector machines.

Pre-requisite: C/C++, DSP concepts, Unix

Knowledge and skills to be learnt: Distributed systems, distributed speaker recognition, audio programming, Gaussian mixture models, support vector machines. 
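
As a rough illustration of how Gaussian mixture speaker models can be used for verification, the sketch below scores a sequence of feature vectors against a claimed speaker's model and against a second model, then thresholds the average log-likelihood ratio. The background model, the diagonal covariances, the function names, and the threshold are all assumptions made for this example; the project description does not specify them.

```python
import math


def gmm_log_likelihood(frame, weights, means, variances):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM."""
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for x, m, v in zip(frame, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        log_terms.append(log_p)
    # Log-sum-exp over mixture components for numerical stability.
    peak = max(log_terms)
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))


def verification_score(frames, speaker_gmm, background_gmm):
    """Average log-likelihood ratio (speaker vs. background) over all frames.

    Each GMM is a (weights, means, variances) tuple; this layout is a
    hypothetical convention chosen for the sketch.
    """
    total = 0.0
    for frame in frames:
        total += gmm_log_likelihood(frame, *speaker_gmm)
        total -= gmm_log_likelihood(frame, *background_gmm)
    return total / len(frames)


def accept(frames, speaker_gmm, background_gmm, threshold=0.0):
    """Accept the identity claim if the score exceeds the (assumed) threshold."""
    return verification_score(frames, speaker_gmm, background_gmm) > threshold
```

In a distributed setting, the frames would be the feature vectors received from the handheld front-end, and the scoring would run on the server.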

 

Project 8: Fixed-point Implementation of G.723.1 Speech Coder

G.723.1 is a speech compression algorithm standardized by the International Telecommunication Union (ITU) for multimedia, visual telephony, wireless telephony, and videoconferencing products. The coder delivers the highest compression ratio of any of the current ITU standards without compromising speech quality. In this project, you will port the ITU’s reference source code to the Texas Instruments TMS320C5416 DSP chip using Code Composer Studio and the DSP starter kit. The resulting software should be able to encode and decode speech using G.723.1 in real time.

Pre-requisite: C/C++, DSP concepts

What will you learn: Speech processing techniques, speech coding, DSP programming, fixed-point implementation techniques, real-time programming
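
To give a feel for the fixed-point techniques involved, the sketch below models 16-bit Q15 arithmetic (values in [-1, 1) stored as signed 16-bit integers) with rounding and saturation. It is a generic illustration written in Python for readability, not code from the ITU reference source or the TI tool chain, and the function names are hypothetical.

```python
Q15_ONE = 1 << 15   # scale factor: 1.0 corresponds to 32768
Q15_MAX = 32767     # largest representable value, just under 1.0
Q15_MIN = -32768    # smallest representable value, exactly -1.0


def saturate(value):
    """Clamp an integer to the signed 16-bit range instead of wrapping."""
    return max(Q15_MIN, min(Q15_MAX, value))


def float_to_q15(x):
    """Convert a float in roughly [-1, 1) to Q15, saturating at the limits."""
    return saturate(int(round(x * Q15_ONE)))


def q15_to_float(q):
    """Convert a Q15 integer back to a float."""
    return q / Q15_ONE


def q15_mul(a, b):
    """Multiply two Q15 values: widen, round, shift right by 15, saturate."""
    return saturate((a * b + (1 << 14)) >> 15)


def q15_add(a, b):
    """Add two Q15 values with saturation."""
    return saturate(a + b)
```

For example, `q15_mul(float_to_q15(0.5), float_to_q15(0.5))` yields 8192, which is 0.25 in Q15, while `q15_add(30000, 10000)` saturates to 32767 rather than overflowing.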

 

M.W. Mak's homepage