Research Projects of Dr. M.W. Mak

Discriminative Models for Biological Sequence Labeling and Segmentation

Because of the aging population, the pharmaceutical industry in China (including Hong Kong) has experienced strong growth in recent years. Labeling and segmenting amino acids in protein sequences, such as determining signal-peptide cleavage sites, is an important process in drug design. Because performing such tasks by experimental means is too costly and time consuming, machine learning techniques have become increasingly important for the pharmaceutical industry. Neural networks and hidden Markov models have been the prevailing machine learning approaches to determining the cleavage sites of signal peptides. These approaches, however, have two limitations: their performance is highly dependent on the feature encoding scheme, and they cannot properly model long-range dependencies between labels and amino acids. This project aims to alleviate these limitations by using discriminative models such as conditional random fields. To maximize the information extracted from protein sequences, the project proposes using the properties of short amino acid segments to determine real-valued feature functions for constructing conditional random fields. The use of real-valued features instead of Boolean ones allows us to use a wide range of amino acid properties that are relevant to the task, thus facilitating the incorporation of biological knowledge into the predictors. The ultimate goal of the project is a systematic selection of relevant features that can improve prediction accuracy. The proposed work will provide insight into some fundamental machine learning models. The proposed algorithm is also valuable to a variety of problem domains (e.g., speech and language processing) in which discriminative models play an important role.
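As a rough illustration of what a real-valued feature function looks like inside a linear-chain conditional random field, the Python sketch below scores residue labels with a windowed hydropathy average instead of a Boolean indicator. Everything specific in it is an assumption made for illustration and is not taken from the project: the two labels (S for signal peptide, M for mature protein), the choice of hydropathy as the amino acid property, the window width, and the hand-set weights W_STATE and W_TRANS.

```python
import math
from itertools import product

# Kyte-Doolittle hydropathy values for a handful of residues (illustrative subset).
HYDROPATHY = {"A": 1.8, "G": -0.4, "K": -3.9, "L": 3.8, "M": 1.9, "S": -0.8}
LABELS = ("S", "M")  # hypothetical labels: S = signal peptide, M = mature protein

# Hypothetical hand-set weights; a real predictor would learn these from data.
W_STATE = {"S": 0.5, "M": -0.2}
W_TRANS = {("S", "S"): 1.0, ("S", "M"): 0.0, ("M", "M"): 1.0, ("M", "S"): -5.0}


def window_hydropathy(seq, t, half_width=1):
    """Real-valued feature: mean hydropathy of the short segment centred at t."""
    lo, hi = max(0, t - half_width), min(len(seq), t + half_width + 1)
    return sum(HYDROPATHY[a] for a in seq[lo:hi]) / (hi - lo)


def state_score(seq, t, label):
    """Weighted real-valued state feature for one position and label."""
    return W_STATE[label] * window_hydropathy(seq, t)


def path_score(seq, labels):
    """Unnormalized log-score of one complete labeling."""
    score = state_score(seq, 0, labels[0])
    for t in range(1, len(seq)):
        score += W_TRANS[(labels[t - 1], labels[t])] + state_score(seq, t, labels[t])
    return score


def log_sum_exp(values):
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_partition(seq):
    """Forward algorithm: log of the sum of exp(path_score) over all labelings."""
    alpha = {y: state_score(seq, 0, y) for y in LABELS}
    for t in range(1, len(seq)):
        alpha = {
            y: log_sum_exp([alpha[p] + W_TRANS[(p, y)] for p in LABELS])
            + state_score(seq, t, y)
            for y in LABELS
        }
    return log_sum_exp(list(alpha.values()))


def log_prob(seq, labels):
    """Conditional log-probability of a labeling given the sequence."""
    return path_score(seq, labels) - log_partition(seq)


def best_labeling(seq):
    """Brute-force search, fine for a toy sequence; Viterbi would be used in practice."""
    return max(product(LABELS, repeat=len(seq)), key=lambda ys: path_score(seq, ys))
```

Calling best_labeling("MKLLAS") returns the highest-scoring label tuple for a toy sequence, and log_prob gives its conditional log-probability. Because the state feature is a real number, swapping in a different amino acid property only means replacing the lookup table.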

Investigators: M.W. Mak and S.Y. Kung

Funding Source: RGC Competitive Bids, 2009.

 

Self-Supervised Feature Selection for Sequence Classification in Bioinformatics

In recent years, we have witnessed strong growth in the pharmaceutical industry in China (including Hong Kong), primarily because of the aging population in this region. Classification of proteins based on their amino acid sequences is an important process in drug design. Because performing such a process by experimental means is too time consuming, machine learning techniques have become increasingly important for the pharmaceutical industry. One prevailing approach to protein classification is to perform pairwise comparisons between amino acid sequences. However, such a method can easily lead to the curse of dimensionality and demands considerable computational resources. This project aims to alleviate these limitations by using feature selection techniques, i.e., selecting the features that are relevant to the classification task and removing those that are redundant. The proposal introduces two new types of learning, namely self-supervised and symmetric-doubly supervised learning, for feature selection. These learning scenarios provide theoretical justification for why a particular set of features should be selected. To facilitate the fusion of different selection criteria and strategies, a pairwise scoring technique is proposed to convert the self-supervised scenario to the symmetric-doubly supervised one. The ultimate goal is a systematic selection of relevant features, which can improve prediction accuracy and computational efficiency. The proposed work will provide insight into some fundamental machine learning models. The proposed algorithm is also valuable to a variety of problem domains (e.g., biometrics) in which pairwise scoring plays an important role.

Investigators: M.W. Mak and S.Y. Kung

Funding Source: RGC Competitive Bids, 2008.

 

Homology-Based Kernel Methods for Sequence Classification in Bioinformatics

The aging population in Hong Kong and mainland China has led to significant growth in the pharmaceutical industry in recent years. Prediction of protein functions and subcellular locations is an important process in drug design. Because determining this information by experimental means is time consuming, machine learning has become an indispensable tool for the pharmaceutical industry to enhance the effectiveness and efficacy of drugs. Currently, subcellular locations of proteins are typically determined by looking at their corresponding amino acid sequences. Although the performance of sequence-based methods has been improving over the years, most of them lack a sound theoretical justification to guarantee similar performance on new data. This project aims to develop a kernel-based classification method for proteins’ subcellular localization and to provide a theoretical justification that ensures predictable performance on new sequences. We will investigate the trade-off between the diagonal dominance of kernel matrices and Mercer’s condition, which will lead to an effective design guideline for constructing kernel-based predictors. The ultimate goal is a systematic selection of kernels, which can improve prediction accuracy and computational efficiency. Our proposed algorithm is also valuable to a variety of problem domains (e.g., biometrics) in which kernel methods play an important role.
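One simple way to see the tension between diagonal dominance and Mercer’s condition is sketched below in Python. A symmetric similarity matrix that is not positive semidefinite cannot be a valid kernel matrix; adding a constant to its diagonal raises every eigenvalue by that constant and can repair it, but at the price of a more diagonally dominant matrix. The toy matrix S, the shift value, and the dominance measure are assumptions chosen for illustration, and the diagonal shift is a generic remedy, not necessarily the design guideline developed in the project.

```python
import math


def is_positive_definite(K, tol=1e-12):
    """Attempt a Cholesky factorization; it succeeds only if K is positive definite.

    Mercer's condition asks for positive semidefiniteness, so this strict test is
    a slightly conservative stand-in that needs no eigenvalue solver.
    """
    n = len(K)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = K[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= tol:
                    return False  # non-positive pivot: not positive definite
                L[i][i] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True


def diagonal_dominance(K):
    """Ratio of the mean diagonal entry to the mean absolute off-diagonal entry."""
    n = len(K)
    diag = sum(K[i][i] for i in range(n)) / n
    off = sum(abs(K[i][j]) for i in range(n) for j in range(n) if i != j) / (n * (n - 1))
    return diag / off


def shift_diagonal(K, lam):
    """Return K + lam * I, which raises every eigenvalue of K by lam."""
    n = len(K)
    return [[K[i][j] + (lam if i == j else 0.0) for j in range(n)] for i in range(n)]


# Hypothetical pairwise similarity scores for three sequences (not a valid kernel).
S = [[1.0, 0.9, 0.1],
     [0.9, 1.0, 0.9],
     [0.1, 0.9, 1.0]]

S_shifted = shift_diagonal(S, 0.3)
```

Here is_positive_definite(S) fails while is_positive_definite(S_shifted) succeeds, and diagonal_dominance(S_shifted) is larger than diagonal_dominance(S). A matrix that is too diagonally dominant treats every sequence as similar mainly to itself, which is why the shift should be kept as small as validity allows.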

Investigators: M.W. Mak and S.Y. Kung

Funding Source: RGC Competitive Bids, 2007.

 

Articulatory Feature-Based Pronunciation Modeling for Robust Speaker Verification

Conventional voice biometric systems typically model the vocal-tract characteristics of speakers by extracting the low-level spectral information from speech signals. These features, however, are known to be sensitive to channel mismatch and background noise. It is commonly believed that apart from using spectral contents, humans also recognize speakers based on their speaking style, prosody, intonation, accent, pronunciation characteristics, and so on. These high-level features carry the personality traits of individuals and are expected to be less susceptible to channel effects and background noise. This project aims to (1) capture the pronunciation characteristics of speakers by modeling how they articulate speech and (2) combine the pronunciation characteristics with spectral features for speaker verification. The project will provide new solutions to some of the practical problems encountered by speaker verification researchers today. These solutions will potentially help telecommunication service providers and financial service providers to open up new markets.

Investigators: M.W. Mak and S.Y. Kung

Funding Source: RGC Direct Allocation

 

Coherence Models for Microarray Data Analysis

With the recent advances in DNA microarray technology, it has become possible to measure the expression level of thousands of genes across hundreds of experimental conditions. The ability to discover hidden patterns in gene expression data has significant impact on drug design and the development of new treatments with maximum efficacy and minimum side effects.

Machine learning techniques offer a viable approach to cluster discovery from microarray data, which involves identifying and classifying biologically relevant groups in genes and conditions. It has been recognized that genes (whether or not they belong to the same gene group) may be co-expressed via a variety of pathways. Therefore, they can be adequately described by a diversity of coherence models. In fact, it is known that a gene may participate in multiple pathways that may or may not be co-active under all conditions. It is therefore biologically meaningful to simultaneously divide genes into functional groups and conditions into co-active categories, which leads to the so-called biclustering analysis. For this, we have proposed a comprehensive set of coherence models to cope with various plausible regulation processes. Furthermore, a multi-modality biclustering analysis based on the fusion of different coherence models appears to be promising because the expression levels of genes from the same group may follow more than one coherence model. This proposal aims to (1) extend our biclustering algorithms to more difficult genomic datasets (e.g., lymphoma) and (2) conduct performance analysis to confirm that the proposed multi-modality approach achieves high prediction performance.

Investigators: M.W. Mak and S.Y. Kung

Funding Source: PolyU Internal Competitive Research Grant, 2006.

 

Mobile Phone-Based Speaker Verification via Blind Stochastic Feature Transformation

While today’s speaker verification systems perform reasonably well under controlled conditions, their performance is often compromised under real-world environments. In particular, variations in handset characteristics are known to be the major cause of performance degradation. Research has found that the effect of handset variations can be greatly reduced if handset characteristics are known a priori. However, this requirement limits the scale of the systems because maintaining a handset database for storing the information of all possible handset models is a great challenge. Our proposal overcomes this problem by means of a blind feature transformation approach in which the transformation parameters are determined online without any a priori knowledge of handset characteristics, which makes the method more appropriate for large-scale deployment. The project will provide new solutions to some of the practical problems encountered by speaker verification researchers today. These solutions will potentially help telecommunication service providers and financial service providers to open up new markets.

Investigators: M.W. Mak and S.Y. Kung

Funding Source: RGC Competitive Bids, 2005.

 

Multi-Sample Decision Fusion for Biometric Verification

Over 85% of the population in Hong Kong uses mobile phones, and most of them are willing to carry out financial transactions over wireless networks. However, there is now a growing concern about the security of these transactions. In particular, prevailing remote access systems, which determine the eligibility of users by personal identity numbers, pose a high security risk. This project aims to improve the security of these systems by using biometric technologies. These technologies allow a system to verify its users on the basis of their physiological characteristics, such as voices, fingerprints and face patterns, or some aspect of behaviour, such as handwriting or keystroke patterns. Since the means for biometric systems to identify a person is not based on what he or she knows (a code), or possesses (a card), but on what he or she has (a characteristic), the possibility of forgery can be greatly reduced.

Most biometric authentication systems take one sample (e.g. an utterance or a video shot) from their users in a verification session. To improve the reliability of the verification decisions, some systems require their users to provide more than one sample during verification; the average scores of these samples are then used for making decisions on verification. However, averaging the scores may not produce optimal decisions because this approach considers the patterns in the samples as being equally reliable, which is often not the case in practice. In this project, we propose a novel approach to determining the reliability of individual frame-based feature vectors to combine the scores of the independent samples gathered from users during verification. The proposed fusion approach is very general and is potentially applicable to multi-sample, multi-modal biometric authentication. The project will provide new solutions to some of the practical problems encountered by biometrics researchers today. These solutions will potentially help telecommunication service providers and financial service providers to open up new markets.
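The difference between plain averaging and reliability-aware fusion can be sketched in a few lines of Python. The sketch assumes that each sample already comes with a non-negative reliability value; how that value is obtained is the subject of the project and is not modeled here, so the numbers and the function names are purely hypothetical.

```python
def average_fusion(scores):
    """Baseline: every sample contributes equally to the session score."""
    return sum(scores) / len(scores)


def weighted_fusion(scores, reliabilities):
    """Reliability-weighted mean: more reliable samples count for more."""
    if len(scores) != len(reliabilities):
        raise ValueError("one reliability value is needed per score")
    total = sum(reliabilities)
    if total <= 0:
        raise ValueError("reliabilities must be non-negative with a positive sum")
    return sum(r * s for r, s in zip(reliabilities, scores)) / total


# Hypothetical verification scores from three utterances in one session;
# the second utterance is assumed to be noisy and therefore unreliable.
scores = [1.2, -0.4, 0.9]
reliabilities = [0.9, 0.1, 0.8]

session_avg = average_fusion(scores)
session_weighted = weighted_fusion(scores, reliabilities)
```

With these toy numbers the unreliable second score pulls the plain average down much more than the weighted score. When all reliabilities are equal, weighted_fusion reduces to average_fusion, so averaging is the special case in which every sample is trusted equally.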

Investigators: M.W. Mak and S.Y. Kung

Funding Source: RGC Competitive Bids, 2004.

 

Probabilistic Decision Fusion for Multimodal Person Verification

Financial transactions over wireless networks have become increasingly popular in recent years. However, there is now a growing concern about the security of these transactions. In particular, prevailing remote access systems, which determine the eligibility of users by personal identity numbers, pose a high security risk. This project aims to improve the security of these systems by combining two biometric technologies: speaker verification and face recognition. These technologies allow a system to verify its users by recognizing the unique characteristics contained in the users’ voice and face. Current biometric authentication systems typically consider one biometric feature only (e.g., face, voice, or fingerprint). While these systems perform reasonably well under controlled conditions, their performance is often compromised in real-world environments. We propose to improve the robustness of these systems by fusing the information gathered from both the audio and visual modalities. A novel approach to determining the reliability of the audio and visual sources is proposed. This reliability information will be used to combine the decisions made by the classifiers in the two modalities. The project will provide new solutions to some of the practical problems encountered by biometrics researchers today. These solutions will potentially help security product manufacturers and financial service providers to open up new markets.

 

Investigators: M.W. Mak and S.Y. Kung

Funding Source: Central Research Grant

 

Towards Multi-modal Human-computer Dialog Interactions with Minimally Intrusive Biometric Security Functions

This is a group research project involving researchers from three universities in Hong Kong: CUHK, HKPolyU and HKUST. The project aims to develop human-centric interface technologies to support secure computing by a diversity of users in a variety of usage contexts. The work to be done at HKPolyU includes the following:

  • Cross validation of speech data integrity via lip-tracking for biometric applications

  • Reducing transducer distortions for speaker authentication

Investigators:
Name             Institution
CHING, P.C.      Dept. of Electronic Eng., CUHK
MAK, Brian       Dept. of Computer Science, HKUST
MAK, Man Wai     Dept. of Electronic and Information Eng., HKPolyU
MENG, Helen      Dept. of Systems Eng. & Eng. Management, CUHK
MOON, Y.S.       Dept. of Computer Science & Eng., CUHK
SIU, Man Hung    Dept. of Electrical and Electronic Eng., HKUST
LEE, Tan         Dept. of Electronic Eng., CUHK
TANG, Xiao Ou    Dept. of Information Eng., CUHK

Funding Source: RGC Central Allocation Vote

 

Environment Adaptation for Distributed Speaker Verification

This project aims to develop environment adaptation techniques (including feature transformation and model adaptation) for speaker verification over wireless networks and the Internet. Another purpose of this project is to combine the environment adaptation techniques with a client-side front-end processing approach recently standardized by the European Telecommunications Standards Institute (ETSI) for distributed speaker verification.

Investigator: M.W. Mak

Funding Source: RGC Direct Allocation

 

Non-linear Stochastic Matching for Robust Speaker Verification

While today’s speaker verification systems perform reasonably well under controlled conditions, their performance is often compromised under real-world environments. In particular, variations in handset characteristics are known to be the major cause of performance degradation. Our proposal is to minimize the effects resulting from transducer variation. The proposed approaches overcome the limitations of conventional channel compensation methods by looking at the non-linear characteristics of telephone handsets. A novel non-linear probabilistic transformation method will be derived and evaluated. The project will provide new solutions to some of the practical problems encountered by speaker verification researchers today. These solutions will potentially help security product manufacturers and telephone-based transaction service providers open up new markets.

 

Investigators: M.W. Mak and S.Y. Kung

Funding Source: RGC Competitive Bids (PolyU 5131/02E)

 

Handset Mismatch Compensation for Robust Speaker Verification

This project aims at (1) developing handset mismatch compensation algorithms for speaker verification systems and (2) constructing a Cantonese telephone speech corpus for speaker verification research. Most channel compensation techniques assume that the telephone channel can be approximated by a linear filter. However, telephone handsets typically exhibit non-linear characteristics, suggesting that linear filtering addresses only part of the problem. For this project, we propose a non-linear feature mapper and a probabilistic channel equalizer that integrate the non-linear handset characteristics into the channel compensation process.
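The linear-filter assumption has a convenient consequence: a fixed linear channel multiplies the speech spectrum, so it appears as an approximately constant additive offset in the cepstral domain, and subtracting the per-utterance mean removes it. The Python sketch below illustrates this standard technique (cepstral mean subtraction) only as an example of linear compensation; it is not the non-linear feature mapper or the probabilistic equalizer proposed in the project, and the toy frame values and the helper add_channel_bias are hypothetical.

```python
def cepstral_mean_subtraction(frames):
    """Subtract the per-coefficient mean taken over all frames of one utterance."""
    n = len(frames)
    dim = len(frames[0])
    means = [sum(frame[d] for frame in frames) / n for d in range(dim)]
    return [[frame[d] - means[d] for d in range(dim)] for frame in frames]


def add_channel_bias(frames, bias):
    """Simulate a purely linear channel: the same offset is added to every frame."""
    return [[c + b for c, b in zip(frame, bias)] for frame in frames]


# Hypothetical two-dimensional cepstral vectors for three frames.
clean = [[1.0, 2.0], [3.0, -1.0], [0.5, 0.5]]
distorted = add_channel_bias(clean, [0.7, -0.3])

compensated_clean = cepstral_mean_subtraction(clean)
compensated_distorted = cepstral_mean_subtraction(distorted)
```

After mean subtraction the clean and distorted utterances give the same features, so a constant offset is fully cancelled. A non-linear handset distortion is not a constant offset, which is why this kind of compensation addresses only part of the problem.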
 

Funding Source: RGC Competitive Bids (PolyU 5129/01E)

Investigators: M.W. Mak and S.Y. Kung

 

Stochastic Model Adaptation for Robust Speech/Speaker Recognition

The performance of current speech/speaker recognition systems is often affected by the acoustic environment in which the systems are operated. For example, in telephone-based speaker verification, speakers tend to use different telephone handsets in different environments (e.g., office and home). Variation in handset characteristics can introduce severe speech variability even though the speech is uttered by the same speaker. Therefore, it is very important for a speaker model to be able to accommodate new acoustic environments. Furthermore, a practical speaker verification system also needs to adapt itself to accommodate changes in speaker characteristics over time, because speakers often sound different from time to time, a phenomenon known as intra-speaker variability. In this project, we propose to address these issues by developing a temporally adaptive probabilistic neural network. Training algorithms and adaptation mechanisms, based on our previous work on neural network learning algorithms, will be derived. The network performance will be evaluated using real-world data.
 

Investigators: M.W. Mak and W.C. Siu

Funding Source: ASD Project

 

Acoustic and Voice Processing

In recent years, speech recognition systems, internet telephony, and video conferencing systems have been employed in a variety of real environments. However, in many practical situations, ambient noise, reverberation, and poor-quality microphones can degrade the performance of these systems drastically. Therefore, it is necessary to develop enhancement algorithms to improve the performance of these systems in adverse acoustic environments. This project investigates microphone characteristics and the human auditory system in order to enhance channel-distorted, noisy speech for robust speech recognition and teleconferencing.
 

Investigators: M.W. Mak and W.C. Siu

Funding Source: ASD Project

 

Stochastic Matching Techniques for Robust Speaker Recognition

Today’s speaker recognition systems have reached a very high level of performance in laboratory environments. However, several technical issues (such as channel robustness) need to be resolved before these systems can be commercialized. This project aims to resolve these issues. In particular, it aims to develop a set of model-based and feature-based transformation techniques for robust speaker recognition. Parameter estimation algorithms based on the maximum likelihood (ML) and maximum a posteriori (MAP) principles will be derived.
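To make the ML/MAP distinction concrete, the Python sketch below estimates the simplest possible feature-based transformation, a single additive bias, in both ways. It assumes a deliberately reduced setting that is not taken from the project: one-dimensional features, a clean-speech model that is a single Gaussian with known mean and variance, and a zero-mean Gaussian prior on the bias. Under those assumptions the MAP estimate is the ML estimate shrunk toward zero.

```python
def ml_bias(observed, clean_mean):
    """ML estimate of b in y = x + b when x ~ N(clean_mean, noise_var)."""
    return sum(observed) / len(observed) - clean_mean


def map_bias(observed, clean_mean, noise_var, prior_var):
    """MAP estimate of b under a zero-mean Gaussian prior N(0, prior_var).

    The posterior mode is the ML estimate multiplied by a shrinkage factor
    that approaches 1 as more frames are observed or as the prior widens.
    """
    n = len(observed)
    shrink = (n * prior_var) / (n * prior_var + noise_var)
    return shrink * ml_bias(observed, clean_mean)


# Hypothetical distorted feature values from one short utterance.
observed = [2.1, 1.9, 2.3, 2.0]

b_ml = ml_bias(observed, clean_mean=1.0)
b_map = map_bias(observed, clean_mean=1.0, noise_var=1.0, prior_var=0.25)
```

With only four frames the prior halves the estimated bias; with a very wide prior or many frames the two estimates coincide. That behaviour is the usual reason for preferring MAP estimation when adaptation data are scarce.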

 

Investigator: M.W. Mak

Funding Source: Central Research Grant

 


 


M.W. Mak's Homepage

http://www.eie.polyu.edu.hk/~mwmak/mypage.htm