Publication:
Multimodal speaker identification using discriminative lip motion features

Publication Date

2009

Language

English

Type

Book Chapter

Abstract

This chapter presents a multimodal speaker identification system that integrates audio, lip texture, and lip motion modalities, in which the authors propose to use the "explicit" lip motion information that best represents the modality for the given problem. The work proceeds in two stages. First, several lip motion feature candidates are considered, such as dense motion features over the lip region, motion features along the outer lip contour, and lip shape features. The authors then introduce their main contribution: a novel two-stage, spatial-temporal discrimination analysis framework for selecting the best lip motion features, where, for speaker identification, the best features are those yielding the highest discrimination among speakers. Second, they investigate the benefit of including these best lip motion features in multimodal recognition. Audio, lip texture, and lip motion modalities are fused by the reliability weighted summation (RWS) decision rule, and hidden Markov model (HMM)-based modeling is performed for both unimodal and multimodal recognition. Experimental results show that the discriminative grid-based lip motion features are more valuable and provide additional performance gains in speaker identification. © 2009, IGI Global.
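The reliability weighted summation (RWS) rule mentioned in the abstract can be illustrated with a minimal sketch: each modality produces a normalized score per enrolled speaker, and the fused score is a weighted sum of the per-modality scores. The function name, the example scores, and the reliability weights below are illustrative assumptions, not values from the chapter.

```python
import numpy as np

def rws_fuse(scores, weights):
    """Fuse per-modality speaker scores by reliability-weighted summation.

    scores  : dict mapping modality name -> list of normalized scores,
              one score per enrolled speaker
    weights : dict mapping modality name -> reliability weight
              (assumed here to sum to 1)
    Returns (index of the speaker with the highest fused score, fused scores).
    """
    fused = sum(weights[m] * np.asarray(scores[m], dtype=float) for m in scores)
    return int(np.argmax(fused)), fused

# Illustrative (made-up) normalized scores for three enrolled speakers:
scores = {
    "audio":       [0.5, 0.3, 0.2],
    "lip_texture": [0.2, 0.5, 0.3],
    "lip_motion":  [0.3, 0.4, 0.3],
}
# Hypothetical reliability weights, e.g. reflecting per-modality confidence:
weights = {"audio": 0.5, "lip_texture": 0.3, "lip_motion": 0.2}

best, fused = rws_fuse(scores, weights)
```

In this toy run the lip modalities shift the decision toward speaker 1 even though the audio modality alone favors speaker 0 — the kind of complementary gain the chapter reports from adding discriminative lip motion features.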

Description

Source:

Visual Speech Recognition: Lip Segmentation and Mapping

Publisher:

IGI Global

Subject

Electrical electronics engineering, Computer engineering
