Publication:
Multimodal speaker identification using discriminative lip motion features

dc.contributor.department: Department of Electrical and Electronics Engineering
dc.contributor.department: Department of Computer Engineering
dc.contributor.department: Graduate School of Sciences and Engineering
dc.contributor.kuauthor: Çetingül, Hasan Ertan
dc.contributor.kuauthor: Erzin, Engin
dc.contributor.kuauthor: Tekalp, Ahmet Murat
dc.contributor.kuauthor: Yemez, Yücel
dc.contributor.schoolcollegeinstitute: College of Engineering
dc.contributor.schoolcollegeinstitute: GRADUATE SCHOOL OF SCIENCES AND ENGINEERING
dc.date.accessioned: 2024-11-09T23:35:37Z
dc.date.issued: 2009
dc.description.abstract: This chapter presents a multimodal speaker identification system that integrates audio, lip texture, and lip motion modalities; the authors propose to use the "explicit" lip motion information that best represents the modality for the given problem. The work is presented in two stages. First, they consider several lip motion feature candidates, such as dense motion features on the lip region, motion features on the outer lip contour, and lip shape features. Alongside these candidates, they introduce their main contribution: a novel two-stage, spatiotemporal discrimination analysis framework designed to obtain the best lip motion features, where, for speaker identification, the best lip motion features are those yielding the highest discrimination among speakers. Next, they investigate the benefit of including the best lip motion features in multimodal recognition. Audio, lip texture, and lip motion modalities are fused by the reliability weighted summation (RWS) decision rule, and hidden Markov model (HMM)-based modeling is performed for both unimodal and multimodal recognition. Experimental results show that discriminative grid-based lip motion features are more valuable and provide additional performance gains in speaker identification. © 2009, IGI Global. (A minimal RWS fusion sketch follows the metadata record below.)
dc.description.indexedby: Scopus
dc.description.openaccess: YES
dc.description.publisherscope: International
dc.description.sponsoredbyTubitakEu: N/A
dc.identifier.doi: 10.4018/978-1-60566-186-5.ch016
dc.identifier.isbn: 978-1-60566-186-5
dc.identifier.quartile: N/A
dc.identifier.scopus: 2-s2.0-84900179389
dc.identifier.uri: https://doi.org/10.4018/978-1-60566-186-5.ch016
dc.identifier.uri: https://hdl.handle.net/20.500.14288/12531
dc.language.iso: eng
dc.publisher: IGI Global
dc.relation.ispartof: Visual Speech Recognition: Lip Segmentation and Mapping
dc.subject: Electrical electronics engineering
dc.subject: Computer engineering
dc.title: Multimodal speaker identification using discriminative lip motion features
dc.type: Book Chapter
dspace.entity.type: Publication
local.contributor.kuauthor: Tekalp, Ahmet Murat
local.contributor.kuauthor: Erzin, Engin
local.contributor.kuauthor: Yemez, Yücel
local.contributor.kuauthor: Çetingül, Hasan Ertan
local.publication.orgunit1: College of Engineering
local.publication.orgunit1: GRADUATE SCHOOL OF SCIENCES AND ENGINEERING
local.publication.orgunit2: Department of Electrical and Electronics Engineering
local.publication.orgunit2: Department of Computer Engineering
local.publication.orgunit2: Graduate School of Sciences and Engineering
relation.isOrgUnitOfPublication: 21598063-a7c5-420d-91ba-0cc9b2db0ea0
relation.isOrgUnitOfPublication: 89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isOrgUnitOfPublication: 3fc31c89-e803-4eb1-af6b-6258bc42c3d8
relation.isOrgUnitOfPublication.latestForDiscovery: 21598063-a7c5-420d-91ba-0cc9b2db0ea0
relation.isParentOrgUnitOfPublication: 8e756b23-2d4a-4ce8-b1b3-62c794a8c164
relation.isParentOrgUnitOfPublication: 434c9663-2b11-4e66-9399-c863e2ebae43
relation.isParentOrgUnitOfPublication.latestForDiscovery: 8e756b23-2d4a-4ce8-b1b3-62c794a8c164
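
The abstract's fusion step, the reliability weighted summation (RWS) decision rule, amounts to a reliability-weighted sum of per-modality classifier scores followed by a maximum decision. Below is a minimal sketch in Python, assuming each modality's HMM classifier has already produced one log-likelihood per enrolled speaker; the min-max score normalization and the fixed reliability weights are illustrative assumptions, not the chapter's exact formulation.

```python
import numpy as np

def rws_fuse(scores_per_modality, reliabilities):
    """Sketch of reliability weighted summation (RWS) fusion.

    scores_per_modality: one 1-D array per modality, each holding
        an HMM log-likelihood score per enrolled speaker.
    reliabilities: one weight per modality (fixed here for
        illustration; the chapter derives them from classifier
        reliability estimates).
    Returns the index of the identified speaker.
    """
    fused = np.zeros(len(scores_per_modality[0]), dtype=float)
    for scores, w in zip(scores_per_modality, reliabilities):
        s = np.asarray(scores, dtype=float)
        # Min-max normalize each modality's scores so no modality
        # dominates the weighted sum merely through a wider
        # log-likelihood range (an illustrative choice).
        rng = s.max() - s.min()
        s = (s - s.min()) / rng if rng > 0 else np.zeros_like(s)
        fused += w * s
    return int(np.argmax(fused))

# Hypothetical scores for four enrolled speakers from three modalities.
audio       = np.array([-210.0, -195.5, -230.2, -205.1])
lip_texture = np.array([ -88.0,  -81.3,  -90.7,  -85.9])
lip_motion  = np.array([-120.4, -112.9, -125.0, -118.6])

speaker = rws_fuse([audio, lip_texture, lip_motion],
                   reliabilities=[0.5, 0.3, 0.2])
print("identified speaker index:", speaker)  # -> 1
```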
