Publication:
Discriminative analysis of lip motion features for speaker identification and speech-reading

dc.contributor.department: Department of Computer Engineering
dc.contributor.department: Graduate School of Sciences and Engineering
dc.contributor.kuauthor: Çetingül, Hasan Ertan
dc.contributor.kuauthor: Erzin, Engin
dc.contributor.kuauthor: Tekalp, Ahmet Murat
dc.contributor.kuauthor: Yemez, Yücel
dc.contributor.schoolcollegeinstitute: College of Engineering
dc.contributor.schoolcollegeinstitute: GRADUATE SCHOOL OF SCIENCES AND ENGINEERING
dc.date.accessioned: 2024-11-09T23:34:15Z
dc.date.issued: 2006
dc.description.abstract: There have been several studies that jointly use audio, lip intensity, and lip geometry information for speaker identification and speech-reading applications. This paper proposes using explicit lip motion information, instead of or in addition to lip intensity and/or geometry information, for speaker identification and speech-reading within a unified feature selection and discrimination analysis framework, and addresses two important issues: 1) is using explicit lip motion information useful, and 2) if so, what are the best lip motion features for these two applications? The best lip motion features for speaker identification are considered to be those that result in the highest discrimination of individual speakers in a population, whereas for speech-reading, the best features are those providing the highest phoneme/word/phrase recognition rate. Several lip motion feature candidates have been considered, including dense motion features within a bounding box about the lip, lip contour motion features, and combinations of these with lip shape features. Furthermore, a novel two-stage, spatial, and temporal discrimination analysis is introduced to select the best lip motion features for speaker identification and speech-reading applications. Experimental results using a hidden-Markov-model-based recognition system indicate that using explicit lip motion information provides additional performance gains in both applications, and lip motion features prove more valuable in the case of the speech-reading application.
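The abstract describes selecting lip motion features by how well they discriminate classes (speakers, or phonemes/words). As a minimal illustrative sketch only — not the paper's exact two-stage criterion — a per-feature Fisher discriminant ratio captures the same idea of ranking features by between-class versus within-class variance:

```python
import numpy as np

def fisher_ratio(X, y):
    """Score each feature by between-class vs. within-class variance.

    X: (n_samples, n_features) feature matrix
    y: (n_samples,) class labels (e.g., speaker IDs)
    Returns one discriminability score per feature; higher is better.
    """
    classes = np.unique(y)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in classes:
        Xc = X[y == c]
        class_mean = Xc.mean(axis=0)
        between += len(Xc) * (class_mean - overall_mean) ** 2
        within += ((Xc - class_mean) ** 2).sum(axis=0)
    return between / np.maximum(within, 1e-12)

# Toy data: feature 0 separates two hypothetical "speakers"; feature 1 is noise.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([0.0, 0.0], 1.0, (50, 2)),
               rng.normal([5.0, 0.0], 1.0, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
scores = fisher_ratio(X, y)
best = int(np.argmax(scores))  # feature 0 should rank highest
```

In the paper's setting the candidate features would be dense or contour-based lip motion descriptors rather than this toy data, and the selected features would then feed an HMM-based recognizer.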
dc.description.indexedby: WOS
dc.description.indexedby: Scopus
dc.description.indexedby: PubMed
dc.description.issue: 10
dc.description.openaccess: YES
dc.description.sponsoredbyTubitakEu: N/A
dc.description.volume: 15
dc.identifier.doi: 10.1109/TIP.2006.877528
dc.identifier.eissn: 1941-0042
dc.identifier.issn: 1057-7149
dc.identifier.scopus: 2-s2.0-33749187783
dc.identifier.uri: https://doi.org/10.1109/TIP.2006.877528
dc.identifier.uri: https://hdl.handle.net/20.500.14288/12317
dc.identifier.wos: 240776200003
dc.keywords: Bayesian discriminative feature selection
dc.keywords: Lip motion
dc.keywords: Speaker identification
dc.keywords: Speech recognition
dc.keywords: Temporal discriminative feature selection
dc.keywords: Recognition
dc.keywords: Segmentation
dc.language.iso: eng
dc.publisher: IEEE-Inst Electrical Electronics Engineers Inc
dc.relation.ispartof: IEEE Transactions on Image Processing
dc.subject: Computer science
dc.subject: Artificial intelligence
dc.subject: Engineering
dc.subject: Electrical electronic engineering
dc.title: Discriminative analysis of lip motion features for speaker identification and speech-reading
dc.type: Journal Article
dspace.entity.type: Publication
local.contributor.kuauthor: Çetingül, Hasan Ertan
local.contributor.kuauthor: Yemez, Yücel
local.contributor.kuauthor: Erzin, Engin
local.contributor.kuauthor: Tekalp, Ahmet Murat
local.publication.orgunit1: GRADUATE SCHOOL OF SCIENCES AND ENGINEERING
local.publication.orgunit1: College of Engineering
local.publication.orgunit2: Department of Computer Engineering
local.publication.orgunit2: Graduate School of Sciences and Engineering
relation.isOrgUnitOfPublication: 89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isOrgUnitOfPublication: 3fc31c89-e803-4eb1-af6b-6258bc42c3d8
relation.isOrgUnitOfPublication.latestForDiscovery: 89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isParentOrgUnitOfPublication: 8e756b23-2d4a-4ce8-b1b3-62c794a8c164
relation.isParentOrgUnitOfPublication: 434c9663-2b11-4e66-9399-c863e2ebae43
relation.isParentOrgUnitOfPublication.latestForDiscovery: 8e756b23-2d4a-4ce8-b1b3-62c794a8c164