Joint audio-video processing for biometric speaker identification

dc.conference.location	Baltimore, USA
dc.conference.organizer	4th International Conference on Multimedia and Expo (ICME 2003)
dc.contributor.department	MVGL (Multimedia, Vision and Graphics Laboratory)
dc.contributor.facultymember	Yes
dc.contributor.kuauthor	Erzin, Engin
dc.contributor.kuauthor	Kanak, Alper
dc.contributor.kuauthor	Tekalp, Ahmet Murat
dc.contributor.kuauthor	Yemez, Yücel
dc.contributor.schoolcollegeinstitute	Laboratory
dc.date	JUL 06-09, 2003
dc.date.accessioned	2024-11-09T22:59:03Z
dc.date.issued	2003
dc.description.abstract	We present a bimodal audio-visual speaker identification system. The objective is to improve the recognition performance over conventional unimodal schemes. The proposed system exploits not only the temporal and spatial correlations existing in the speech and video signals of a speaker, but also the cross-correlation between these two modalities. Lip images extracted from each video frame are transformed onto an eigenspace. The obtained eigenlip coefficients are interpolated to match the rate of the speech signal and fused with Mel frequency cepstral coefficients (MFCC) of the corresponding speech signal. The resulting joint feature vectors are used to train and test a hidden Markov model (HMM) based identification system. Experimental results are included to demonstrate the system performance.
dc.description.fulltext	No
dc.description.harvestedfrom	Manual
dc.description.indexedby	WOS
dc.description.indexedby	Scopus
dc.description.openaccess	NO
dc.description.peerreviewstatus	N/A
dc.description.publisherscope	International
dc.description.readpublish	N/A
dc.description.sponsoredbyTubitakEu	N/A
dc.description.studentonlypublication	No
dc.description.studentpublication	Yes
dc.description.version	N/A
dc.identifier.WoSQuartile	N/A
dc.identifier.embargo	N/A
dc.identifier.endpage	564
dc.identifier.isbn	0780379659
dc.identifier.scopus	2-s2.0-0141590222
dc.identifier.startpage	561
dc.identifier.uri	https://hdl.handle.net/20.500.14288/7831
dc.identifier.wos	000185328600095
dc.keywords	Speech
dc.keywords	Multimodal biometrics
dc.keywords	Face recognition
dc.language.iso	eng
dc.publisher	Institute of Electrical and Electronics Engineers
dc.relation.affiliation	Koç University
dc.relation.collection	Koç University Institutional Repository
dc.relation.ispartof	2003 International Conference on Multimedia and Expo, Vol III, Proceedings
dc.relation.openaccess	N/A
dc.rights	N/A
dc.subject	Computer science
dc.subject	Artificial intelligence
dc.subject	Engineering
dc.subject	Electrical and electronic engineering
dc.subject	Imaging science
dc.subject	Photographic technology
dc.title	Joint audio-video processing for biometric speaker identification
dc.type	Conference Proceeding
dspace.entity.type	Publication
local.contributor.kuauthor	Kanak, Alper
local.contributor.kuauthor	Erzin, Engin
local.contributor.kuauthor	Yemez, Yücel
local.contributor.kuauthor	Tekalp, Ahmet Murat
relation.isOrgUnitOfPublication	cb6bbbf6-fd19-4052-b581-f591a9748d21
relation.isOrgUnitOfPublication.latestForDiscovery	cb6bbbf6-fd19-4052-b581-f591a9748d21
relation.isParentOrgUnitOfPublication	20385dee-35e7-484b-8da6-ddcc08271d96
relation.isParentOrgUnitOfPublication.latestForDiscovery	20385dee-35e7-484b-8da6-ddcc08271d96
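
The feature pipeline described in the abstract (eigenlip projection, interpolation to the speech feature rate, fusion with MFCCs) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the 15 fps video rate, the 100 Hz audio feature rate, the image size, the feature dimensions, and the use of linear interpolation are all assumptions, since the record gives none of these details. The MFCCs are treated as a precomputed input (random placeholders here), and the HMM training step is omitted.

```python
import numpy as np


def fit_eigenlips(lip_images, n_components):
    """Estimate a PCA basis from flattened lip images of shape (n_frames, n_pixels)."""
    mean = lip_images.mean(axis=0)
    # Rows of vt are orthonormal principal directions of the centered data.
    _, _, vt = np.linalg.svd(lip_images - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(lip_images, mean, basis):
    """Project flattened lip images onto the eigenlip basis."""
    return (lip_images - mean) @ basis.T


def interpolate_to_audio_rate(coeffs, video_fps, n_audio_frames, audio_rate):
    """Linearly interpolate each coefficient track onto the audio frame times."""
    t_video = np.arange(coeffs.shape[0]) / video_fps
    t_audio = np.arange(n_audio_frames) / audio_rate
    return np.column_stack(
        [np.interp(t_audio, t_video, coeffs[:, k]) for k in range(coeffs.shape[1])]
    )


def fuse(mfcc, lip_coeffs):
    """Concatenate audio and visual features frame by frame."""
    return np.hstack([mfcc, lip_coeffs])


# Synthetic placeholder data (assumed sizes): 2 s of video and audio features.
rng = np.random.default_rng(0)
video_fps, audio_rate = 15, 100            # assumed rates, not from the source
lip_images = rng.random((30, 16 * 24))     # 30 frames of 16x24 lip regions
mfcc = rng.random((200, 13))               # stand-in for real MFCC vectors

mean, basis = fit_eigenlips(lip_images, n_components=8)
eigenlip_coeffs = project(lip_images, mean, basis)
lip_at_audio_rate = interpolate_to_audio_rate(
    eigenlip_coeffs, video_fps, mfcc.shape[0], audio_rate
)
joint = fuse(mfcc, lip_at_audio_rate)
```

Each row of `joint` is one audio-rate frame holding the MFCC vector followed by the interpolated eigenlip coefficients; sequences of such rows are what an HMM-based identifier would be trained and tested on.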

Collections

Publications without Fulltext
