Publication: Joint audio-video processing for biometric speaker identification
| Field | Value | |
| --- | --- | --- |
| dc.conference.location | Baltimore, USA | |
| dc.conference.organizer | 4th International Conference on Multimedia and Expo (ICME 2003) | |
| dc.contributor.department | MVGL (Multimedia, Vision and Graphics Laboratory) | |
| dc.contributor.facultymember | Yes | |
| dc.contributor.kuauthor | Erzin, Engin | |
| dc.contributor.kuauthor | Kanak, Alper | |
| dc.contributor.kuauthor | Tekalp, Ahmet Murat | |
| dc.contributor.kuauthor | Yemez, Yücel | |
| dc.contributor.schoolcollegeinstitute | Laboratory | |
| dc.date | JUL 06-09, 2003 | |
| dc.date.accessioned | 2024-11-09T22:59:03Z | |
| dc.date.issued | 2003 | |
| dc.description.abstract | We present a bimodal audio-visual speaker identification system. The objective is to improve the recognition performance over conventional unimodal schemes. The proposed system exploits not only the temporal and spatial correlations existing in the speech and video signals of a speaker, but also the cross-correlation between these two modalities. Lip images extracted from each video frame are transformed onto an eigenspace. The obtained eigenlip coefficients are interpolated to match the rate of the speech signal and fused with Mel frequency cepstral coefficients (MFCC) of the corresponding speech signal. The resulting joint feature vectors are used to train and test a hidden Markov model (HMM) based identification system. Experimental results are included to demonstrate the system performance. | |
| dc.description.fulltext | No | |
| dc.description.harvestedfrom | Manual | |
| dc.description.indexedby | WOS | |
| dc.description.indexedby | Scopus | |
| dc.description.openaccess | NO | |
| dc.description.peerreviewstatus | N/A | |
| dc.description.publisherscope | International | |
| dc.description.readpublish | N/A | |
| dc.description.sponsoredbyTubitakEu | N/A | |
| dc.description.version | N/A | |
| dc.identifier.embargo | N/A | |
| dc.identifier.endpage | 564 | |
| dc.identifier.isbn | 0780379659 | |
| dc.identifier.quartile | N/A | |
| dc.identifier.scopus | 2-s2.0-0141590222 | |
| dc.identifier.startpage | 561 | |
| dc.identifier.uri | https://hdl.handle.net/20.500.14288/7831 | |
| dc.identifier.wos | 000185328600095 | |
| dc.keywords | Speech | |
| dc.keywords | Multimodal biometrics | |
| dc.keywords | Face recognition | |
| dc.language.iso | eng | |
| dc.publisher | Institute of Electrical and Electronics Engineers | |
| dc.relation.affiliation | Koç University | |
| dc.relation.collection | Koç University Institutional Repository | |
| dc.relation.ispartof | 2003 International Conference on Multimedia and Expo, Vol III, Proceedings | |
| dc.relation.openaccess | N/A | |
| dc.rights | N/A | |
| dc.subject | Computer science | |
| dc.subject | Artificial intelligence | |
| dc.subject | Engineering | |
| dc.subject | Electrical and electronic engineering | |
| dc.subject | Imaging science | |
| dc.subject | Photographic technology | |
| dc.title | Joint audio-video processing for biometric speaker identification | |
| dc.type | Conference Proceeding | |
| dspace.entity.type | Publication | |
| local.contributor.kuauthor | Kanak, Alper | |
| local.contributor.kuauthor | Erzin, Engin | |
| local.contributor.kuauthor | Yemez, Yücel | |
| local.contributor.kuauthor | Tekalp, Ahmet Murat | |
| relation.isOrgUnitOfPublication | cb6bbbf6-fd19-4052-b581-f591a9748d21 | |
| relation.isOrgUnitOfPublication.latestForDiscovery | cb6bbbf6-fd19-4052-b581-f591a9748d21 | |
| relation.isParentOrgUnitOfPublication | 20385dee-35e7-484b-8da6-ddcc08271d96 | |
| relation.isParentOrgUnitOfPublication.latestForDiscovery | 20385dee-35e7-484b-8da6-ddcc08271d96 |
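
The abstract describes interpolating the eigenlip coefficients to the rate of the speech features and fusing them with the MFCCs into joint feature vectors. A minimal sketch of that rate-matching and fusion step follows. It assumes linear interpolation and fusion by simple concatenation; the record does not state which interpolation or fusion scheme the authors used. The function names `interpolate_stream` and `fuse`, and the 25 Hz and 100 Hz rates in the example, are hypothetical and chosen only for illustration.

```python
def interpolate_stream(frames, src_rate, dst_rate, n_out):
    """Linearly resample a list of feature vectors from src_rate to dst_rate.

    Assumption: linear interpolation; the source only says "interpolated".
    """
    last = len(frames) - 1
    out = []
    for k in range(n_out):
        # Fractional position of output frame k in the source stream,
        # clamped so we never read past the final source frame.
        pos = min(k * src_rate / dst_rate, last)
        lo = int(pos)
        hi = min(lo + 1, last)
        w = pos - lo
        out.append([(1 - w) * a + w * b for a, b in zip(frames[lo], frames[hi])])
    return out


def fuse(audio_feats, visual_feats, audio_rate, video_rate):
    """Concatenate each audio feature vector with the rate-matched visual one.

    Assumption: fusion is plain concatenation of the two vectors.
    """
    upsampled = interpolate_stream(
        visual_feats, video_rate, audio_rate, len(audio_feats)
    )
    return [a + v for a, v in zip(audio_feats, upsampled)]


# Toy data: 5 one-dimensional "MFCC" frames at 100 Hz and
# 2 one-dimensional "eigenlip" frames at 25 Hz (hypothetical rates).
audio = [[0.0], [1.0], [2.0], [3.0], [4.0]]
visual = [[0.0], [4.0]]
joint = fuse(audio, visual, audio_rate=100, video_rate=25)
```

In a full system, the joint vectors produced this way would form the observation sequences used to train and test the HMM-based identifier; that stage is not sketched here.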
