Publication: Multimodal speaker identification using an adaptive classifier cascade based on modality reliability
dc.contributor.department | Department of Electrical and Electronics Engineering | |
dc.contributor.department | Department of Computer Engineering | |
dc.contributor.kuauthor | Erzin, Engin | |
dc.contributor.kuauthor | Tekalp, Ahmet Murat | |
dc.contributor.kuauthor | Yemez, Yücel | |
dc.contributor.schoolcollegeinstitute | College of Engineering | |
dc.date.accessioned | 2024-11-09T22:50:17Z | |
dc.date.issued | 2005 | |
dc.description.abstract | We present a multimodal open-set speaker identification system that integrates information from audio, face, and lip-motion modalities. For the fusion of multiple modalities, we propose a new adaptive cascade rule that favors reliable modality combinations through a cascade of classifiers. The order of the classifiers in the cascade is determined adaptively based on the reliability of each modality combination. A novel reliability measure that genuinely fits the open-set speaker identification problem is also proposed to assess the accept/reject decisions of a classifier. A formal framework based on the probability of correct decision is developed for analytical comparison of the proposed adaptive rule with other classifier combination rules. The proposed adaptive rule is more robust in the presence of unreliable modalities and outperforms the hard-level max rule and the soft-level weighted-summation rule, provided that the employed reliability measure is effective in assessing classifier decisions. Experimental results that support this assertion are provided. | |
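Editor's note: the sketch below illustrates the cascade idea summarized in the abstract, where classifiers (one per modality combination) are ordered by an estimated reliability and queried in turn until one accepts. All names (ModalityClassifier, decide, reliability, adaptive_cascade) are illustrative assumptions and do not reproduce the paper's actual reliability measure or decision rule.

```python
# Minimal sketch, assuming a per-sample reliability score per modality combination.
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class ModalityClassifier:
    name: str                                  # e.g. "audio", "face", "audio+lip"
    decide: Callable[[object], Optional[str]]  # returns a speaker id (accept) or None (reject)
    reliability: Callable[[object], float]     # higher value = more trustworthy decision

def adaptive_cascade(sample, classifiers: List[ModalityClassifier]) -> Optional[str]:
    """Order classifiers by estimated reliability for this sample, then
    accept the first confident decision; fall through to the next otherwise."""
    ordered = sorted(classifiers, key=lambda c: c.reliability(sample), reverse=True)
    for clf in ordered:
        speaker = clf.decide(sample)
        if speaker is not None:   # classifier accepts: stop the cascade
            return speaker
    return None                   # open set: no classifier accepted, so reject
```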
dc.description.indexedby | WOS | |
dc.description.indexedby | Scopus | |
dc.description.issue | 5 | |
dc.description.openaccess | YES | |
dc.description.publisherscope | International | |
dc.description.sponsoredbyTubitakEu | N/A | |
dc.description.volume | 7 | |
dc.identifier.doi | 10.1109/TMM.2005.854464 | |
dc.identifier.eissn | 1941-0077 | |
dc.identifier.issn | 1520-9210 | |
dc.identifier.quartile | Q1 | |
dc.identifier.scopus | 2-s2.0-26844533276 | |
dc.identifier.uri | https://doi.org/10.1109/TMM.2005.854464 | |
dc.identifier.uri | https://hdl.handle.net/20.500.14288/6649 | |
dc.identifier.wos | 232084900005 | |
dc.keywords | Classifier combining | |
dc.keywords | Modality reliability | |
dc.keywords | Multimodal speaker identification fusion | |
dc.keywords | Face | |
dc.keywords | Combination | |
dc.keywords | Information | |
dc.keywords | Speech | |
dc.keywords | Verification | |
dc.keywords | Recognition | |
dc.language.iso | eng | |
dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) | |
dc.relation.ispartof | IEEE Transactions on Multimedia | |
dc.subject | Computer science | |
dc.subject | Information systems | |
dc.subject | Engineering | |
dc.subject | Software engineering | |
dc.subject | Telecommunications | |
dc.title | Multimodal speaker identification using an adaptive classifier cascade based on modality reliability | |
dc.type | Journal Article | |
dspace.entity.type | Publication | |
local.contributor.kuauthor | Erzin, Engin | |
local.contributor.kuauthor | Yemez, Yücel | |
local.contributor.kuauthor | Tekalp, Ahmet Murat | |
local.publication.orgunit1 | College of Engineering | |
local.publication.orgunit2 | Department of Computer Engineering | |
local.publication.orgunit2 | Department of Electrical and Electronics Engineering | |
relation.isOrgUnitOfPublication | 21598063-a7c5-420d-91ba-0cc9b2db0ea0 | |
relation.isOrgUnitOfPublication | 89352e43-bf09-4ef4-82f6-6f9d0174ebae | |
relation.isOrgUnitOfPublication.latestForDiscovery | 21598063-a7c5-420d-91ba-0cc9b2db0ea0 | |
relation.isParentOrgUnitOfPublication | 8e756b23-2d4a-4ce8-b1b3-62c794a8c164 | |
relation.isParentOrgUnitOfPublication.latestForDiscovery | 8e756b23-2d4a-4ce8-b1b3-62c794a8c164 |