Publication: Emotion dependent domain adaptation for speech driven affective facial feature synthesis
dc.contributor.department | Department of Electrical and Electronics Engineering | |
dc.contributor.kuauthor | Erzin, Engin | |
dc.contributor.kuauthor | Sadiq, Rizwan | |
dc.contributor.kuprofile | Faculty Member | |
dc.contributor.other | Department of Electrical and Electronics Engineering | |
dc.contributor.researchcenter | Koç Üniversitesi İş Bankası Yapay Zeka Uygulama ve Araştırma Merkezi (KUIS AI)/ Koç University İş Bank Artificial Intelligence Center (KUIS AI) | |
dc.contributor.schoolcollegeinstitute | College of Engineering | |
dc.contributor.yokid | 34503 | |
dc.contributor.yokid | N/A | |
dc.date.accessioned | 2024-11-09T13:18:49Z | |
dc.date.issued | 2022 | |
dc.description.abstract | Although speech-driven facial animation has been studied extensively in the literature, works focusing on the affective content of speech are limited, mostly due to the scarcity of affective audio-visual data. In this article, we improve affective facial animation by using domain adaptation to partially alleviate this data scarcity. We first define a domain adaptation that maps affective and neutral speech representations to a common latent space in which the cross-domain bias is smaller. This adaptation is then used to augment affective representations for each emotion category (angry, disgust, fear, happy, sad, surprise, and neutral), so that emotion-dependent deep audio-to-visual (A2V) mapping models can be trained more effectively. Based on these emotion-dependent deep A2V models, the proposed affective facial synthesis system operates in two stages: first, speech emotion recognition extracts soft emotion category likelihoods for the utterance; then a soft fusion of the emotion-dependent A2V mapping outputs forms the affective facial synthesis. Experimental evaluations are performed on the SAVEE audio-visual dataset, with both objective and subjective assessments. The proposed affective A2V system achieves significant mean squared error (MSE) loss improvements over the recent literature, and its facial animations are preferred over the baseline animations in the subjective evaluations. | |
dc.description.fulltext | YES | |
dc.description.indexedby | WoS | |
dc.description.indexedby | Scopus | |
dc.description.issue | 3 | |
dc.description.openaccess | YES | |
dc.description.publisherscope | International | |
dc.description.sponsoredbyTubitakEu | TÜBİTAK | |
dc.description.sponsorship | Scientific and Technological Research Council of Turkey (TÜBİTAK) | |
dc.description.version | Author's final manuscript | |
dc.description.volume | 13 | |
dc.format | ||
dc.identifier.doi | 10.1109/TAFFC.2020.3008456 | |
dc.identifier.embargo | NO | |
dc.identifier.filenameinventoryno | IR04005 | |
dc.identifier.issn | 1949-3045 | |
dc.identifier.link | https://doi.org/10.1109/TAFFC.2020.3008456 | |
dc.identifier.quartile | Q1 | |
dc.identifier.scopus | 2-s2.0-85089294841 | |
dc.identifier.uri | https://hdl.handle.net/20.500.14288/3042 | |
dc.identifier.wos | 000849263500027 | |
dc.keywords | Facial animation | |
dc.keywords | Hidden Markov models | |
dc.keywords | Adaptation models | |
dc.keywords | Speech recognition | |
dc.keywords | Feature extraction | |
dc.keywords | Data models | |
dc.keywords | Speech driven facial animation | |
dc.keywords | Affective facial animation | |
dc.keywords | Domain adaptation | |
dc.keywords | Transfer learning | |
dc.language | English | |
dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) | |
dc.relation.grantno | 217E109 | |
dc.relation.uri | http://cdm21054.contentdm.oclc.org/cdm/ref/collection/IR/id/10891 | |
dc.source | IEEE Transactions on Affective Computing | |
dc.subject | Computer science | |
dc.subject | Artificial intelligence | |
dc.subject | Cybernetics | |
dc.title | Emotion dependent domain adaptation for speech driven affective facial feature synthesis | |
dc.type | Journal Article | |
dspace.entity.type | Publication | |
local.contributor.authorid | 0000-0002-2715-2368 | |
local.contributor.authorid | N/A | |
local.contributor.kuauthor | Erzin, Engin | |
local.contributor.kuauthor | Sadiq, Rizwan | |
relation.isOrgUnitOfPublication | 21598063-a7c5-420d-91ba-0cc9b2db0ea0 | |
relation.isOrgUnitOfPublication.latestForDiscovery | 21598063-a7c5-420d-91ba-0cc9b2db0ea0 |
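
The two-stage synthesis described in the abstract reduces to a likelihood-weighted mixture of per-emotion model outputs. Below is a minimal Python sketch of that soft fusion, assuming hypothetical stand-ins: an `emotion_posteriors` callable for the speech emotion recognition stage and a dict of per-emotion `a2v_models` for the emotion-dependent A2V mappings. These names and interfaces are illustrative only and do not come from the paper.

```python
import numpy as np

EMOTIONS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]

def soft_fusion(speech_features, a2v_models, emotion_posteriors):
    """Blend per-emotion A2V outputs using soft emotion likelihoods.

    speech_features: (T, D) acoustic feature array for one utterance.
    a2v_models: dict of emotion -> callable returning (T, F) facial features
        (stand-ins for the paper's emotion-dependent deep A2V models).
    emotion_posteriors: callable returning emotion -> probability for the
        utterance (stand-in for the speech emotion recognition stage).
    """
    weights = emotion_posteriors(speech_features)  # soft category likelihoods
    # Likelihood-weighted sum over the emotion-dependent mapping outputs.
    return sum(weights[e] * a2v_models[e](speech_features) for e in EMOTIONS)

# Toy usage with dummy models: each "model" is a fixed linear map.
rng = np.random.default_rng(0)
projections = {e: rng.normal(size=(13, 5)) for e in EMOTIONS}   # D=13 -> F=5
models = {e: (lambda x, W=W: x @ W) for e, W in projections.items()}
uniform = lambda x: {e: 1.0 / len(EMOTIONS) for e in EMOTIONS}  # flat posterior
features = rng.normal(size=(100, 13))                           # T=100 frames
fused = soft_fusion(features, models, uniform)
print(fused.shape)  # (100, 5) fused facial feature trajectory
```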