Publication: Emotion dependent domain adaptation for speech driven affective facial feature synthesis
dc.contributor.department | Department of Electrical and Electronics Engineering | |
dc.contributor.kuauthor | Erzin, Engin | |
dc.contributor.kuauthor | Sadiq, Rizwan | |
dc.contributor.kuprofile | Faculty Member | |
dc.contributor.other | Department of Electrical and Electronics Engineering | |
dc.contributor.researchcenter | Koç Üniversitesi İş Bankası Yapay Zeka Uygulama ve Araştırma Merkezi (KUIS AI)/ Koç University İş Bank Artificial Intelligence Center (KUIS AI) | |
dc.contributor.schoolcollegeinstitute | College of Engineering | |
dc.contributor.yokid | 34503 | |
dc.contributor.yokid | N/A | |
dc.date.accessioned | 2024-11-09T13:18:49Z | |
dc.date.issued | 2022 | |
dc.description.abstract | Although speech-driven facial animation has been studied extensively in the literature, works focusing on the affective content of speech are limited, mostly due to the scarcity of affective audio-visual data. In this article, we improve affective facial animation by using domain adaptation to partially alleviate this data scarcity. We first define a domain adaptation that maps affective and neutral speech representations to a common latent space in which the cross-domain bias is smaller. This adaptation is then used to augment affective representations for each emotion category (angry, disgust, fear, happy, sad, surprise, and neutral), so that emotion-dependent deep audio-to-visual (A2V) mapping models can be trained more effectively. Based on these emotion-dependent deep A2V models, the proposed affective facial synthesis system operates in two stages: first, speech emotion recognition extracts soft emotion category likelihoods for the utterance; then a soft fusion of the emotion-dependent A2V mapping outputs forms the affective facial synthesis. Experimental evaluations are performed on the SAVEE audio-visual dataset, with both objective and subjective assessments. The proposed affective A2V system achieves significant mean squared error (MSE) loss improvements over the recent literature, and its facial animations are preferred over the baseline animations in the subjective evaluations. | |
dc.description.fulltext | YES | |
dc.description.indexedby | WoS | |
dc.description.indexedby | Scopus | |
dc.description.issue | 3 | |
dc.description.openaccess | YES | |
dc.description.publisherscope | International | |
dc.description.sponsoredbyTubitakEu | TÜBİTAK | |
dc.description.sponsorship | Scientific and Technological Research Council of Turkey (TÜBİTAK) | |
dc.description.version | Author's final manuscript | |
dc.description.volume | 13 | |
dc.format | ||
dc.identifier.doi | 10.1109/TAFFC.2020.3008456 | |
dc.identifier.embargo | NO | |
dc.identifier.filenameinventoryno | IR04005 | |
dc.identifier.issn | 1949-3045 | |
dc.identifier.link | https://doi.org/10.1109/TAFFC.2020.3008456 | |
dc.identifier.quartile | Q1 | |
dc.identifier.scopus | 2-s2.0-85089294841 | |
dc.identifier.uri | https://hdl.handle.net/20.500.14288/3042 | |
dc.identifier.wos | 000849263500027 | |
dc.keywords | Facial animation | |
dc.keywords | Hidden Markov models | |
dc.keywords | Adaptation models | |
dc.keywords | Speech recognition | |
dc.keywords | Feature extraction | |
dc.keywords | Data models | |
dc.keywords | Speech driven facial animation | |
dc.keywords | Affective facial animation | |
dc.keywords | Domain adaptation | |
dc.keywords | Transfer learning | |
dc.language | English | |
dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) | |
dc.relation.grantno | 217E109 | |
dc.relation.uri | http://cdm21054.contentdm.oclc.org/cdm/ref/collection/IR/id/10891 | |
dc.source | IEEE Transactions on Affective Computing | |
dc.subject | Computer science | |
dc.subject | Artificial intelligence | |
dc.subject | Cybernetics | |
dc.title | Emotion dependent domain adaptation for speech driven affective facial feature synthesis | |
dc.type | Journal Article | |
dspace.entity.type | Publication | |
local.contributor.authorid | 0000-0002-2715-2368 | |
local.contributor.authorid | N/A | |
local.contributor.kuauthor | Erzin, Engin | |
local.contributor.kuauthor | Sadiq, Rizwan | |
relation.isOrgUnitOfPublication | 21598063-a7c5-420d-91ba-0cc9b2db0ea0 | |
relation.isOrgUnitOfPublication.latestForDiscovery | 21598063-a7c5-420d-91ba-0cc9b2db0ea0 |
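
The two-stage synthesis described in the abstract reduces to a likelihood-weighted mixture of per-emotion model outputs. Below is a minimal Python sketch of that soft fusion, assuming hypothetical stand-ins: an `emotion_posteriors` callable for the speech emotion recognition stage and a dict of per-emotion `a2v_models` for the emotion-dependent A2V mappings. These names and interfaces are illustrative only and do not come from the paper.

```python
import numpy as np

EMOTIONS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]

def soft_fusion(speech_features, a2v_models, emotion_posteriors):
    """Blend per-emotion A2V outputs using soft emotion likelihoods.

    speech_features: (T, D) acoustic feature array for one utterance.
    a2v_models: dict of emotion -> callable returning (T, F) facial features
        (stand-ins for the paper's emotion-dependent deep A2V models).
    emotion_posteriors: callable returning emotion -> probability for the
        utterance (stand-in for the speech emotion recognition stage).
    """
    weights = emotion_posteriors(speech_features)  # soft category likelihoods
    # Likelihood-weighted sum over the emotion-dependent mapping outputs.
    return sum(weights[e] * a2v_models[e](speech_features) for e in EMOTIONS)

# Toy usage with dummy models: each "model" is a fixed linear map.
rng = np.random.default_rng(0)
projections = {e: rng.normal(size=(13, 5)) for e in EMOTIONS}   # D=13 -> F=5
models = {e: (lambda x, W=W: x @ W) for e, W in projections.items()}
uniform = lambda x: {e: 1.0 / len(EMOTIONS) for e in EMOTIONS}  # flat posterior
features = rng.normal(size=(100, 13))                           # T=100 frames
fused = soft_fusion(features, models, uniform)
print(fused.shape)  # (100, 5) fused facial feature trajectory
```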