Emotion dependent domain adaptation for speech driven affective facial feature synthesis

Files

10891.pdf (1.46 MB)

Departments

Department of Electrical and Electronics Engineering

KUIS AI (Koç University & İş Bank Artificial Intelligence Center)

School / College / Institute

College of Engineering

Research Center

KU-Authors

Erzin, Engin

Sadiq, Rizwan

Date

2022

Type

Journal Article

Embargo Status

NO

Abstract

Although speech-driven facial animation has been studied extensively in the literature, few works focus on the affective content of speech, mostly because affective audio-visual data is scarce. In this article, we improve affective facial animation by using domain adaptation to partially alleviate this data scarcity. We first define a domain adaptation that maps affective and neutral speech representations to a common latent space in which cross-domain bias is smaller. The domain adaptation is then used to augment affective representations for each emotion category (angry, disgust, fear, happy, sad, surprise, and neutral), so that we can better train emotion-dependent deep audio-to-visual (A2V) mapping models. Based on these emotion-dependent deep A2V models, the proposed affective facial synthesis system is realized in two stages: first, speech emotion recognition extracts soft emotion category likelihoods for the utterances; then, a soft fusion of the emotion-dependent A2V mapping outputs forms the affective facial synthesis. Experimental evaluations are performed on the SAVEE audio-visual dataset. The proposed models are assessed with objective and subjective evaluations. The proposed affective A2V system achieves significant MSE loss improvements in comparison to the recent literature. Furthermore, the resulting facial animations of the proposed system are preferred over the baseline animations in the subjective evaluations.
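The two-stage idea in the abstract (soft emotion likelihoods, then a soft fusion of emotion-dependent A2V outputs) can be illustrated with a minimal sketch. The abstract does not state the exact fusion rule, so the likelihood-weighted average below is an assumption, and the function name `soft_fusion`, the dictionary layout, and the toy feature vectors are hypothetical rather than taken from the article.

```python
from typing import Dict, List

# Emotion categories as listed in the abstract.
EMOTIONS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]


def soft_fusion(likelihoods: Dict[str, float],
                a2v_outputs: Dict[str, List[float]]) -> List[float]:
    """Fuse per-emotion facial feature vectors by likelihood weighting.

    ASSUMPTION: the fusion is a normalized weighted sum; the article
    may use a different rule.
    """
    total = sum(likelihoods[e] for e in EMOTIONS)
    if total <= 0:
        raise ValueError("likelihoods must sum to a positive value")
    dim = len(a2v_outputs[EMOTIONS[0]])
    fused = [0.0] * dim
    for emotion in EMOTIONS:
        output = a2v_outputs[emotion]
        if len(output) != dim:
            raise ValueError("all A2V outputs must share one dimension")
        weight = likelihoods[emotion] / total  # normalize to sum to 1
        for i, value in enumerate(output):
            fused[i] += weight * value
    return fused


# Toy usage: hypothetical 2-D facial features for one frame.
outputs = {e: [0.0, 0.0] for e in EMOTIONS}
outputs["happy"] = [2.0, 4.0]
probs = {e: 0.0 for e in EMOTIONS}
probs["happy"], probs["sad"] = 0.5, 0.5
fused_frame = soft_fusion(probs, outputs)
```

With a one-hot likelihood the fusion reduces to the output of a single emotion-dependent model, which is the hard-decision special case of the same scheme.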

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Subject

Computer science, Artificial intelligence, Cybernetics

Source

IEEE Transactions on Affective Computing

DOI

10.1109/TAFFC.2020.3008456

URI

https://hdl.handle.net/20.500.14288/3042

Collections

Publications with Fulltext
