Multimodal speech driven facial shape animation using deep neural networks

Departments

Department of Computer Engineering

Graduate School of Sciences and Engineering

School / College / Institute

College of Engineering

Graduate School of Sciences and Engineering

KU-Authors

Asadiabadi, Sasan

Erzin, Engin

Sadiq, Rizwan

Date

2018

Type

Conference Proceeding

Embargo Status

N/A

Abstract

In this paper, we present a multimodal deep learning approach for speech-driven generation of face animations. Training a speaker-independent model that can generate different emotions of the speaker is crucial for realistic animations. Unlike previous approaches, which use either acoustic features or phoneme label features to estimate facial movements, we use both modalities to generate natural-looking, speaker-independent lip animations synchronized with affective speech. A phoneme-based model enables generation of speaker-independent animation, whereas an acoustic feature-based model captures affective variation during animation generation. We show that our multimodal approach not only performs significantly better on affective data but also improves performance on neutral data. We evaluate the proposed multimodal speech-driven animation model on two large-scale datasets, GRID and SAVEE, by reporting the mean squared error (MSE) over various network structures.
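
As a rough illustration of the two ideas named in the abstract, the sketch below joins per-frame acoustic features with a phoneme label and computes the MSE between predicted and reference facial shape parameters. It is not the authors' implementation: the phoneme inventory, the one-hot encoding, the early-fusion (concatenation) choice, and all function names are assumptions made for this example, since the abstract does not specify them.

```python
from typing import List, Sequence

# Hypothetical phoneme inventory; the label set used in the paper is not
# given in the abstract.
PHONEMES = ["sil", "aa", "b", "m"]


def one_hot(phoneme: str, inventory: Sequence[str] = PHONEMES) -> List[float]:
    """Encode a phoneme label as a one-hot vector over the inventory."""
    if phoneme not in inventory:
        raise ValueError(f"unknown phoneme: {phoneme!r}")
    return [1.0 if p == phoneme else 0.0 for p in inventory]


def fuse_frame(acoustic: Sequence[float], phoneme: str) -> List[float]:
    """Build one multimodal input frame.

    Assumption: simple early fusion, i.e. the acoustic feature vector is
    concatenated with the one-hot phoneme label for the same frame.
    """
    return list(acoustic) + one_hot(phoneme)


def mean_squared_error(
    predicted: Sequence[Sequence[float]],
    reference: Sequence[Sequence[float]],
) -> float:
    """MSE averaged over all frames and all shape parameters."""
    if len(predicted) != len(reference):
        raise ValueError("frame counts differ")
    total, count = 0.0, 0
    for p_frame, r_frame in zip(predicted, reference):
        if len(p_frame) != len(r_frame):
            raise ValueError("shape parameter counts differ")
        for p, r in zip(p_frame, r_frame):
            total += (p - r) ** 2
            count += 1
    if count == 0:
        raise ValueError("no values to compare")
    return total / count


# Toy values, not data from GRID or SAVEE.
frame = fuse_frame([0.5, -1.0], "b")
error = mean_squared_error([[1.0, 2.0], [3.0, 4.0]], [[1.0, 2.0], [3.0, 6.0]])
```

In this toy setup, a network would take frames like `frame` as input and predict shape parameters that are then scored with `mean_squared_error` against the reference animation.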

Publisher

IEEE

Subject

Engineering, Electrical and electronic engineering

Source

2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)

URI

https://hdl.handle.net/20.500.14288/9884

Rights

N/A

Collections

Publications without Fulltext
