Publication:
Speech-driven automatic facial expression synthesis

dc.contributor.coauthor: Bozkurt, Elif
dc.contributor.coauthor: Erdem, Cigdem Eroglu
dc.contributor.coauthor: Erdem, Tanju
dc.contributor.coauthor: Oezkan, Mehmet
dc.contributor.department: Department of Electrical and Electronics Engineering
dc.contributor.department: Department of Computer Engineering
dc.contributor.kuauthor: Erzin, Engin
dc.contributor.kuauthor: Tekalp, Ahmet Murat
dc.contributor.schoolcollegeinstitute: College of Engineering
dc.date.accessioned: 2024-11-09T22:56:47Z
dc.date.issued: 2008
dc.description.abstract: This paper focuses on the problem of automatically generating speech-synchronized facial expressions for 3D talking heads. The proposed system is speaker and language independent. We parameterize speech data with prosody-related features and spectral features, together with their first- and second-order derivatives. We then classify the seven emotions in the dataset with two different classifiers: Gaussian mixture models (GMMs) and hidden Markov models (HMMs). The probability density function of the spectral feature space is modeled with a GMM for each emotion, while temporal patterns of the emotion-dependent prosody contours are modeled with an HMM-based classifier. We use the Berlin Emotional Speech dataset (EMO-DB) [1] in the experiments. The GMM classifier achieves the best overall recognition rate, 82.85%, when cepstral features with delta and acceleration coefficients are used. The HMM-based classifier has lower recognition rates than the GMM-based classifier; however, fusion of the two classifiers reaches an average recognition rate of 83.80%. Experimental results on automatic facial expression synthesis are encouraging.
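The classification scheme described in the abstract — one GMM per emotion over spectral features, with late fusion of a second classifier's scores — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the emotion labels, feature dimensionality, synthetic data, and the stand-in scores for the second (HMM-style) classifier are all assumptions for demonstration.

```python
# Sketch of per-emotion GMM classification with late score fusion.
# All data here is synthetic; the real system uses cepstral features
# with delta/acceleration coefficients extracted from speech.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
emotions = ["anger", "joy", "neutral"]  # illustrative subset of EMO-DB's seven

# Synthetic "spectral" feature vectors: one well-separated cluster per emotion.
train = {e: rng.normal(loc=3.0 * i, scale=1.0, size=(200, 13))
         for i, e in enumerate(emotions)}

# One GMM per emotion models that emotion's feature distribution.
gmms = {e: GaussianMixture(n_components=2, random_state=0).fit(x)
        for e, x in train.items()}

def gmm_scores(frames):
    """Average per-frame log-likelihood under each emotion's GMM."""
    return np.array([gmms[e].score(frames) for e in emotions])

def fuse(scores_a, scores_b, w=0.5):
    """Late fusion: weighted sum of z-normalized classifier scores."""
    z = lambda s: (s - s.mean()) / (s.std() + 1e-9)
    return w * z(scores_a) + (1.0 - w) * z(scores_b)

# Classify a held-out utterance drawn from the "joy" cluster.
utterance = rng.normal(loc=3.0, scale=1.0, size=(50, 13))
scores = gmm_scores(utterance)
predicted = emotions[int(np.argmax(scores))]
print(predicted)  # -> joy

# Hypothetical scores from a second classifier (e.g., an HMM over
# prosody contours), here just a stand-in vector for the fusion step.
hmm_scores = np.array([-2.0, 1.5, -1.0])
fused = fuse(scores, hmm_scores)
print(emotions[int(np.argmax(fused))])  # -> joy
```

The fusion step combines evidence from complementary feature streams (spectral vs. prosodic), which is how the paper's fused system edges out the GMM classifier alone (83.80% vs. 82.85%).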
dc.description.indexedby: WOS
dc.description.indexedby: Scopus
dc.description.openaccess: NO
dc.description.publisherscope: International
dc.description.sponsoredbyTubitakEu: N/A
dc.description.sponsorship: EC within FP6 [511568]. This work is supported by the EC within FP6 under Grant 511568 with the acronym 3DTV.
dc.identifier.isbn: 978-1-4244-1760-5
dc.identifier.issn: 2161-2021
dc.identifier.scopus: 2-s2.0-51149092366
dc.identifier.uri: https://hdl.handle.net/20.500.14288/7443
dc.identifier.wos: 258372100064
dc.keywords: Emotion recognition
dc.keywords: Facial expression synthesis
dc.keywords: Classifier fusion
dc.language.iso: eng
dc.publisher: IEEE
dc.relation.ispartof: 2008 3DTV-Conference: The True Vision - Capture, Transmission and Display of 3D Video
dc.subject: Engineering
dc.subject: Electrical electronic engineering
dc.subject: Imaging science
dc.subject: Photographic technology
dc.title: Speech-driven automatic facial expression synthesis
dc.type: Conference Proceeding
dspace.entity.type: Publication
local.contributor.kuauthor: Erzin, Engin
local.contributor.kuauthor: Tekalp, Ahmet Murat
local.publication.orgunit1: College of Engineering
local.publication.orgunit2: Department of Computer Engineering
local.publication.orgunit2: Department of Electrical and Electronics Engineering
relation.isOrgUnitOfPublication: 21598063-a7c5-420d-91ba-0cc9b2db0ea0
relation.isOrgUnitOfPublication: 89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isOrgUnitOfPublication.latestForDiscovery: 21598063-a7c5-420d-91ba-0cc9b2db0ea0
relation.isParentOrgUnitOfPublication: 8e756b23-2d4a-4ce8-b1b3-62c794a8c164
relation.isParentOrgUnitOfPublication.latestForDiscovery: 8e756b23-2d4a-4ce8-b1b3-62c794a8c164
