Publication:
Affective synthesis and animation of arm gestures from speech prosody

dc.contributor.department	Department of Computer Engineering
dc.contributor.department	Graduate School of Sciences and Engineering
dc.contributor.kuauthor	Bozkurt, Elif
dc.contributor.kuauthor	Erzin, Engin
dc.contributor.kuauthor	Yemez, Yücel
dc.contributor.schoolcollegeinstitute	College of Engineering
dc.contributor.schoolcollegeinstitute	GRADUATE SCHOOL OF SCIENCES AND ENGINEERING
dc.date.accessioned	2024-11-09T23:05:35Z
dc.date.issued	2020
dc.description.abstract	In human-to-human communication, speech signals carry rich emotional cues that are further emphasized by affect-expressive gestures. In this regard, automatic synthesis and animation of gestures accompanying affective verbal communication can help to create more naturalistic virtual agents in human-computer interaction systems. Speech-driven gesture synthesis can map emotional cues of the speech signal to affect-expressive gestures by modeling the complex variability and timing relationships of speech and gesture. In this paper, we investigate the use of continuous affect attributes, namely activation, valence, and dominance, for speech-driven affective synthesis and animation of arm gestures. To this end, we present a statistical framework based on hidden semi-Markov models (HSMM), where states are gestures and observations are speech prosody and continuous affect attributes. The proposed framework is evaluated considering four distinct HSMM structures, which differ by their emission distributions. Evaluations are performed over the USC CreativeIT database in a speaker-independent setup. Among the four statistical structures, the conditional structure, which models observation distributions as prosody given affect, achieves the best performance under both objective and subjective evaluations.
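The "conditional" emission structure mentioned in the abstract can be illustrated with a small sketch. The Python snippet below is a minimal, illustrative sketch and not the paper's implementation: it assumes a per-gesture-state emission that factorizes as p(prosody, affect | state) = p(affect | state) * p(prosody | affect, state), with a Gaussian over the affect attributes (activation, valence, dominance) and a linear-Gaussian regression of prosody on affect. The class name ConditionalEmission, the feature dimensions, and the Gaussian parameterization are assumptions made here for illustration only.

# Minimal sketch (not the paper's implementation) of a "prosody given affect"
# conditional emission for one HSMM gesture state. All names, dimensions, and
# parameter choices below are illustrative assumptions.

import numpy as np
from scipy.stats import multivariate_normal


class ConditionalEmission:
    """Per-state emission: Gaussian affect model + linear-Gaussian prosody|affect."""

    def __init__(self, affect_dim=3, prosody_dim=4, rng=None):
        rng = np.random.default_rng(rng)
        # p(affect | state): Gaussian over (activation, valence, dominance)
        self.affect_mean = rng.normal(size=affect_dim)
        self.affect_cov = np.eye(affect_dim)
        # p(prosody | affect, state): prosody ~ N(W @ affect + b, R)
        self.W = rng.normal(scale=0.1, size=(prosody_dim, affect_dim))
        self.b = rng.normal(size=prosody_dim)
        self.R = np.eye(prosody_dim)

    def log_likelihood(self, prosody, affect):
        """Log p(prosody, affect | state) for a single frame."""
        log_p_affect = multivariate_normal.logpdf(
            affect, mean=self.affect_mean, cov=self.affect_cov)
        log_p_prosody = multivariate_normal.logpdf(
            prosody, mean=self.W @ affect + self.b, cov=self.R)
        return log_p_affect + log_p_prosody


if __name__ == "__main__":
    # Toy usage: score one frame of prosody + affect under two gesture states.
    rng = np.random.default_rng(0)
    states = [ConditionalEmission(rng=seed) for seed in range(2)]
    affect = rng.normal(size=3)    # (activation, valence, dominance)
    prosody = rng.normal(size=4)   # e.g. pitch/intensity-style features
    print([s.log_likelihood(prosody, affect) for s in states])

In a full hidden semi-Markov model these per-state log-likelihoods would be combined with state-duration and transition models; the sketch only shows how the joint observation probability can be factored as prosody conditioned on affect.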
dc.description.indexedby	WOS
dc.description.indexedby	Scopus
dc.description.openaccess	NO
dc.description.sponsoredbyTubitakEu	N/A
dc.description.sponsorship	TUBITAK [113E102] This work was supported by TUBITAK under grant number 113E102.
dc.description.volume	119
dc.identifier.doi	10.1016/j.specom.2020.02.005
dc.identifier.eissn	1872-7182
dc.identifier.issn	0167-6393
dc.identifier.scopus	2-s2.0-85080918361
dc.identifier.uri	https://doi.org/10.1016/j.specom.2020.02.005
dc.identifier.uri	https://hdl.handle.net/20.500.14288/8837
dc.identifier.wos	531017100001
dc.keywords	Prosody analysis
dc.keywords	Gesture segmentation
dc.keywords	Arm gesture animation
dc.keywords	Unit selection
dc.keywords	Hidden semi-Markov models
dc.keywords	Speech-driven affective gesture synthesis
dc.keywords	Body movement
dc.keywords	Expression
dc.keywords	Dynamics
dc.keywords	Emotion
dc.language.iso	eng
dc.publisher	Elsevier
dc.relation.ispartof	Speech Communication
dc.subject	Acoustics
dc.subject	Computer science
dc.title	Affective synthesis and animation of arm gestures from speech prosody
dc.type	Journal Article
dspace.entity.type	Publication
local.contributor.kuauthor	Bozkurt, Elif
local.contributor.kuauthor	Erzin, Engin
local.contributor.kuauthor	Yemez, Yücel
local.publication.orgunit1	GRADUATE SCHOOL OF SCIENCES AND ENGINEERING
local.publication.orgunit1	College of Engineering
local.publication.orgunit2	Department of Computer Engineering
local.publication.orgunit2	Graduate School of Sciences and Engineering
relation.isOrgUnitOfPublication	89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isOrgUnitOfPublication	3fc31c89-e803-4eb1-af6b-6258bc42c3d8
relation.isOrgUnitOfPublication.latestForDiscovery	89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isParentOrgUnitOfPublication	8e756b23-2d4a-4ce8-b1b3-62c794a8c164
relation.isParentOrgUnitOfPublication	434c9663-2b11-4e66-9399-c863e2ebae43
relation.isParentOrgUnitOfPublication.latestForDiscovery	8e756b23-2d4a-4ce8-b1b3-62c794a8c164
