Publication:
Speech driven 3D head gesture synthesis

dc.contributor.coauthorErdem, Tanju A.
dc.contributor.departmentDepartment of Computer Engineering
dc.contributor.departmentDepartment of Computer Engineering
dc.contributor.departmentN/A
dc.contributor.departmentDepartment of Electrical and Electronics Engineering
dc.contributor.departmentDepartment of Computer Engineering
dc.contributor.departmentDepartment of Electrical and Electronics Engineering
dc.contributor.kuauthorYemez, Yücel
dc.contributor.kuauthorErzin, Engin
dc.contributor.kuauthorSargın, Mehmet Emre
dc.contributor.kuauthorTekalp, Ahmet Murat
dc.contributor.kuprofileFaculty Member
dc.contributor.kuprofileFaculty Member
dc.contributor.kuprofileMaster Student
dc.contributor.kuprofileFaculty Member
dc.contributor.schoolcollegeinstituteCollege of Engineering
dc.contributor.schoolcollegeinstituteCollege of Engineering
dc.contributor.schoolcollegeinstituteGraduate School of Sciences and Engineering
dc.contributor.schoolcollegeinstituteCollege of Engineering
dc.contributor.yokid107907
dc.contributor.yokid34503
dc.contributor.yokidN/A
dc.contributor.yokid26207
dc.date.accessioned2024-11-09T23:20:38Z
dc.date.issued2006
dc.description.abstractIn this paper, we present a speech-driven system for the analysis and synthesis of natural head gestures. The proposed system assumes that sharp head movements are correlated with prominence in speech. For analysis, a binocular camera system captures the head motion of a talking person. The motion parameters associated with the 3D head motion are then used to extract repetitive head gestures. In parallel, prosodic events are detected using an HMM structure with pitch, formant frequencies, and speech intensity as audio features. For synthesis, the head motion parameters are estimated from the prosodic events using a gesture-speech correlation model, and the associated Euler angles then drive the animation of a personalized 3D talking head model. Results on head motion feature extraction, prosodic event detection, and correlation modelling are provided.
dc.description.indexedbyWoS
dc.description.indexedbyScopus
dc.description.openaccessYES
dc.description.publisherscopeInternational
dc.description.sponsoredbyTubitakEuN/A
dc.description.volume2006
dc.identifier.doi10.1109/SIU.2006.1659683
dc.identifier.isbn1-4244-0239-5
dc.identifier.isbn978-1-4244-0239-7
dc.identifier.linkhttps://www.scopus.com/inward/record.uri?eid=2-s2.0-34247107135&doi=10.1109%2fSIU.2006.1659683&partnerID=40&md5=788651926aa81cc5b0256be723f4ce80
dc.identifier.quartileN/A
dc.identifier.scopus2-s2.0-34247107135
dc.identifier.urihttps://IEEExplore.IEEE.org/stamp/stamp.jsp?tp=&arnumber=1659683
dc.identifier.urihttps://hdl.handle.net/20.500.14288/10757
dc.identifier.wos245347800061
dc.keywordsBinocular vision
dc.keywordsCameras
dc.keywordsCorrelation theory
dc.keywordsFeature extraction
dc.keywordsSpeech synthesis
dc.keywordsThree dimensional
dc.keywordsBinocular camera system
dc.keywordsEuler angles
dc.keywordsFormant frequencies
dc.keywordsSpeech intensity
dc.keywordsGesture recognition
dc.languageTurkish
dc.publisherInstitute of Electrical and Electronics Engineers (IEEE)
dc.source2006 IEEE 14th Signal Processing and Communications Applications Conference
dc.subjectEngineering
dc.titleSpeech driven 3D head gesture synthesis
dc.title.alternativeKonuşma ile sürülen kafa jesti analizi ve sentezi
dc.typeConference proceeding
dspace.entity.typePublication
local.contributor.authorid0000-0002-7515-3138
local.contributor.authorid0000-0002-2715-2368
local.contributor.authoridN/A
local.contributor.authorid0000-0003-1465-8121
local.contributor.kuauthorYemez, Yücel
local.contributor.kuauthorErzin, Engin
local.contributor.kuauthorSargın, Mehmet Emre
local.contributor.kuauthorTekalp, Ahmet Murat
relation.isOrgUnitOfPublication89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isOrgUnitOfPublication21598063-a7c5-420d-91ba-0cc9b2db0ea0
relation.isOrgUnitOfPublication.latestForDiscovery21598063-a7c5-420d-91ba-0cc9b2db0ea0
