Publication:
Audio-visual prediction of head-nod and turn-taking events in dyadic interactions

dc.contributor.departmentDepartment of Computer Engineering
dc.contributor.departmentGraduate School of Sciences and Engineering
dc.contributor.kuauthorErzin, Engin
dc.contributor.kuauthorSezgin, Tevfik Metin
dc.contributor.kuauthorTürker, Bekir Berker
dc.contributor.kuauthorYemez, Yücel
dc.contributor.schoolcollegeinstituteCollege of Engineering
dc.contributor.schoolcollegeinstituteGRADUATE SCHOOL OF SCIENCES AND ENGINEERING
dc.date.accessioned2024-11-09T23:51:28Z
dc.date.issued2018
dc.description.abstractHead-nods and turn-taking both significantly contribute to conversational dynamics in dyadic interactions. Timely prediction and use of these events are quite valuable for dialog management systems in human-robot interaction. In this study, we present an audio-visual prediction framework for head-nod and turn-taking events that can also be utilized in real-time systems. Prediction systems based on Support Vector Machines (SVM) and Long Short-Term Memory Recurrent Neural Networks (LSTM-RNN) are trained on human-human conversational data. Unimodal and multimodal classification performances of head-nod and turn-taking events are reported over the IEMOCAP dataset.
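
To make the abstract's setup concrete, the following is a minimal illustrative sketch of a sequence-level event classifier in PyTorch. It is not taken from the paper or this record: the feature dimension, window length, class set, and all names are assumptions for illustration only.

# Illustrative sketch only; dimensions and event set are assumed, not from the paper.
import torch
import torch.nn as nn

class EventPredictor(nn.Module):
    """LSTM over fused audio-visual feature frames, predicting which
    event (e.g., head-nod, turn-taking, none) follows the observed window."""
    def __init__(self, feat_dim=40, hidden=64, n_events=3):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_events)

    def forward(self, x):            # x: (batch, frames, feat_dim)
        _, (h, _) = self.lstm(x)     # h: (1, batch, hidden), final hidden state
        return self.head(h[-1])      # per-sequence event logits

# Hypothetical usage: a batch of 8 windows, each 100 fused audio-visual frames.
model = EventPredictor()
logits = model(torch.randn(8, 100, 40))   # shape: (8, 3)

An SVM baseline, as the abstract mentions, would instead operate on a fixed-length summary (e.g., statistics pooled over the window) rather than the frame sequence itself.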
dc.description.indexedbyWOS
dc.description.indexedbyScopus
dc.description.openaccessNO
dc.description.publisherscopeInternational
dc.description.sponsoredbyTubitakEuTÜBİTAK
dc.description.sponsorshipThis work is supported by the Turkish Scientific and Technical Research Council (TÜBİTAK) under grant numbers 113E324 and 217E040.
dc.identifier.doi10.21437/interspeech.2018-2215
dc.identifier.isbn978-1-5108-7221-9
dc.identifier.issn2308-457X
dc.identifier.quartileN/A
dc.identifier.scopus2-s2.0-85054959957
dc.identifier.urihttps://doi.org/10.21437/interspeech.2018-2215
dc.identifier.urihttps://hdl.handle.net/20.500.14288/14715
dc.identifier.wos465363900364
dc.keywordsHead-nod
dc.keywordsTurn-taking
dc.keywordsSocial signals
dc.keywordsEvent prediction
dc.keywordsDyadic conversations
dc.keywordsHuman-robot interaction
dc.language.isoeng
dc.publisherISCA - International Speech Communication Association
dc.relation.ispartof19th Annual Conference of the International Speech Communication Association (Interspeech 2018), Vols 1-6: Speech Research for Emerging Markets in Multilingual Societies
dc.subjectComputer Science
dc.subjectArtificial Intelligence
dc.subjectElectrical and Electronics Engineering
dc.titleAudio-visual prediction of head-nod and turn-taking events in dyadic interactions
dc.typeConference Proceeding
dspace.entity.typePublication
local.contributor.kuauthorTürker, Bekir Berker
local.contributor.kuauthorErzin, Engin
local.contributor.kuauthorYemez, Yücel
local.contributor.kuauthorSezgin, Tevfik Metin
local.publication.orgunit1GRADUATE SCHOOL OF SCIENCES AND ENGINEERING
local.publication.orgunit1College of Engineering
local.publication.orgunit2Department of Computer Engineering
local.publication.orgunit2Graduate School of Sciences and Engineering
relation.isOrgUnitOfPublication89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isOrgUnitOfPublication3fc31c89-e803-4eb1-af6b-6258bc42c3d8
relation.isOrgUnitOfPublication.latestForDiscovery89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isParentOrgUnitOfPublication8e756b23-2d4a-4ce8-b1b3-62c794a8c164
relation.isParentOrgUnitOfPublication434c9663-2b11-4e66-9399-c863e2ebae43
relation.isParentOrgUnitOfPublication.latestForDiscovery8e756b23-2d4a-4ce8-b1b3-62c794a8c164