Publication: Audio-visual prediction of head-nod and turn-taking events in dyadic interactions
dc.contributor.department | Department of Computer Engineering | |
dc.contributor.department | Graduate School of Sciences and Engineering | |
dc.contributor.kuauthor | Erzin, Engin | |
dc.contributor.kuauthor | Sezgin, Tevfik Metin | |
dc.contributor.kuauthor | Türker, Bekir Berker | |
dc.contributor.kuauthor | Yemez, Yücel | |
dc.contributor.schoolcollegeinstitute | College of Engineering | |
dc.contributor.schoolcollegeinstitute | GRADUATE SCHOOL OF SCIENCES AND ENGINEERING | |
dc.date.accessioned | 2024-11-09T23:51:28Z | |
dc.date.issued | 2018 | |
dc.description.abstract | Head-nods and turn-taking both significantly contribute to conversational dynamics in dyadic interactions. Timely prediction and use of these events is quite valuable for dialog management systems in human-robot interaction. In this study, we present an audio-visual prediction framework for head-nod and turn-taking events that can also be utilized in real-time systems. Prediction systems based on Support Vector Machines (SVM) and Long Short-Term Memory Recurrent Neural Networks (LSTM-RNN) are trained on human-human conversational data. Unimodal and multi-modal classification performances for head-nod and turn-taking events are reported over the IEMOCAP dataset. | |
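The abstract names SVM and LSTM-RNN classifiers but this record gives no features or hyperparameters. A minimal sketch of an LSTM-based binary event predictor over fused audio-visual feature windows follows; all shapes, sizes, and training settings are illustrative assumptions, not the paper's configuration.

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Hypothetical shapes: 50-frame windows of 40-dim fused audio-visual features.
# The paper's actual feature set and window length are not specified in this record.
T, F = 50, 40
X = np.random.randn(200, T, F).astype("float32")  # placeholder feature windows
y = np.random.randint(0, 2, size=(200,))          # 1 = event (e.g., head-nod) occurs next

model = Sequential([
    LSTM(64, input_shape=(T, F)),    # sequence encoder over the feature window
    Dense(1, activation="sigmoid"),  # binary event-prediction output
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=2, batch_size=32, verbose=0)
```

An SVM baseline of the kind the abstract mentions could be sketched analogously by flattening or pooling each window into a fixed-length vector and fitting, e.g., scikit-learn's SVC.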
dc.description.indexedby | WOS | |
dc.description.indexedby | Scopus | |
dc.description.openaccess | NO | |
dc.description.publisherscope | International | |
dc.description.sponsoredbyTubitakEu | TÜBİTAK | |
dc.description.sponsorship | Turkish Scientific and Technical Research Council (TÜBİTAK) [113E324, 217E040] This work is supported by the Turkish Scientific and Technical Research Council (TÜBİTAK) under grant numbers 113E324 and 217E040. | |
dc.identifier.doi | 10.21437/interspeech.2018-2215 | |
dc.identifier.isbn | 978-1-5108-7221-9 | |
dc.identifier.issn | 2308-457X | |
dc.identifier.quartile | N/A | |
dc.identifier.scopus | 2-s2.0-85054959957 | |
dc.identifier.uri | https://doi.org/10.21437/interspeech.2018-2215 | |
dc.identifier.uri | https://hdl.handle.net/20.500.14288/14715 | |
dc.identifier.wos | 465363900364 | |
dc.keywords | Head-nod | |
dc.keywords | Turn-taking | |
dc.keywords | Social signals | |
dc.keywords | Event prediction | |
dc.keywords | Dyadic conversations | |
dc.keywords | Human-robot interaction | |
dc.language.iso | eng | |
dc.publisher | ISCA - International Speech Communication Association | |
dc.relation.ispartof | 19th Annual Conference of the International Speech Communication Association (INTERSPEECH 2018), Vols 1-6: Speech Research for Emerging Markets in Multilingual Societies | |
dc.subject | Computer Science | |
dc.subject | Artificial Intelligence | |
dc.subject | Electrical and Electronics Engineering | |
dc.title | Audio-visual prediction of head-nod and turn-taking events in dyadic interactions | |
dc.type | Conference Proceeding | |
dspace.entity.type | Publication | |
local.contributor.kuauthor | Türker, Bekir Berker | |
local.contributor.kuauthor | Erzin, Engin | |
local.contributor.kuauthor | Yemez, Yücel | |
local.contributor.kuauthor | Sezgin, Tevfik Metin | |
local.publication.orgunit1 | GRADUATE SCHOOL OF SCIENCES AND ENGINEERING | |
local.publication.orgunit1 | College of Engineering | |
local.publication.orgunit2 | Department of Computer Engineering | |
local.publication.orgunit2 | Graduate School of Sciences and Engineering | |
relation.isOrgUnitOfPublication | 89352e43-bf09-4ef4-82f6-6f9d0174ebae | |
relation.isOrgUnitOfPublication | 3fc31c89-e803-4eb1-af6b-6258bc42c3d8 | |
relation.isOrgUnitOfPublication.latestForDiscovery | 89352e43-bf09-4ef4-82f6-6f9d0174ebae | |
relation.isParentOrgUnitOfPublication | 8e756b23-2d4a-4ce8-b1b3-62c794a8c164 | |
relation.isParentOrgUnitOfPublication | 434c9663-2b11-4e66-9399-c863e2ebae43 | |
relation.isParentOrgUnitOfPublication.latestForDiscovery | 8e756b23-2d4a-4ce8-b1b3-62c794a8c164 |