Publication:
Speech driven backchannel generation using deep Q-network for enhancing engagement in human-robot interaction

dc.contributor.department: Department of Computer Engineering
dc.contributor.kuauthor: Hussain, Nusrah
dc.contributor.kuauthor: Erzin, Engin
dc.contributor.kuauthor: Sezgin, Tevfik Metin
dc.contributor.kuauthor: Yemez, Yücel
dc.contributor.kuprofile: PhD Student
dc.contributor.kuprofile: Faculty Member
dc.contributor.kuprofile: Faculty Member
dc.contributor.kuprofile: Faculty Member
dc.contributor.other: Department of Computer Engineering
dc.contributor.schoolcollegeinstitute: Graduate School of Sciences and Engineering
dc.contributor.schoolcollegeinstitute: College of Engineering
dc.contributor.yokid: N/A
dc.contributor.yokid: 34503
dc.contributor.yokid: 18632
dc.contributor.yokid: 107907
dc.date.accessioned: 2024-11-09T12:26:56Z
dc.date.issued: 2019
dc.description.abstract: We present a novel method for training a social robot to generate backchannels during human-robot interaction. We address the problem within an off-policy reinforcement learning framework, and show how a robot may learn to produce non-verbal backchannels such as laughs when trained to maximize the engagement and attention of the user. A major contribution of this work is the formulation of the problem as a Markov decision process (MDP), with states defined by the speech activity of the user and rewards generated by quantified engagement levels. The problem we address falls into the class of applications where unlimited interaction with the environment is not possible (our environment being a human), because such interaction may be time-consuming, costly, impracticable or even dangerous if a bad policy is executed. Therefore, we introduce a deep Q-network (DQN) in a batch reinforcement learning framework, where an optimal policy is learned from batch data collected using a more controlled policy. We suggest the use of human-to-human dyadic interaction datasets as a batch of trajectories to train an agent for engaging interactions. Our experiments demonstrate the potential of our method to train a robot for engaging behaviors in an offline manner.
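The batch (offline) learning setup the abstract describes can be illustrated with a minimal sketch. Everything below is invented for illustration and is not from the paper: the states, actions, rewards, and numbers are made up, and a plain lookup table stands in for the deep Q-network. Only the core idea carries over: sweep a fixed batch of logged transitions (no new interaction with the human) until the Q-values stabilize, then read off a greedy backchannel policy.

```python
GAMMA = 0.9   # discount factor
ALPHA = 0.5   # learning rate

# Fixed batch of (state, action, reward, next_state) transitions, playing
# the role of logged human-to-human dyadic interaction data. Actions:
# 0 = stay silent, 1 = produce a backchannel (e.g. a laugh); rewards
# stand in for quantified engagement levels. All values are illustrative.
batch = [
    ("speaking", 1, 1.0, "speaking"),  # backchannel during speech: engaging
    ("speaking", 0, 0.2, "silent"),
    ("silent",   1, 0.1, "silent"),    # backchannel during silence: awkward
    ("silent",   0, 0.3, "speaking"),
]

states = {s for s, _, _, _ in batch} | {ns for _, _, _, ns in batch}
actions = {a for _, a, _, _ in batch}
Q = {(s, a): 0.0 for s in states for a in actions}

# Sweep the fixed batch repeatedly, applying the standard Q-learning
# update, until the values converge to the Bellman fixed point.
for _ in range(200):
    for s, a, r, ns in batch:
        target = r + GAMMA * max(Q[(ns, b)] for b in actions)
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])

# Greedy policy read off the learned Q-values: backchannel while the
# user is speaking, stay silent otherwise.
policy = {s: max(actions, key=lambda a: Q[(s, a)]) for s in states}
print(policy)
```

In the paper's setting the lookup table is replaced by a neural network trained with DQN-style targets, which is what makes the approach scale beyond toy state spaces.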
dc.description.fulltext: YES
dc.description.indexedby: Scopus
dc.description.openaccess: YES
dc.description.publisherscope: International
dc.description.sponsoredbyTubitakEu: TÜBİTAK
dc.description.sponsorship: Scientific and Technological Research Council of Turkey (TÜBİTAK)
dc.description.sponsorship: Higher Education Commission (HEC) Pakistan
dc.description.version: Author's final manuscript
dc.format: pdf
dc.identifier.doi: 10.21437/Interspeech.2019-2521
dc.identifier.embargo: NO
dc.identifier.filenameinventoryno: IR01985
dc.identifier.issn: 2308-457X
dc.identifier.link: https://doi.org/10.21437/Interspeech.2019-2521
dc.identifier.quartile: N/A
dc.identifier.scopus: 2-s2.0-85074710071
dc.identifier.uri: https://hdl.handle.net/20.500.14288/1720
dc.keywords: Backchannels
dc.keywords: Engagement
dc.keywords: Human-robot interaction
dc.keywords: Reinforcement learning
dc.language: English
dc.publisher: International Speech Communication Association (ISCA)
dc.relation.grantno: 2.17E+42
dc.relation.uri: http://cdm21054.contentdm.oclc.org/cdm/ref/collection/IR/id/8597
dc.source: Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH)
dc.subject: Reinforcement learning
dc.subject: Learning algorithms
dc.subject: Policy gradient
dc.title: Speech driven backchannel generation using deep Q-network for enhancing engagement in human-robot interaction
dc.type: Conference proceeding
dspace.entity.type: Publication
local.contributor.authorid: N/A
local.contributor.authorid: 0000-0002-2715-2368
local.contributor.authorid: 0000-0002-1524-1646
local.contributor.authorid: 0000-0002-7515-3138
local.contributor.kuauthor: Hussain, Nusrah
local.contributor.kuauthor: Erzin, Engin
local.contributor.kuauthor: Sezgin, Tevfik Metin
local.contributor.kuauthor: Yemez, Yücel
relation.isOrgUnitOfPublication: 89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isOrgUnitOfPublication.latestForDiscovery: 89352e43-bf09-4ef4-82f6-6f9d0174ebae

Files

Original bundle

Name: 8597.pdf
Size: 477.89 KB
Format: Adobe Portable Document Format