Publication:
Speech driven backchannel generation using deep Q-network for enhancing engagement in human-robot interaction

dc.contributor.department: Department of Computer Engineering
dc.contributor.kuauthor: Hussain, Nusrah
dc.contributor.kuauthor: Erzin, Engin
dc.contributor.kuauthor: Sezgin, Tevfik Metin
dc.contributor.kuauthor: Yemez, Yücel
dc.contributor.kuprofile: PhD Student
dc.contributor.kuprofile: Faculty Member
dc.contributor.kuprofile: Faculty Member
dc.contributor.kuprofile: Faculty Member
dc.contributor.other: Department of Computer Engineering
dc.contributor.schoolcollegeinstitute: Graduate School of Sciences and Engineering
dc.contributor.schoolcollegeinstitute: College of Engineering
dc.contributor.yokid: N/A
dc.contributor.yokid: 34503
dc.contributor.yokid: 18632
dc.contributor.yokid: 107907
dc.date.accessioned: 2024-11-09T12:26:56Z
dc.date.issued: 2019
dc.description.abstract: We present a novel method for training a social robot to generate backchannels during human-robot interaction. We address the problem within an off-policy reinforcement learning framework, and show how a robot may learn to produce non-verbal backchannels such as laughs when trained to maximize the engagement and attention of the user. A major contribution of this work is the formulation of the problem as a Markov decision process (MDP), with states defined by the speech activity of the user and rewards generated by quantified engagement levels. The problem we address falls into the class of applications where unlimited interaction with the environment is not possible (our environment being a human), because such interaction may be time-consuming, costly, impracticable or even dangerous if a bad policy is executed. Therefore, we introduce a deep Q-network (DQN) in a batch reinforcement learning framework, where an optimal policy is learned from batch data collected using a more controlled policy. We suggest the use of human-to-human dyadic interaction datasets as a batch of trajectories to train an agent for engaging interactions. Our experiments demonstrate the potential of our method to train a robot for engaging behaviors in an offline manner.
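The batch (offline) learning setup the abstract describes can be illustrated with a minimal sketch. Everything below is invented for illustration and is not from the paper: the states, actions, rewards, and numbers are made up, and a plain lookup table stands in for the deep Q-network. Only the core idea carries over: sweep a fixed batch of logged transitions (no new interaction with the human) until the Q-values stabilize, then read off a greedy backchannel policy.

```python
GAMMA = 0.9   # discount factor
ALPHA = 0.5   # learning rate

# Fixed batch of (state, action, reward, next_state) transitions, playing
# the role of logged human-to-human dyadic interaction data. Actions:
# 0 = stay silent, 1 = produce a backchannel (e.g. a laugh); rewards
# stand in for quantified engagement levels. All values are illustrative.
batch = [
    ("speaking", 1, 1.0, "speaking"),  # backchannel during speech: engaging
    ("speaking", 0, 0.2, "silent"),
    ("silent",   1, 0.1, "silent"),    # backchannel during silence: awkward
    ("silent",   0, 0.3, "speaking"),
]

states = {s for s, _, _, _ in batch} | {ns for _, _, _, ns in batch}
actions = {a for _, a, _, _ in batch}
Q = {(s, a): 0.0 for s in states for a in actions}

# Sweep the fixed batch repeatedly, applying the standard Q-learning
# update, until the values converge to the Bellman fixed point.
for _ in range(200):
    for s, a, r, ns in batch:
        target = r + GAMMA * max(Q[(ns, b)] for b in actions)
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])

# Greedy policy read off the learned Q-values: backchannel while the
# user is speaking, stay silent otherwise.
policy = {s: max(actions, key=lambda a: Q[(s, a)]) for s in states}
print(policy)
```

In the paper's setting the lookup table is replaced by a neural network trained with DQN-style targets, which is what makes the approach scale beyond toy state spaces.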
dc.description.fulltext: YES
dc.description.indexedby: Scopus
dc.description.openaccess: YES
dc.description.publisherscope: International
dc.description.sponsoredbyTubitakEu: TÜBİTAK
dc.description.sponsorship: Scientific and Technological Research Council of Turkey (TÜBİTAK)
dc.description.sponsorship: Higher Education Commission (HEC) Pakistan
dc.description.version: Author's final manuscript
dc.format: pdf
dc.identifier.doi: 10.21437/Interspeech.2019-2521
dc.identifier.embargo: NO
dc.identifier.filenameinventoryno: IR01985
dc.identifier.issn: 2308-457X
dc.identifier.link: https://doi.org/10.21437/Interspeech.2019-2521
dc.identifier.quartile: N/A
dc.identifier.scopus: 2-s2.0-85074710071
dc.identifier.uri: https://hdl.handle.net/20.500.14288/1720
dc.keywords: Backchannels
dc.keywords: Engagement
dc.keywords: Human-robot interaction
dc.keywords: Reinforcement learning
dc.language: English
dc.publisher: International Speech Communication Association (ISCA)
dc.relation.grantno: 2.17E+42
dc.relation.uri: http://cdm21054.contentdm.oclc.org/cdm/ref/collection/IR/id/8597
dc.source: Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH)
dc.subject: Reinforcement learning
dc.subject: Learning algorithms
dc.subject: Policy gradient
dc.title: Speech driven backchannel generation using deep Q-network for enhancing engagement in human-robot interaction
dc.type: Conference proceeding
dspace.entity.type: Publication
local.contributor.authorid: N/A
local.contributor.authorid: 0000-0002-2715-2368
local.contributor.authorid: 0000-0002-1524-1646
local.contributor.authorid: 0000-0002-7515-3138
local.contributor.kuauthor: Hussain, Nusrah
local.contributor.kuauthor: Erzin, Engin
local.contributor.kuauthor: Sezgin, Tevfik Metin
local.contributor.kuauthor: Yemez, Yücel
relation.isOrgUnitOfPublication: 89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isOrgUnitOfPublication.latestForDiscovery: 89352e43-bf09-4ef4-82f6-6f9d0174ebae

Files

Original bundle

Name: 8597.pdf
Size: 477.89 KB
Format: Adobe Portable Document Format