Publication: Engagement rewarded actor-critic with conservative Q-learning for speech-driven laughter backchannel generation
dc.contributor.department | Department of Computer Engineering | |
dc.contributor.kuauthor | Bayramoğlu, Öykü Zeynep | |
dc.contributor.kuauthor | Erzin, Engin | |
dc.contributor.kuauthor | Sezgin, Tevfik Metin | |
dc.contributor.kuauthor | Yemez, Yücel | |
dc.contributor.kuprofile | Faculty Member | |
dc.contributor.kuprofile | Faculty Member | |
dc.contributor.kuprofile | Faculty Member | |
dc.contributor.other | Department of Computer Engineering | |
dc.contributor.researchcenter | Koç University İş Bank Artificial Intelligence Center (KUIS AI) | |
dc.contributor.schoolcollegeinstitute | College of Engineering | |
dc.contributor.schoolcollegeinstitute | Graduate School of Sciences and Engineering | |
dc.contributor.yokid | N/A | |
dc.contributor.yokid | 34503 | |
dc.contributor.yokid | 18632 | |
dc.contributor.yokid | 107907 | |
dc.date.accessioned | 2024-11-09T13:56:20Z | |
dc.date.issued | 2021 | |
dc.description.abstract | We propose a speech-driven laughter backchannel generation model to reward engagement during human-agent interaction. We formulate the problem as a Markov decision process where the speech signal represents the state and the objective is to maximize human engagement. Since online training is often impractical in the case of human-agent interaction, we utilize existing human-to-human dyadic interaction datasets to train our agent for the backchannel generation task. We address the problem using an actor-critic method based on conservative Q-learning (CQL), which mitigates the distributional shift problem by suppressing Q-value over-estimation during training. The proposed CQL-based approach is evaluated objectively on the IEMOCAP dataset for the laughter generation task. Compared to existing off-policy Q-learning methods, we observe improved compliance with the dataset in terms of laugh generation rate. Furthermore, we show the effectiveness of the learned policy by estimating the expected engagement using off-policy policy evaluation techniques. | |
dc.description.fulltext | YES | |
dc.description.indexedby | Scopus | |
dc.description.openaccess | YES | |
dc.description.publisherscope | International | |
dc.description.sponsoredbyTubitakEu | TÜBİTAK | |
dc.description.sponsorship | Scientific and Technological Research Council of Turkey (TÜBİTAK) | |
dc.description.version | Author's final manuscript | |
dc.format | ||
dc.identifier.doi | 10.1145/3462244.3479944 | |
dc.identifier.embargo | NO | |
dc.identifier.filenameinventoryno | IR03356 | |
dc.identifier.isbn | 978-1-4503-8481-0 | |
dc.identifier.link | https://doi.org/10.1145/3462244.3479944 | |
dc.identifier.quartile | N/A | |
dc.identifier.scopus | 2-s2.0-85119021073 | |
dc.identifier.uri | https://hdl.handle.net/20.500.14288/4059 | |
dc.keywords | Backchannels | |
dc.keywords | Human-agent interaction | |
dc.keywords | Offline reinforcement learning | |
dc.keywords | User engagement | |
dc.language | English | |
dc.publisher | Association for Computing Machinery (ACM) | |
dc.relation.grantno | 217E040 | |
dc.relation.uri | http://cdm21054.contentdm.oclc.org/cdm/ref/collection/IR/id/10144 | |
dc.source | International Conference on Multimodal Interaction | |
dc.subject | Generation | |
dc.title | Engagement rewarded actor-critic with conservative Q-learning for speech-driven laughter backchannel generation | |
dc.type | Conference proceeding | |
dspace.entity.type | Publication | |
local.contributor.authorid | N/A | |
local.contributor.authorid | 0000-0002-2715-2368 | |
local.contributor.authorid | 0000-0002-1524-1646 | |
local.contributor.authorid | 0000-0002-7515-3138 | |
local.contributor.kuauthor | Bayramoğlu, Öykü Zeynep | |
local.contributor.kuauthor | Erzin, Engin | |
local.contributor.kuauthor | Sezgin, Tevfik Metin | |
local.contributor.kuauthor | Yemez, Yücel | |
relation.isOrgUnitOfPublication | 89352e43-bf09-4ef4-82f6-6f9d0174ebae | |
relation.isOrgUnitOfPublication.latestForDiscovery | 89352e43-bf09-4ef4-82f6-6f9d0174ebae |