Department of Computer Engineering
Date: 2024-11-09
Publication year: 2019
ISBN: 978-1-7281-3888-6
DOI: 10.1109/ACII.2019.8925443 (http://dx.doi.org/10.1109/ACII.2019.8925443)
Scopus ID: 2-s2.0-85077800470
Handle: https://hdl.handle.net/20.500.14288/9236

Title: Batch recurrent Q-Learning for backchannel generation towards engaging agents
Type: Conference proceeding
Subjects: Computer science; Artificial intelligence; Information systems; Engineering; Electrical and electronic engineering
Scopus URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85077800470&doi=10.1109%2fACII.2019.8925443&partnerID=40&md5=bd33450a13412b555157995e032884e0

Abstract: The ability of an agent to generate appropriate verbal and nonverbal backchannels during human-robot interaction greatly enhances the interaction experience. Backchannels are particularly important in applications such as tutoring and counseling, which require the user's constant attention and engagement. We present a method for training a robot to generate backchannels during human-robot interaction within the reinforcement learning (RL) framework, with the goal of maintaining a high level of engagement. Since online learning through interaction with a human is highly time-consuming and impractical, we take advantage of a recorded human-to-human dataset and approach the problem as batch reinforcement learning, treating the dataset as batch data acquired under some behavior policy. We perform experiments with laughs as a backchannel and train an agent with value-based techniques. In particular, we demonstrate the effectiveness of recurrent layers in the approximate value function, which boost performance in partially observable environments. Off-policy policy evaluation shows that the RL agents are expected to produce more engagement than an agent trained by imitation learning.
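
The abstract describes training a recurrent Q-function on pre-recorded interaction episodes. As a rough illustration of the general technique only, not the authors' implementation, the following PyTorch sketch shows a recurrent Q-network trained with a one-step Q-learning target over batch episodes; the observation features, network dimensions, and binary action space (emit a backchannel or not) are all assumptions for the sake of the example.

```python
# Minimal sketch (not the paper's code) of batch recurrent Q-learning.
# Assumed setup: fixed-length episodes of per-timestep feature vectors,
# integer actions (0 = no backchannel, 1 = backchannel), scalar rewards.
import torch
import torch.nn as nn

class RecurrentQNet(nn.Module):
    def __init__(self, obs_dim=32, hidden_dim=64, n_actions=2):
        super().__init__()
        # The LSTM summarizes the observation history, which is what
        # helps in a partially observable environment.
        self.lstm = nn.LSTM(obs_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, n_actions)

    def forward(self, obs_seq):
        # obs_seq: (batch, time, obs_dim) -> Q-values (batch, time, n_actions)
        out, _ = self.lstm(obs_seq)
        return self.head(out)

def q_learning_loss(q_net, target_net, obs, actions, rewards, gamma=0.99):
    # One-step batch Q-learning target over whole recorded episodes:
    #   y_t = r_t + gamma * max_a Q_target(history_{t+1}, a)
    # obs: (B, T, obs_dim), actions: (B, T) long, rewards: (B, T) float.
    q = q_net(obs)                                                # (B, T, A)
    q_taken = q[:, :-1].gather(2, actions[:, :-1].unsqueeze(2)).squeeze(2)
    with torch.no_grad():
        next_q = target_net(obs)[:, 1:].max(dim=2).values          # (B, T-1)
        targets = rewards[:, :-1] + gamma * next_q
    return nn.functional.mse_loss(q_taken, targets)
```

Because the data are fixed and were collected under some unknown behavior policy, this is an off-policy setting; the target network and batch-wide targets stand in for the online interaction loop, and terminal-state masking is omitted here for brevity.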