Research Outputs

Permanent URI for this community: https://hdl.handle.net/20.500.14288/2

Search Results

  • Publication
    Realtime engagement measurement in human-computer interaction
    (Institute of Electrical and Electronics Engineers Inc., 2020)
    Sezgin, Tevfik Metin (Faculty Member, Department of Computer Engineering, College of Engineering, 18632);
    Yemez, Yücel (Faculty Member, Department of Computer Engineering, College of Engineering, 107907);
    Erzin, Engin (Faculty Member, Department of Computer Engineering, College of Engineering, 34503);
    Türker, Bekir Berker (PhD Student, Graduate School of Sciences and Engineering);
    Numanoğlu, Tuğçe (Master Student, Graduate School of Sciences and Engineering);
    Kesim, Ege (Master Student, Graduate School of Sciences and Engineering)
    Social robots are expected to understand their interlocutors and behave accordingly, as humans do. Endowing robots with the capability to monitor user engagement during their interactions with humans is a crucial step towards this goal. In this work, an interactive game played with a robot is designed and implemented. During the interaction, user engagement is monitored in realtime by detecting user gaze, turn-taking, laughter/smiles, and head nods from audio-visual data. In the experiments conducted, the engagement monitored in realtime is found to be consistent with the human-annotated engagement levels.
  • Publication
    Use of affective visual information for summarization of human-centric videos
    (2022) Kopro, Berkay; Erzin, Engin (Faculty Member, Department of Computer Engineering, College of Engineering, 34503)
    The increasing volume of user-generated human-centric video content and its applications, such as video retrieval and browsing, require compact representations, which are addressed by the video summarization literature. Current supervised studies formulate video summarization as a sequence-to-sequence learning problem, and existing solutions often neglect the surge of the human-centric view, which inherently contains affective content. In this study, we investigate the affective-information-enriched supervised video summarization task for human-centric videos. First, we train a visual-input-driven state-of-the-art continuous emotion recognition model (CER-NET) on the RECOLA dataset to estimate activation and valence attributes. Then, we integrate the estimated emotional attributes and their high-level embeddings from the CER-NET with the visual information to define the proposed affective video summarization (AVSUM) architectures. In addition, we investigate the use of attention to improve the AVSUM architectures and propose two new architectures based on temporal attention (TA-AVSUM) and spatial attention (SA-AVSUM). We conduct video summarization experiments on the TvSum and COGNIMUSE datasets. The proposed temporal attention-based TA-AVSUM architecture attains competitive video summarization performance, with strong improvements on human-centric videos over the state of the art in terms of F-score, self-defined face recall, and rank correlation metrics.