Use of affective visual information for summarization of human-centric videos

dc.contributor.coauthor	Kopro, Berkay
dc.contributor.department	Department of Computer Engineering
dc.contributor.kuauthor	Erzin, Engin
dc.contributor.kuprofile	Faculty Member
dc.contributor.other	Department of Computer Engineering
dc.contributor.schoolcollegeinstitute	College of Engineering
dc.contributor.yokid	34503
dc.date.accessioned	2024-11-09T23:25:01Z
dc.date.issued	2022
dc.description.abstract	The increasing volume of user-generated human-centric video content and its applications, such as video retrieval and browsing, require compact representations, which the video summarization literature addresses. Current supervised studies formulate video summarization as a sequence-to-sequence learning problem, and existing solutions often neglect the rise of human-centric content, which inherently carries affective information. In this study, we investigate the affective-information-enriched supervised video summarization task for human-centric videos. First, we train a state-of-the-art, visual-input-driven continuous emotion recognition model (CER-NET) on the RECOLA dataset to estimate activation and valence attributes. Then, we integrate the estimated emotional attributes and their high-level embeddings from the CER-NET with the visual information to define the proposed affective video summarization (AVSUM) architectures. In addition, we investigate the use of attention to improve the AVSUM architectures and propose two new architectures based on temporal attention (TA-AVSUM) and spatial attention (SA-AVSUM). We conduct video summarization experiments on the TVSum and COGNIMUSE datasets. The proposed temporal attention-based TA-AVSUM architecture attains competitive video summarization performance, with strong improvements on human-centric videos over the state of the art, in terms of F-score, self-defined face recall, and rank correlation metrics.
dc.description.indexedby	Scopus
dc.description.indexedby	WoS
dc.description.openaccess	YES
dc.description.publisherscope	International
dc.identifier.doi	10.1109/TAFFC.2022.3222882
dc.identifier.issn	1949-3045
dc.identifier.link	https://www.scopus.com/inward/record.uri?eid=2-s2.0-85142815204&doi=10.1109%2fTAFFC.2022.3222882&partnerID=40&md5=689b8b9552de8c666e1c75fe46476e79
dc.identifier.scopus	2-s2.0-85142815204
dc.identifier.uri	https://dx.doi.org/10.1109/TAFFC.2022.3222882
dc.identifier.uri	https://hdl.handle.net/20.500.14288/11299
dc.identifier.wos	1124163900041
dc.keywords	Affective computing
dc.keywords	Continuous emotion recognition
dc.keywords	Neural networks
dc.keywords	Video summarization
dc.language	English
dc.source	IEEE Transactions on Affective Computing
dc.subject	Human-computer interaction
dc.subject	User interfaces (Computer systems)
dc.subject	Artificial intelligence
dc.subject	Computer networks
dc.subject	Video recording
dc.subject	Digital video
dc.title	Use of affective visual information for summarization of human-centric videos
dc.type	Journal Article
dspace.entity.type	Publication
local.contributor.authorid	0000-0002-2715-2368
local.contributor.kuauthor	Erzin, Engin
relation.isOrgUnitOfPublication	89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isOrgUnitOfPublication.latestForDiscovery	89352e43-bf09-4ef4-82f6-6f9d0174ebae

Collections

Publications without Fulltext