Publication: ComicBERT: A transformer model and pre-training strategy for contextual understanding in comics
Abstract
Despite the growing interest in digital comic processing, foundational models tailored to this medium remain largely unexplored. Existing methods employ multimodal sequential models with cloze-style tasks, but they fall short of human-like understanding. Addressing this gap, we introduce a novel transformer-based architecture, Comicsformer, and a comprehensive framework, ComicBERT, designed to process and understand the complex interplay of visual and textual elements in comics. Our approach trains the foundation model with a self-supervised objective, Masked Comic Modeling, inspired by the masked language modeling objective of BERT [6]. To fine-tune and validate our models, we adopt existing cloze-style tasks and propose new ones, such as scene-cloze, that better capture the narrative and contextual intricacies unique to comics. Preliminary experiments indicate that these tasks improve the model's predictive accuracy and may provide new tools for comic creators, aiding character dialogue generation and panel sequencing. Ultimately, ComicBERT aims to serve as a universal comic processor.
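The record does not include the paper's architecture details, but the abstract describes Masked Comic Modeling as BERT's masked language modeling recipe applied to comic panels. The sketch below illustrates that general idea only: a hypothetical MaskedComicModel that masks a fraction of fused visual-plus-text panel embeddings, encodes the sequence with a transformer, and reconstructs the originals at the masked positions. All names, dimensions, the mask ratio, and the reconstruction loss are illustrative assumptions, not the authors' implementation.

    import torch
    import torch.nn as nn

    class MaskedComicModel(nn.Module):
        """Hypothetical sketch of a BERT-style masked-modeling objective
        over comic panels; not the paper's actual Masked Comic Modeling."""

        def __init__(self, d_model: int = 512, n_heads: int = 8, n_layers: int = 4):
            super().__init__()
            # Learned token substituted for masked panel embeddings.
            self.mask_token = nn.Parameter(torch.zeros(d_model))
            layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, n_layers)
            self.head = nn.Linear(d_model, d_model)  # reconstructs panel features

        def forward(self, panels: torch.Tensor, mask_ratio: float = 0.15) -> torch.Tensor:
            # panels: (batch, num_panels, d_model) fused visual+text features,
            # assumed to come from an upstream panel encoder.
            b, s, d = panels.shape
            mask = torch.rand(b, s, device=panels.device) < mask_ratio
            mask[:, 0] = True  # ensure at least one masked panel per sequence
            corrupted = torch.where(
                mask.unsqueeze(-1), self.mask_token.expand(b, s, d), panels
            )
            recon = self.head(self.encoder(corrupted))
            # Loss only at masked positions, mirroring BERT's MLM objective.
            return ((recon - panels) ** 2)[mask].mean()

    model = MaskedComicModel()
    panel_features = torch.randn(2, 12, 512)  # 2 comics, 12 panels each
    loss = model(panel_features)
    loss.backward()

The same masking-and-reconstruction pattern extends naturally to the cloze-style evaluation tasks the abstract mentions, e.g. scoring candidate panels or dialogue for a masked position instead of regressing its embedding.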
Publisher
Springer International Publishing AG
Subject
Computer science
Source
Document Analysis and Recognition - ICDAR 2024 Workshops, Part I
DOI
10.1007/978-3-031-70645-5_16