A comprehensive gold standard and benchmark for comics text detection and recognition

Departments

Department of Computer Engineering

KUIS AI (Koç University & İş Bank Artificial Intelligence Center)

School / College / Institute

College of Engineering

Research Center

KU-Authors

Sezgin, Tevfik Metin

Soykan, Gürkan

Yüret, Deniz

Date

2024

Type

Conference Proceeding

Embargo Status

N/A

Abstract

This study focuses on improving the optical character recognition (OCR) data for panels in COMICS [18], the largest dataset containing text and images from comic books. To do this, we developed a pipeline for OCR processing and labeling of comic books and created the first text detection and recognition datasets for Western comics, called "COMICS Text+: Detection" and "COMICS Text+: Recognition". We evaluated the performance of fine-tuned state-of-the-art text detection and recognition models on these datasets and found significant improvement in word accuracy and normalized edit distance compared to the text in COMICS. We also created a new dataset called "COMICS Text+", which contains the extracted text from the textboxes in COMICS. Using the improved text data of COMICS Text+ in the comics processing model from COMICS resulted in state-of-the-art performance on cloze-style tasks without changing the model architecture. The COMICS Text+ can be a valuable resource for researchers working on tasks including text detection, recognition, and high-level processing of comics, such as narrative understanding, character relations, and story generation. All data, models, and instructions can be accessed online (https://github.com/gsoykan/comics_text_plus).
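The abstract reports results in terms of word accuracy and normalized edit distance. As a rough illustration only, the sketch below computes both under commonly used definitions, which are assumptions here: word accuracy as the fraction of predictions that match the ground truth exactly, and normalized edit distance as the Levenshtein distance divided by the length of the longer string, averaged over samples. The exact definitions used in the paper are not stated in this record and may differ (some benchmarks report one minus this value). The function names and sample strings are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,         # deletion
                            curr[j - 1] + 1,     # insertion
                            prev[j - 1] + cost)) # substitution or match
        prev = curr
    return prev[-1]


def word_accuracy(preds, truths):
    """Assumed definition: share of predictions equal to the ground truth."""
    if not truths:
        return 0.0
    return sum(p == t for p, t in zip(preds, truths)) / len(truths)


def mean_normalized_edit_distance(preds, truths):
    """Assumed definition: mean of distance / max(len(pred), len(truth))."""
    if not truths:
        return 0.0
    total = 0.0
    for p, t in zip(preds, truths):
        longest = max(len(p), len(t))
        total += levenshtein(p, t) / longest if longest else 0.0
    return total / len(truths)


# Hypothetical OCR outputs paired with hypothetical ground-truth strings.
predictions = ["POW!", "HELL0"]
ground_truth = ["POW!", "HELLO"]
print(word_accuracy(predictions, ground_truth))
print(mean_normalized_edit_distance(predictions, ground_truth))
```

Under these assumed definitions, higher word accuracy and lower normalized edit distance both indicate recognized text that is closer to the ground truth.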

Publisher

Springer Nature

Subject

Computer science

Source

Document Analysis and Recognition - ICDAR 2024 Workshops, Part I

DOI

10.1007/978-3-031-70645-5_12

URI

https://doi.org/10.1007/978-3-031-70645-5_12
https://hdl.handle.net/20.500.14288/27359
