Publication:
A comprehensive gold standard and benchmark for comics text detection and recognition

Placeholder

Organizational Units

Program

KU-Authors

Yüret, Deniz
Sezgin, Tevfik Metin
Soykan, Gürkan

KU Authors

Co-Authors

Advisor

Publication Date

Language

Journal Title

Journal ISSN

Volume Title

Abstract

This study focuses on improving the optical character recognition (OCR) data for panels in COMICS [18], the largest dataset containing text and images from comic books. To do this, we developed a pipeline for OCR processing and labeling of comic books and created the first text detection and recognition datasets for Western comics, called "COMICS Text+: Detection" and "COMICS Text+: Recognition". We evaluated the performance of fine-tuned state-of-the-art text detection and recognition models on these datasets and found significant improvement in word accuracy and normalized edit distance compared to the text in COMICS. We also created a new dataset called "COMICS Text+", which contains the extracted text from the textboxes in COMICS. Using the improved text data of COMICS Text+ in the comics processing model from COMICS resulted in state-of-the-art performance on cloze-style tasks without changing the model architecture. The COMICS Text+ can be a valuable resource for researchers working on tasks including text detection, recognition, and high-level processing of comics, such as narrative understanding, character relations, and story generation. All data, models, and instructions can be accessed online (https://github.com/gsoykan/comics_text_plus).

Source:

DOCUMENT ANALYSIS AND RECOGNITION-ICDAR 2024 WORKSHOPS, PT I

Publisher:

Springer International Publishing AG

Keywords:

Subject

Computer science

Citation

Endorsement

Review

Supplemented By

Referenced By

Copyrights Note

0

Views

0

Downloads

View PlumX Details