Publication:
A comprehensive gold standard and benchmark for comics text detection and recognition

dc.contributor.department: Department of Computer Engineering
dc.contributor.department: Graduate School of Sciences and Engineering
dc.contributor.kuauthor: Sezgin, Tevfik Metin
dc.contributor.kuauthor: Soykan, Gürkan
dc.contributor.kuauthor: Yüret, Deniz
dc.contributor.schoolcollegeinstitute: College of Engineering
dc.contributor.schoolcollegeinstitute: GRADUATE SCHOOL OF SCIENCES AND ENGINEERING
dc.date.accessioned: 2025-03-06T20:58:05Z
dc.date.issued: 2024
dc.description.abstract: This study focuses on improving the optical character recognition (OCR) data for panels in COMICS [18], the largest dataset containing text and images from comic books. To do this, we developed a pipeline for OCR processing and labeling of comic books and created the first text detection and recognition datasets for Western comics, called "COMICS Text+: Detection" and "COMICS Text+: Recognition". We evaluated the performance of fine-tuned state-of-the-art text detection and recognition models on these datasets and found significant improvement in word accuracy and normalized edit distance compared to the text in COMICS. We also created a new dataset called "COMICS Text+", which contains the extracted text from the textboxes in COMICS. Using the improved text data of COMICS Text+ in the comics processing model from COMICS resulted in state-of-the-art performance on cloze-style tasks without changing the model architecture. COMICS Text+ can be a valuable resource for researchers working on tasks including text detection, recognition, and high-level processing of comics, such as narrative understanding, character relations, and story generation. All data, models, and instructions can be accessed online (https://github.com/gsoykan/comics_text_plus).
dc.description.indexedby: Scopus
dc.description.publisherscope: International
dc.description.sponsoredbyTubitakEu: N/A
dc.description.sponsorship: This project is supported by the Koç University and İş Bank AI Center (KUIS AI). We would like to thank KUIS AI for their support.
dc.identifier.doi: 10.1007/978-3-031-70645-5_12
dc.identifier.eissn: 1611-3349
dc.identifier.grantno: KocUniversity
dc.identifier.isbn: 9783031706448
dc.identifier.isbn: 9783031706455
dc.identifier.issn: 0302-9743
dc.identifier.quartile: N/A
dc.identifier.scopus: 2-s2.0-85204597509
dc.identifier.uri: https://doi.org/10.1007/978-3-031-70645-5_12
dc.identifier.uri: https://hdl.handle.net/20.500.14288/27359
dc.identifier.volume: 14935
dc.identifier.wos: 1336400200012
dc.keywords: Optical character recognition (OCR)
dc.keywords: Text detection
dc.keywords: Text recognition
dc.keywords: Comic processing
dc.language.iso: eng
dc.publisher: Springer International Publishing AG
dc.relation.ispartof: DOCUMENT ANALYSIS AND RECOGNITION-ICDAR 2024 WORKSHOPS, PT I
dc.subject: Computer science
dc.title: A comprehensive gold standard and benchmark for comics text detection and recognition
dc.type: Conference Proceeding
dspace.entity.type: Publication
local.publication.orgunit1: College of Engineering
local.publication.orgunit1: GRADUATE SCHOOL OF SCIENCES AND ENGINEERING
local.publication.orgunit2: Department of Computer Engineering
local.publication.orgunit2: Graduate School of Sciences and Engineering
relation.isOrgUnitOfPublication: 89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isOrgUnitOfPublication: 3fc31c89-e803-4eb1-af6b-6258bc42c3d8
relation.isOrgUnitOfPublication.latestForDiscovery: 89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isParentOrgUnitOfPublication: 8e756b23-2d4a-4ce8-b1b3-62c794a8c164
relation.isParentOrgUnitOfPublication: 434c9663-2b11-4e66-9399-c863e2ebae43
relation.isParentOrgUnitOfPublication.latestForDiscovery: 8e756b23-2d4a-4ce8-b1b3-62c794a8c164
