Publication:
A comprehensive gold standard and benchmark for comics text detection and recognition

dc.conference.date: AUG 30-SEP 04, 2024
dc.conference.location: Athens, Greece
dc.conference.organizer: 18th International Conference on Document Analysis and Recognition (ICDAR)
dc.contributor.department: Department of Computer Engineering
dc.contributor.department: KUIS AI (Koç University & İş Bank Artificial Intelligence Center)
dc.contributor.facultymember: Yes
dc.contributor.kuauthor: Sezgin, Tevfik Metin
dc.contributor.kuauthor: Soykan, Gürkan
dc.contributor.kuauthor: Yüret, Deniz
dc.contributor.schoolcollegeinstitute: College of Engineering
dc.contributor.schoolcollegeinstitute: Research Center
dc.date.accessioned: 2025-03-06T20:58:05Z
dc.date.issued: 2024
dc.description.abstract: This study focuses on improving the optical character recognition (OCR) data for panels in COMICS [18], the largest dataset containing text and images from comic books. To do this, we developed a pipeline for OCR processing and labeling of comic books and created the first text detection and recognition datasets for Western comics, called "COMICS Text+: Detection" and "COMICS Text+: Recognition". We evaluated the performance of fine-tuned state-of-the-art text detection and recognition models on these datasets and found significant improvement in word accuracy and normalized edit distance compared to the text in COMICS. We also created a new dataset called "COMICS Text+", which contains the extracted text from the textboxes in COMICS. Using the improved text data of COMICS Text+ in the comics processing model from COMICS resulted in state-of-the-art performance on cloze-style tasks without changing the model architecture. COMICS Text+ can be a valuable resource for researchers working on tasks including text detection, recognition, and high-level processing of comics, such as narrative understanding, character relations, and story generation. All data, models, and instructions can be accessed online (https://github.com/gsoykan/comics_text_plus).
dc.description.fulltext: No
dc.description.harvestedfrom: Manual
dc.description.indexedby: WOS
dc.description.indexedby: Scopus
dc.description.openaccess: N/A
dc.description.peerreviewstatus: N/A
dc.description.publisherscope: International
dc.description.readpublish: N/A
dc.description.sponsoredbyTubitakEu: N/A
dc.description.sponsorship: This project is supported by the Koç University and İş Bank AI Center (KUIS AI). We would like to thank KUIS AI for their support.
dc.description.studentonlypublication: No
dc.description.studentpublication: Yes
dc.description.version: N/A
dc.identifier.doi: 10.1007/978-3-031-70645-5_12
dc.identifier.eissn: 1611-3349
dc.identifier.embargo: N/A
dc.identifier.endpage: 197
dc.identifier.isbn: 9783031706448
dc.identifier.isbn: 9783031706455
dc.identifier.issn: 0302-9743
dc.identifier.quartile: Q4
dc.identifier.scopus: 2-s2.0-85204597509
dc.identifier.uri: https://doi.org/10.1007/978-3-031-70645-5_12
dc.identifier.uri: https://hdl.handle.net/20.500.14288/27359
dc.identifier.volume: 14935
dc.identifier.wos: 001336400200012
dc.keywords: Optical character recognition (OCR)
dc.keywords: Text detection
dc.keywords: Text recognition
dc.keywords: Comic processing
dc.language.iso: eng
dc.publisher: Springer Nature
dc.relation.affiliation: Koç University
dc.relation.collection: Koç University Institutional Repository
dc.relation.ispartof: Document Analysis and Recognition - ICDAR 2024 Workshops, Part I
dc.relation.openaccess: N/A
dc.rights: N/A
dc.subject: Computer science
dc.title: A comprehensive gold standard and benchmark for comics text detection and recognition
dc.type: Conference Proceeding
dspace.entity.type: Publication
relation.isOrgUnitOfPublication: 89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isOrgUnitOfPublication: 77d67233-829b-4c3a-a28f-bd97ab5c12c7
relation.isOrgUnitOfPublication.latestForDiscovery: 89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isParentOrgUnitOfPublication: 8e756b23-2d4a-4ce8-b1b3-62c794a8c164
relation.isParentOrgUnitOfPublication: d437580f-9309-4ecb-864a-4af58309d287
relation.isParentOrgUnitOfPublication.latestForDiscovery: 8e756b23-2d4a-4ce8-b1b3-62c794a8c164
