A comprehensive gold standard and benchmark for comics text detection and recognition

dc.conference.date	AUG 30-SEP 04, 2024
dc.conference.location	Athens, Greece
dc.conference.organizer	18th International Conference on Document Analysis and Recognition (ICDAR)
dc.contributor.department	Department of Computer Engineering
dc.contributor.department	KUIS AI (Koç University & İş Bank Artificial Intelligence Center)
dc.contributor.facultymember	Yes
dc.contributor.kuauthor	Sezgin, Tevfik Metin
dc.contributor.kuauthor	Soykan, Gürkan
dc.contributor.kuauthor	Yüret, Deniz
dc.contributor.schoolcollegeinstitute	College of Engineering
dc.contributor.schoolcollegeinstitute	Research Center
dc.date.accessioned	2025-03-06T20:58:05Z
dc.date.issued	2024
dc.description.abstract	This study focuses on improving the optical character recognition (OCR) data for panels in COMICS [18], the largest dataset containing text and images from comic books. To do this, we developed a pipeline for OCR processing and labeling of comic books and created the first text detection and recognition datasets for Western comics, called "COMICS Text+: Detection" and "COMICS Text+: Recognition". We evaluated the performance of fine-tuned state-of-the-art text detection and recognition models on these datasets and found significant improvement in word accuracy and normalized edit distance compared to the text in COMICS. We also created a new dataset called "COMICS Text+", which contains the extracted text from the textboxes in COMICS. Using the improved text data of COMICS Text+ in the comics processing model from COMICS resulted in state-of-the-art performance on cloze-style tasks without changing the model architecture. COMICS Text+ can be a valuable resource for researchers working on tasks including text detection, recognition, and high-level processing of comics, such as narrative understanding, character relations, and story generation. All data, models, and instructions can be accessed online (https://github.com/gsoykan/comics_text_plus).
dc.description.fulltext	No
dc.description.harvestedfrom	Manual
dc.description.indexedby	WOS
dc.description.indexedby	Scopus
dc.description.openaccess	N/A
dc.description.peerreviewstatus	N/A
dc.description.publisherscope	International
dc.description.readpublish	N/A
dc.description.sponsoredbyTubitakEu	N/A
dc.description.sponsorship	This project is supported by the Koç University & İş Bank AI Center (KUIS AI). We would like to thank KUIS AI for their support.
dc.description.studentonlypublication	No
dc.description.studentpublication	Yes
dc.description.version	N/A
dc.identifier.doi	10.1007/978-3-031-70645-5_12
dc.identifier.eissn	1611-3349
dc.identifier.embargo	N/A
dc.identifier.endpage	197
dc.identifier.isbn	9783031706448
dc.identifier.isbn	9783031706455
dc.identifier.issn	0302-9743
dc.identifier.quartile	Q4
dc.identifier.scopus	2-s2.0-85204597509
dc.identifier.uri	https://doi.org/10.1007/978-3-031-70645-5_12
dc.identifier.uri	https://hdl.handle.net/20.500.14288/27359
dc.identifier.volume	14935
dc.identifier.wos	001336400200012
dc.keywords	Optical character recognition (OCR)
dc.keywords	Text detection
dc.keywords	Text recognition
dc.keywords	Comic processing
dc.language.iso	eng
dc.publisher	Springer Nature
dc.relation.affiliation	Koç University
dc.relation.collection	Koç University Institutional Repository
dc.relation.ispartof	Document Analysis and Recognition - ICDAR 2024 Workshops, Part I
dc.relation.openaccess	N/A
dc.rights	N/A
dc.subject	Computer science
dc.title	A comprehensive gold standard and benchmark for comics text detection and recognition
dc.type	Conference Proceeding
dspace.entity.type	Publication
relation.isOrgUnitOfPublication	89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isOrgUnitOfPublication	77d67233-829b-4c3a-a28f-bd97ab5c12c7
relation.isOrgUnitOfPublication.latestForDiscovery	89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isParentOrgUnitOfPublication	8e756b23-2d4a-4ce8-b1b3-62c794a8c164
relation.isParentOrgUnitOfPublication	d437580f-9309-4ecb-864a-4af58309d287
relation.isParentOrgUnitOfPublication.latestForDiscovery	8e756b23-2d4a-4ce8-b1b3-62c794a8c164

Collections

Publications without Fulltext
