Publication:
CNN-based page segmentation and object classification for counting population in Ottoman archival documentation

dc.contributor.departmentDepartment of History
dc.contributor.departmentN/A
dc.contributor.kuauthorKabadayı, Mustafa Erdem
dc.contributor.kuauthorCan, Yekta Said
dc.contributor.kuprofileFaculty Member
dc.contributor.kuprofileResearcher
dc.contributor.otherDepartment of History
dc.contributor.schoolcollegeinstituteCollege of Social Sciences and Humanities
dc.contributor.schoolcollegeinstituteCollege of Social Sciences and Humanities
dc.contributor.yokid33267
dc.contributor.yokidN/A
dc.date.accessioned2024-11-09T23:43:47Z
dc.date.issued2020
dc.description.abstractHistorical document analysis systems gain importance with the increasing efforts in the digitalization of archives. Page segmentation and layout analysis are crucial steps for such systems. Errors in these steps will affect the outcome of handwritten text recognition and Optical Character Recognition (OCR) methods, which increases the importance of page segmentation and layout analysis. Degradation of documents, digitization errors, and varying layout styles complicate the segmentation of historical documents. The properties of Arabic scripts, such as connected letters, ligatures, diacritics, and different writing styles, make it even more challenging to process historical documents in Arabic script. In this study, we developed an automatic system for counting registered individuals and assigning them to populated places by using a CNN-based architecture. To evaluate the performance of our system, we created a labeled dataset of registers obtained from the first wave of population registers of the Ottoman Empire held between the 1840s and 1860s. We achieved promising results for classifying different types of objects, counting the individuals, and assigning them to populated places.
dc.description.indexedbyWoS
dc.description.indexedbyScopus
dc.description.indexedbyPubMed
dc.description.issue32
dc.description.openaccessYES
dc.description.publisherscopeInternational
dc.description.volume6
dc.identifier.doi10.3390/JIMAGING6050032
dc.identifier.issn2313-433X
dc.identifier.linkhttps://www.scopus.com/inward/record.uri?eid=2-s2.0-85089909335&doi=10.3390%2fJIMAGING6050032&partnerID=40&md5=3cc7ba2acc257b9b57a27940fa03e1c3
dc.identifier.scopus2-s2.0-85089909335
dc.identifier.urihttps://dx.doi.org/10.3390/JIMAGING6050032
dc.identifier.urihttps://hdl.handle.net/20.500.14288/13556
dc.identifier.wos541022700002
dc.keywordsArabic script layout analysis
dc.keywordsConvolutional neural networks
dc.keywordsHistorical document analysis
dc.keywordsPage segmentation
dc.languageEnglish
dc.publisherMultidisciplinary Digital Publishing Institute (MDPI)
dc.sourceJournal of Imaging
dc.subjectImage processing
dc.subjectPhotography
dc.subjectDigital techniques
dc.titleCNN-based page segmentation and object classification for counting population in Ottoman archival documentation
dc.typeJournal Article
dspace.entity.typePublication
local.contributor.authorid0000-0003-3206-0190
local.contributor.authorid0000-0002-6614-0183
local.contributor.kuauthorKabadayı, Mustafa Erdem
local.contributor.kuauthorCan, Yekta Said
relation.isOrgUnitOfPublicationbe8432df-d124-44c3-85b4-be586c2db8a3
relation.isOrgUnitOfPublication.latestForDiscoverybe8432df-d124-44c3-85b4-be586c2db8a3
