Publication:
Systematic comparison of GPT models for the analysis of pathology reports in a low-resource language: a case study for Turkish

dc.contributor.coauthorDilbaz, Omer Faruk
dc.contributor.departmentSchool of Medicine
dc.contributor.departmentDepartment of Computer Engineering
dc.contributor.departmentKUIS AI (Koç University & İş Bank Artificial Intelligence Center)
dc.contributor.departmentKUTTAM (Koç University Research Center for Translational Medicine)
dc.contributor.departmentGraduate School of Sciences and Engineering
dc.contributor.kuauthorBolat, Beyza
dc.contributor.kuauthorDemir, Çiğdem Gündüz
dc.contributor.kuauthorKulaç, İbrahim
dc.contributor.kuauthorÖzateş, Muhammet Nusret
dc.contributor.schoolcollegeinstituteSCHOOL OF MEDICINE
dc.contributor.schoolcollegeinstituteCollege of Engineering
dc.contributor.schoolcollegeinstituteGRADUATE SCHOOL OF SCIENCES AND ENGINEERING
dc.date.accessioned2025-12-31T08:24:36Z
dc.date.available2025-12-31
dc.date.issued2025
dc.description.abstractObjective: Large language models (LLMs) can process text for various applications, including surgical pathology reports, but studies primarily focus on English. Their performance has not been systematically studied for a low-resource language. To analyze the performance of various LLMs, 759 Turkish pathology reports from 5 different procedures were selected. Methods: We used 10 examples from every procedure to optimize prompts for OpenAI's GPT-3.5 Turbo, GPT-4o mini, and GPT-4o. The rest were used to test generalizability. Results: The GPT-4o model performed best in processing Turkish reports (12%-25% over GPT-3.5 Turbo, 3%-16% over GPT-4o mini). English-translated versions of the reports enhanced accuracy, especially for GPT-3.5 Turbo and GPT-4o mini. GPT-4o showed comparable results for Turkish and English. A 12% to 22% performance gap was observed between GPT-4o and GPT-3.5 Turbo for English-translated reports. Domain-related tips in prompts increased accuracy. For all models, results on the larger test sets paralleled those on the validation set. The GPT-4o model yielded the most accurate results, while the GPT-4o mini model demonstrated intermediate performance. The GPT-3.5 Turbo model was the least accurate. Conclusions: To our knowledge, this is the first demonstration in the literature of the performance of GPT models on Turkish surgical pathology reports, and the results indicate that data extracted by GPT-4o are almost ready for direct application.
dc.description.fulltextYes
dc.description.harvestedfromManual
dc.description.indexedbyWOS
dc.description.indexedbyScopus
dc.description.indexedbyPubMed
dc.description.publisherscopeInternational
dc.description.readpublishN/A
dc.description.sponsoredbyTubitakEuN/A
dc.identifier.doi10.1093/ajcp/aqaf091
dc.identifier.eissn1943-7722
dc.identifier.embargoNo
dc.identifier.issn0002-9173
dc.identifier.pubmed40971916
dc.identifier.quartileQ3
dc.identifier.scopus2-s2.0-105022413328
dc.identifier.urihttps://doi.org/10.1093/ajcp/aqaf091
dc.identifier.urihttps://hdl.handle.net/20.500.14288/31805
dc.identifier.wos001574536000001
dc.keywordsLLM
dc.keywordsGPT-4o
dc.keywordsPathology
dc.keywordsData extraction
dc.language.isoeng
dc.publisherOXFORD UNIV PRESS INC
dc.relation.affiliationKoç University
dc.relation.collectionKoç University Institutional Repository
dc.relation.ispartofAmerican Journal of Clinical Pathology
dc.relation.openaccessYes
dc.rightsCC BY-NC-ND (Attribution-NonCommercial-NoDerivs)
dc.rights.urihttps://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subjectPathology
dc.titleSystematic comparison of GPT models for the analysis of pathology reports in a low-resource language: a case study for Turkish
dc.typeJournal Article
dspace.entity.typePublication
person.familyNameBolat
person.familyNameDemir
person.familyNameKulaç
person.familyNameÖzateş
person.givenNameBeyza
person.givenNameÇiğdem Gündüz
person.givenNameİbrahim
person.givenNameMuhammet Nusret
relation.isOrgUnitOfPublicationd02929e1-2a70-44f0-ae17-7819f587bedd
relation.isOrgUnitOfPublication89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isOrgUnitOfPublication77d67233-829b-4c3a-a28f-bd97ab5c12c7
relation.isOrgUnitOfPublication91bbe15d-017f-446b-b102-ce755523d939
relation.isOrgUnitOfPublication3fc31c89-e803-4eb1-af6b-6258bc42c3d8
relation.isOrgUnitOfPublication.latestForDiscoveryd02929e1-2a70-44f0-ae17-7819f587bedd
relation.isParentOrgUnitOfPublication17f2dc8e-6e54-4fa8-b5e0-d6415123a93e
relation.isParentOrgUnitOfPublication8e756b23-2d4a-4ce8-b1b3-62c794a8c164
relation.isParentOrgUnitOfPublication434c9663-2b11-4e66-9399-c863e2ebae43
relation.isParentOrgUnitOfPublication.latestForDiscovery17f2dc8e-6e54-4fa8-b5e0-d6415123a93e
