Publication: Evaluation of biostatistics contents in ChatGPT: a descriptive study
KU-Authors
Baygül, Arzu Eden
Veznikli, Mert
Abstract
This study evaluates the reliability and quality of ChatGPT's responses in the context of biostatistics, to inform researchers and clinicians about the advantages and limitations of using ChatGPT for biostatistical information. The study does not extensively assess advanced biostatistical methods; it focuses on the question: "Can researchers/clinicians dependably and effortlessly use ChatGPT?" ChatGPT was presented with Frequently Asked Questions (FAQ) in biostatistics, and its responses to 20 questions were blindly evaluated by three biostatisticians holding PhDs. Responses were rated on a reliability score (1 to 7) and the Global Quality Scale (GQS, 1 to 5); readability was measured with the Flesch Reading Ease Score (FRES), and inter-rater agreement with the Intraclass Correlation Coefficient (ICC). Moderate ICC values were observed between raters for reliability (0.646) and GQS (0.545), with a significant correlation between the reliability score and GQS (r = 0.708; p < 0.001). ChatGPT provided reliable, high-quality content in response to biostatistics FAQs, but it cannot replace biostatistics experts. The content was generally difficult to read (FRES: 17.2 ± 12.04). ChatGPT shows promise as a supplementary tool for accessing biostatistics information but should be used alongside human expertise. Future research could explore ways to enhance its readability and compare its performance with alternative sources.
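For readers unfamiliar with the readability measure named in the abstract, the following is a minimal sketch of the standard Flesch Reading Ease formula, 206.835 - 1.015 × (words / sentences) - 84.6 × (syllables / words). It is not the authors' procedure: the abstract does not say which tool they used, and the function names and the vowel-group syllable counter below are illustrative assumptions that only approximate real syllable counts. On the conventional Flesch scale, scores below 30 are read as very difficult text, which is consistent with the reported mean of 17.2.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not a dictionary lookup):
    count runs of consecutive vowels, with a minimum of one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def fres_from_counts(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch Reading Ease formula applied to raw counts."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def fres(text: str) -> float:
    """Approximate FRES for English text using naive tokenization."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return fres_from_counts(len(words), len(sentences), syllables)


if __name__ == "__main__":
    sample = "The cat sat. The dog ran."
    print(round(fres(sample), 2))
```

Because the syllable counter is heuristic, scores from this sketch can differ from those of established readability tools, especially on technical vocabulary.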
Source:
Proceedings of the International Conference on Statistics
Publisher:
Avestia Publishing
Subject
Artificial intelligence, Statistical analysis