Classification of patients with chronic disease by activation level using machine learning methods

dc.contributor.authorid0000-0002-9959-6240
dc.contributor.authorid0000-0002-9924-3744
dc.contributor.coauthorDemiray, Onur
dc.contributor.coauthorKulak, Ercan
dc.contributor.coauthorDogan, Emrah
dc.contributor.coauthorKaraketir, Seyma Gorcin
dc.contributor.coauthorCifcili, Serap
dc.contributor.coauthorAkman, Mehmet
dc.contributor.departmentN/A
dc.contributor.departmentDepartment of Business Administration
dc.contributor.kuauthorSakarya, Sibel
dc.contributor.kuauthorGüneş, Evrim Didem
dc.contributor.kuprofileFaculty Member
dc.contributor.kuprofileFaculty Member
dc.contributor.schoolcollegeinstituteSchool of Medicine
dc.contributor.schoolcollegeinstituteCollege of Administrative Sciences and Economics
dc.contributor.yokid172028
dc.contributor.yokid51391
dc.date.accessioned2025-01-19T10:31:15Z
dc.date.issued2023
dc.description.abstractThe Patient Activation Measure (PAM) measures the activation level of patients with chronic conditions and correlates well with patient adherence behavior, health outcomes, and healthcare costs. PAM is increasingly used in practice to identify patients needing more support from the care team. We define PAM levels 1 and 2 as low PAM and investigate the performance of eight machine learning methods (Logistic Regression, Lasso Regression, Ridge Regression, Random Forest, Gradient Boosted Trees, Support Vector Machines, Decision Trees, Neural Networks) to classify patients. Primary data collected from adult patients (n=431) with Diabetes Mellitus (DM) or Hypertension (HT) attending Family Health Centers in Istanbul, Turkey, is used to test the methods. 44.5% of patients in the dataset have a low PAM level. Classification performance with several feature sets was analyzed to understand the relative importance of different types of information and provide insights. The most important features are whether the patient performs self-monitoring, smoking and exercise habits, education, and socio-economic status. The best performance was achieved with the Logistic Regression algorithm, which reached an Area Under the Curve (AUC) of 0.72 with the best-performing feature set. Alternative feature sets with similar prediction performance are also presented. The prediction performance was inferior with an automated feature selection method, supporting the importance of using domain knowledge in machine learning.
dc.description.indexedbyWoS
dc.description.indexedbyScopus
dc.description.indexedbyPubMed
dc.description.issue4
dc.description.publisherscopeInternational
dc.description.sponsorsWe sincerely thank the reviewers for their constructive comments that significantly improved this paper. We are grateful to the AXA Research Fund for the financial support provided through the AXA Award granted to the second author.
dc.description.volume26
dc.identifier.doi10.1007/s10729-023-09653-4
dc.identifier.eissn1572-9389
dc.identifier.issn1386-9620
dc.identifier.quartileQ2
dc.identifier.scopus2-s2.0-85174038437
dc.identifier.urihttps://doi.org/10.1007/s10729-023-09653-4
dc.identifier.urihttps://hdl.handle.net/20.500.14288/26204
dc.identifier.wos1079587100001
dc.keywordsPatient activation
dc.keywordsPatient activation measure
dc.keywordsChronic care
dc.keywordsPrimary care
dc.keywordsMachine learning
dc.keywordsBinary classification
dc.keywordsLogistic regression
dc.keywordsPrediction
dc.languageen
dc.publisherSpringer
dc.relation.grantnoAXA Research Fund; AXA Award
dc.sourceHealth Care Management Science
dc.subjectHealth policy
dc.subjectServices
dc.titleClassification of patients with chronic disease by activation level using machine learning methods
dc.typeJournal Article
