Publication:
Fast and interpretable genomic data analysis using multiple approximate kernel learning

dc.contributor.coauthor: Ak, Ciğdem
dc.contributor.department: Department of Industrial Engineering
dc.contributor.kuauthor: Gönen, Mehmet
dc.contributor.kuauthor: Bektaş, Ayyüce Begüm
dc.contributor.kuprofile: Faculty Member
dc.contributor.other: Department of Industrial Engineering
dc.contributor.schoolcollegeinstitute: School of Medicine
dc.contributor.schoolcollegeinstitute: College of Engineering
dc.contributor.schoolcollegeinstitute: Graduate School of Sciences and Engineering
dc.contributor.yokid: 237468
dc.contributor.yokid: N/A
dc.date.accessioned: 2024-11-09T11:43:13Z
dc.date.issued: 2022
dc.description.abstract: Motivation: Dataset sizes in computational biology have increased drastically with the help of improved data collection tools and growing patient cohorts. Previous kernel-based machine learning algorithms proposed for increased interpretability started to fail with large sample sizes, owing to their lack of scalability. To overcome this problem, we proposed a fast and efficient multiple kernel learning (MKL) algorithm, intended particularly for large-scale data, that integrates kernel approximation and group Lasso formulations into a conjoint model. Our method extracts significant and meaningful information from genomic data while conjointly learning a model for out-of-sample prediction. It scales with increasing sample size by approximating distinct kernel matrices instead of calculating them. Results: To test our computational framework, namely Multiple Approximate Kernel Learning (MAKL), we ran experiments on three cancer datasets and showed that MAKL is capable of outperforming the baseline algorithm while using only a small fraction of the input features. We also reported selection frequencies of approximated kernel matrices associated with feature subsets (i.e. gene sets/pathways), which helps to show their relevance for the given classification task. Our fast and interpretable MKL algorithm, which produces sparse solutions, is promising for computational biology applications given its scalability and the highly correlated structure of genomic datasets, and it can be used to discover new biomarkers and new therapeutic guidelines.
dc.description.fulltext: YES
dc.description.indexedby: WoS
dc.description.indexedby: Scopus
dc.description.indexedby: PubMed
dc.description.issue: Sup-1
dc.description.openaccess: YES
dc.description.publisherscope: International
dc.description.sponsoredbyTubitakEu: TÜBİTAK
dc.description.sponsorship: Scientific and Technological Research Council of Turkey (TÜBİTAK)
dc.description.sponsorship: Turkish Academy of Sciences (TÜBA) Young Outstanding Researcher Support Programme (GEBIP)
dc.description.sponsorship: Science Academy of Turkey (BAGEP The Young Scientist Award Program)
dc.description.version: Publisher version
dc.description.volume: 38
dc.format: pdf
dc.identifier.doi: 10.1093/bioinformatics/btac241
dc.identifier.eissn: 1460-2059
dc.identifier.embargo: NO
dc.identifier.filenameinventoryno: IR03781
dc.identifier.issn: 1367-4803
dc.identifier.link: https://doi.org/10.1093/bioinformatics/btac241
dc.identifier.quartile: Q1
dc.identifier.scopus: 2-s2.0-85133882826
dc.identifier.uri: https://hdl.handle.net/20.500.14288/309
dc.identifier.wos: 817250400014
dc.keywords: Varying coefficient model
dc.keywords: Quantile regression
dc.keywords: High-dimensional
dc.language: English
dc.publisher: Oxford University Press (OUP)
dc.relation.grantno: EEEAG 117E181
dc.relation.uri: http://cdm21054.contentdm.oclc.org/cdm/ref/collection/IR/id/10639
dc.source: Bioinformatics
dc.subject: Biochemical research methods
dc.subject: Biotechnology and applied microbiology
dc.subject: Computer science, interdisciplinary applications
dc.subject: Mathematical and computational biology
dc.subject: Statistics and probability
dc.title: Fast and interpretable genomic data analysis using multiple approximate kernel learning
dc.type: Journal Article
dspace.entity.type: Publication
local.contributor.authorid: 0000-0002-2483-075X
local.contributor.authorid: N/A
local.contributor.kuauthor: Gönen, Mehmet
local.contributor.kuauthor: Bektaş, Ayyüce Begüm
relation.isOrgUnitOfPublication: d6d00f52-d22d-4653-99e7-863efcd47b4a
relation.isOrgUnitOfPublication.latestForDiscovery: d6d00f52-d22d-4653-99e7-863efcd47b4a

Files

Original bundle

Name: 10639.pdf
Size: 530.6 KB
Format: Adobe Portable Document Format