Publication:
Classification of imbalanced data with a geometric digraph family

dc.contributor.department: Department of Mathematics
dc.contributor.kuauthor: Manukyan, Artur
dc.contributor.kuauthor: Ceyhan, Elvan
dc.contributor.kuprofile: PhD Student
dc.contributor.kuprofile: Undergraduate Student
dc.contributor.kuprofile: Faculty Member
dc.contributor.other: Department of Mathematics
dc.contributor.schoolcollegeinstitute: Graduate School of Sciences and Engineering
dc.contributor.schoolcollegeinstitute: College of Sciences
dc.date.accessioned: 2024-11-09T12:26:35Z
dc.date.issued: 2016
dc.description.abstract: We use a geometric digraph family called class cover catch digraphs (CCCDs) to tackle the class imbalance problem in statistical classification. CCCDs provide graph theoretic solutions to the class cover problem and have been employed in classification. We assess the classification performance of CCCD classifiers by extensive Monte Carlo simulations, comparing them with other classifiers commonly used in the literature. In particular, we show that CCCD classifiers perform relatively well when one class is more frequent than the other in a two-class setting, an example of the class imbalance problem. We also point out the relationship between class imbalance and class overlapping problems, and their influence on the performance of CCCD classifiers and other classification methods as well as some state-of-the-art algorithms which are robust to class imbalance by construction. Experiments on both simulated and real data sets indicate that CCCD classifiers are robust to the class imbalance problem. CCCDs substantially undersample from the majority class while preserving the information on the discarded points during the undersampling process. Many state-of-the-art methods, however, keep this information by means of ensemble classifiers, but CCCDs yield only a single classifier with the same property, making it both appealing and fast.
dc.description.fulltext: YES
dc.description.indexedby: WoS
dc.description.indexedby: Scopus
dc.description.openaccess: YES
dc.description.publisherscope: International
dc.description.sponsoredbyTubitakEu: EU
dc.description.sponsorship: European Commission under the Marie Curie International Outgoing Fellowship Programme
dc.description.version: Publisher version
dc.description.volume: 17
dc.format: pdf
dc.identifier.eissn: 1533-7928
dc.identifier.embargo: NO
dc.identifier.filenameinventoryno: IR00502
dc.identifier.issn: 1532-4435
dc.identifier.quartile: N/A
dc.identifier.scopus: 2-s2.0-84995460924
dc.identifier.uri: https://hdl.handle.net/20.500.14288/1694
dc.identifier.wos: 391826000001
dc.keywords: Class cover catch digraphs
dc.keywords: Class cover problem
dc.keywords: Class imbalance problem
dc.keywords: Class overlapping problem
dc.keywords: Graph domination
dc.keywords: Prototype selection
dc.keywords: Support estimation
dc.language: English
dc.publisher: Journal of Machine Learning Research (JMLR)
dc.relation.grantno: 329370
dc.relation.uri: http://cdm21054.contentdm.oclc.org/cdm/ref/collection/IR/id/566
dc.source: Journal of Machine Learning Research
dc.subject: Computer science
dc.subject: Automation and control systems
dc.subject: Artificial intelligence
dc.title: Classification of imbalanced data with a geometric digraph family
dc.type: Journal Article
dspace.entity.type: Publication
local.contributor.kuauthor: Manukyan, Artur
local.contributor.kuauthor: Ceyhan, Elvan
relation.isOrgUnitOfPublication: 2159b841-6c2d-4f54-b1d4-b6ba86edfdbe
relation.isOrgUnitOfPublication.latestForDiscovery: 2159b841-6c2d-4f54-b1d4-b6ba86edfdbe

Files

Original bundle

Name:
566.pdf
Size:
934.04 KB
Format:
Adobe Portable Document Format