Publication:
On the rate of convergence of a classifier based on a transformer encoder

dc.contributor.coauthor: Gurevych, Iryna
dc.contributor.coauthor: Kohler, Michael
dc.contributor.department: Department of Computer Engineering
dc.contributor.kuauthor: Şahin, Gözde Gül
dc.contributor.kuprofile: Faculty Member
dc.contributor.other: Department of Computer Engineering
dc.contributor.schoolcollegeinstitute: College of Engineering
dc.contributor.yokid: 366984
dc.date.accessioned: 2024-11-09T13:44:43Z
dc.date.issued: 2022
dc.description.abstract: Pattern recognition based on a high-dimensional predictor is considered. A classifier is defined which is based on a Transformer encoder. The rate of convergence of the misclassification probability of the classifier towards the optimal misclassification probability is analyzed. It is shown that this classifier is able to circumvent the curse of dimensionality provided the a posteriori probability satisfies a suitable hierarchical composition model. Furthermore, the difference between the Transformer classifiers theoretically analyzed in this paper and the ones used in practice today is illustrated by means of classification problems in natural language processing.
dc.description.fulltext: YES
dc.description.indexedby: WoS
dc.description.indexedby: Scopus
dc.description.issue: 12
dc.description.openaccess: YES
dc.description.publisherscope: International
dc.description.sponsoredbyTubitakEu: N/A
dc.description.sponsorship: N/A
dc.description.version: Author's final manuscript
dc.description.volume: 68
dc.format: pdf
dc.identifier.doi: 10.1109/tit.2022.3191747
dc.identifier.eissn: 1557-9654
dc.identifier.embargo: NO
dc.identifier.filenameinventoryno: IR03951
dc.identifier.issn: 0018-9448
dc.identifier.link: https://doi.org/10.1109/tit.2022.3191747
dc.identifier.quartile: Q3
dc.identifier.scopus: 2-s2.0-85135219411
dc.identifier.uri: https://hdl.handle.net/20.500.14288/3526
dc.identifier.wos: 891796100027
dc.keywords: Curse of dimensionality
dc.keywords: Transformer
dc.keywords: Classification
dc.keywords: Rate of convergence
dc.language: English
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE)
dc.relation.grantno: NA
dc.relation.uri: http://cdm21054.contentdm.oclc.org/cdm/ref/collection/IR/id/10818
dc.source: IEEE Transactions on Information Theory
dc.subject: Computer science
dc.subject: Engineering
dc.subject: Information systems
dc.title: On the rate of convergence of a classifier based on a transformer encoder
dc.type: Journal Article
dspace.entity.type: Publication
local.contributor.authorid: 0000-0002-0332-1657
local.contributor.kuauthor: Şahin, Gözde Gül
relation.isOrgUnitOfPublication: 89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isOrgUnitOfPublication.latestForDiscovery: 89352e43-bf09-4ef4-82f6-6f9d0174ebae

Files

Original bundle

Name: 10818.pdf
Size: 448.9 KB
Format: Adobe Portable Document Format