Publication: On the rate of convergence of a classifier based on a transformer encoder
Co-Authors
Gurevych, Iryna
Kohler, Michael
Abstract
Pattern recognition based on a high-dimensional predictor is considered. A classifier is defined which is based on a Transformer encoder. The rate of convergence of the misclassification probability of the classifier towards the optimal misclassification probability is analyzed. It is shown that this classifier is able to circumvent the curse of dimensionality provided the a posteriori probability satisfies a suitable hierarchical composition model. Furthermore, the difference between the Transformer classifiers theoretically analyzed in this paper and the ones used in practice today is illustrated by means of classification problems in natural language processing.
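The quantities named in the abstract can be written out in the standard binary classification setting. This is a sketch only: the abstract fixes no notation, so the symbols $X$, $Y$, $\eta$, $f^*$, $f_n$ and $\mathcal{D}_n$ below, and the label coding $\{-1, 1\}$, are assumptions made for illustration.

```latex
% Assumed notation (not taken from the abstract):
% (X, Y) is a random pair with predictor X and label Y \in \{-1, 1\};
% \mathcal{D}_n is the training sample and f_n the classifier built from it.

% A posteriori probability of class 1:
\eta(x) = \mathbf{P}\{ Y = 1 \mid X = x \}

% Bayes classifier, which attains the optimal misclassification probability:
f^*(x) =
\begin{cases}
  1,  & \text{if } \eta(x) \ge 1/2, \\
  -1, & \text{otherwise.}
\end{cases}

% The difference whose rate of convergence to zero is studied:
\mathbf{P}\{ f_n(X) \ne Y \mid \mathcal{D}_n \}
  - \mathbf{P}\{ f^*(X) \ne Y \} \;\ge\; 0
```

Under this reading, the hierarchical composition model mentioned in the abstract is a structural condition on $\eta$.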
Publisher
Institute of Electrical and Electronics Engineers (IEEE)
Subject
Computer science, Information technology, Information science, Electrical and electronics engineering
Source
IEEE Transactions on Information Theory
DOI
10.1109/TIT.2022.3191747