Publication:
On the rate of convergence of a classifier based on a transformer encoder

Co-Authors

Gurevych, Iryna
Kohler, Michael

Abstract

Pattern recognition based on a high-dimensional predictor is considered. A classifier is defined which is based on a Transformer encoder. The rate of convergence of the misclassification probability of the classifier towards the optimal misclassification probability is analyzed. It is shown that this classifier is able to circumvent the curse of dimensionality provided the a posteriori probability satisfies a suitable hierarchical composition model. Furthermore, the difference between the Transformer classifiers theoretically analyzed in this paper and the ones used in practice today is illustrated by means of classification problems in natural language processing.

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Subject

Computer science, Information technology, Information science, Electrical and electronics engineering

Source

IEEE Transactions on Information Theory

DOI

10.1109/TIT.2022.3191747
