Asymptotic study of in-context learning with random transformers through equivalent models

dc.conference.date	AUG 31-SEP 03, 2025
dc.conference.location	Istanbul
dc.contributor.department	Graduate School of Sciences and Engineering
dc.contributor.department	Department of Electrical and Electronics Engineering
dc.contributor.department	KUIS AI (Koç University & İş Bank Artificial Intelligence Center)
dc.contributor.kuauthor	Demir, Samet
dc.contributor.kuauthor	Doğan, Zafer
dc.contributor.schoolcollegeinstitute	Research Center
dc.contributor.schoolcollegeinstitute	College of Engineering
dc.contributor.schoolcollegeinstitute	Graduate School of Sciences and Engineering
dc.date.accessioned	2025-12-31T08:19:08Z
dc.date.available	2025-12-31
dc.date.issued	2025
dc.description.abstract	We study the in-context learning (ICL) capabilities of pretrained Transformers in the setting of nonlinear regression. Specifically, we focus on a random Transformer with a nonlinear MLP head whose first layer is randomly initialized and fixed while the second layer is trained. Furthermore, we consider an asymptotic regime in which the context length, input dimension, hidden dimension, number of training tasks, and number of training samples grow jointly. In this setting, we show that the random Transformer behaves equivalently to a finite-degree Hermite polynomial model in terms of ICL error. This equivalence is validated through simulations across varying activation functions, context lengths, hidden layer widths (revealing a double-descent phenomenon), and regularization settings. Our results offer theoretical and empirical insights into when and how MLP layers enhance ICL, and how nonlinearity and over-parameterization influence model performance.
dc.description.fulltext	Yes
dc.description.harvestedfrom	Manual
dc.description.indexedby	Scopus
dc.description.publisherscope	International
dc.description.readpublish	N/A
dc.description.sponsoredbyTubitakEu	TÜBİTAK
dc.description.sponsorship	TÜBİTAK ARDEB 1001 program. S.D. is supported by an AI Fellowship provided by KUIS AI Research Center and a PhD Scholarship (BİDEB 2211) from TÜBİTAK.
dc.identifier.doi	10.1109/MLSP62443.2025.11204336
dc.identifier.embargo	No
dc.identifier.grantno	124E063
dc.identifier.isbn	9798331570293
dc.identifier.issn	2161-0363
dc.identifier.quartile	N/A
dc.identifier.uri	https://doi.org/10.1109/MLSP62443.2025.11204336
dc.identifier.uri	https://hdl.handle.net/20.500.14288/31436
dc.keywords	Deep learning theory
dc.keywords	High-dimensional asymptotics
dc.keywords	In-context learning
dc.keywords	Transformer
dc.language.iso	eng
dc.publisher	IEEE Computer Society
dc.relation.affiliation	Koç University
dc.relation.collection	Koç University Institutional Repository
dc.relation.ispartof	IEEE International Workshop on Machine Learning for Signal Processing, MLSP
dc.relation.openaccess	Yes
dc.rights	CC BY-NC-ND (Attribution-NonCommercial-NoDerivs)
dc.rights.uri	https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject	Engineering
dc.title	Asymptotic study of in-context learning with random transformers through equivalent models
dc.type	Conference Proceeding
dspace.entity.type	Publication
person.familyName	Demir
person.familyName	Doğan
person.givenName	Samet
person.givenName	Zafer
relation.isOrgUnitOfPublication	3fc31c89-e803-4eb1-af6b-6258bc42c3d8
relation.isOrgUnitOfPublication	21598063-a7c5-420d-91ba-0cc9b2db0ea0
relation.isOrgUnitOfPublication	77d67233-829b-4c3a-a28f-bd97ab5c12c7
relation.isOrgUnitOfPublication.latestForDiscovery	3fc31c89-e803-4eb1-af6b-6258bc42c3d8
relation.isParentOrgUnitOfPublication	d437580f-9309-4ecb-864a-4af58309d287
relation.isParentOrgUnitOfPublication	8e756b23-2d4a-4ce8-b1b3-62c794a8c164
relation.isParentOrgUnitOfPublication	434c9663-2b11-4e66-9399-c863e2ebae43
relation.isParentOrgUnitOfPublication.latestForDiscovery	d437580f-9309-4ecb-864a-4af58309d287
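
Illustrative note on the abstract: the abstract states that the random Transformer is equivalent, in terms of ICL error, to a finite-degree Hermite polynomial model. This record does not give the paper's construction, so the following is only a minimal sketch of the underlying idea, under the assumption of a standard Gaussian input: it expands an activation function in probabilists' Hermite polynomials and truncates the expansion at a finite degree. The function names (`hermite_coefficients`, `hermite_surrogate`) and the choice of `tanh` and degree 3 are hypothetical and are not taken from the paper.

```python
import math

import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval


def hermite_coefficients(activation, degree, quad_points=60):
    """Coefficients c_k with activation(z) ~ sum_k c_k * He_k(z), z ~ N(0, 1).

    Uses c_k = E[activation(z) * He_k(z)] / k!, where the expectation is
    approximated by Gauss-Hermite quadrature (weight exp(-z^2 / 2)).
    """
    nodes, weights = hermegauss(quad_points)
    # The raw weights sum to sqrt(2*pi); rescale so they act as N(0, 1) weights.
    weights = weights / math.sqrt(2.0 * math.pi)
    values = activation(nodes)
    coeffs = np.empty(degree + 1)
    for k in range(degree + 1):
        basis = np.zeros(k + 1)
        basis[k] = 1.0  # selects He_k in hermeval
        he_k = hermeval(nodes, basis)
        coeffs[k] = np.sum(weights * values * he_k) / math.factorial(k)
    return coeffs


def hermite_surrogate(coeffs):
    """Return the truncated polynomial z -> sum_k coeffs[k] * He_k(z)."""
    return lambda z: hermeval(z, coeffs)


if __name__ == "__main__":
    # Assumed example: degree-3 surrogate of tanh (illustrative choice only).
    coeffs = hermite_coefficients(np.tanh, degree=3)
    surrogate = hermite_surrogate(coeffs)
    z = np.random.default_rng(0).standard_normal(100_000)
    mse = np.mean((np.tanh(z) - surrogate(z)) ** 2)
    print("Hermite coefficients:", coeffs)
    print("Monte Carlo squared error under N(0, 1):", mse)
```

In this sketch the even-order coefficients of `tanh` vanish because the function is odd, so the degree-3 surrogate uses only the He_1 and He_3 terms. How the paper chooses the degree and relates the coefficients to the ICL error is not stated in this record.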

Collections

Publications without Fulltext
