Publication: Optimal nonlinearities improve generalization performance of random features
dc.contributor.department | Department of Electrical and Electronics Engineering | |
dc.contributor.department | Graduate School of Sciences and Engineering | |
dc.contributor.department | KUIS AI (Koç University & İş Bank Artificial Intelligence Center) | |
dc.contributor.kuauthor | Demir, Samet | |
dc.contributor.kuauthor | Doğan, Zafer | |
dc.contributor.schoolcollegeinstitute | College of Engineering | |
dc.contributor.schoolcollegeinstitute | GRADUATE SCHOOL OF SCIENCES AND ENGINEERING | |
dc.contributor.schoolcollegeinstitute | Research Center | |
dc.date.accessioned | 2024-12-29T09:36:26Z | |
dc.date.issued | 2023 | |
dc.description.abstract | The random feature model with a nonlinear activation function has been shown to be asymptotically equivalent to a Gaussian model in terms of training and generalization errors. Analysis of the equivalent model reveals an important yet not fully understood role played by the activation function. To address this issue, we study the "parameters" of the equivalent model to achieve improved generalization performance for a given supervised learning problem. We show that the parameters acquired from the Gaussian model enable us to define a set of optimal nonlinearities. We provide two example classes from this set: second-order polynomial and piecewise linear functions. These functions are optimized to improve generalization performance regardless of their actual form. We experiment with regression and classification problems, including synthetic and real (e.g., CIFAR10) data. Our numerical results validate that the optimized nonlinearities achieve better generalization performance than widely used nonlinear functions such as ReLU. Furthermore, we illustrate that the proposed nonlinearities also mitigate the so-called double descent phenomenon, i.e., the non-monotonic dependence of generalization performance on the sample size and the model size. | |
dc.description.indexedby | WOS | |
dc.description.indexedby | Scopus | |
dc.description.publisherscope | International | |
dc.description.sponsoredbyTubitakEu | TÜBİTAK | |
dc.description.sponsorship | We acknowledge that this work was supported in part by the TÜBİTAK 2232 International Fellowship for Outstanding Researchers Award (No. 118C337) and an AI Fellowship provided by the Koç University & İş Bank Artificial Intelligence (KUIS AI) Research Center. | |
dc.description.volume | 222 | |
dc.identifier.issn | 2640-3498 | |
dc.identifier.quartile | N/A | |
dc.identifier.scopus | 2-s2.0-85189637296 | |
dc.identifier.uri | https://hdl.handle.net/20.500.14288/22063 | |
dc.identifier.wos | 1221095300017 | |
dc.keywords | Random feature model | |
dc.keywords | Generalization performance | |
dc.keywords | Activation functions | |
dc.keywords | Gaussian equivalence conjecture | |
dc.keywords | Universality | |
dc.keywords | Double descent phenomenon | |
dc.language.iso | eng | |
dc.publisher | JMLR-Journal of Machine Learning Research | |
dc.relation.ispartof | Asian Conference on Machine Learning Vol 222 | |
dc.subject | Computer science, artificial intelligence | |
dc.subject | Computer science, theory and methods | |
dc.subject | Statistics and probability | |
dc.title | Optimal nonlinearities improve generalization performance of random features | |
dc.type | Conference Proceeding | |
dspace.entity.type | Publication | |
local.contributor.kuauthor | Demir, Samet | |
local.contributor.kuauthor | Doğan, Zafer | |
local.publication.orgunit1 | College of Engineering | |
local.publication.orgunit1 | GRADUATE SCHOOL OF SCIENCES AND ENGINEERING | |
local.publication.orgunit1 | Research Center | |
local.publication.orgunit2 | Department of Electrical and Electronics Engineering | |
local.publication.orgunit2 | KUIS AI (Koç University & İş Bank Artificial Intelligence Center) | |
local.publication.orgunit2 | Graduate School of Sciences and Engineering | |
relation.isOrgUnitOfPublication | 21598063-a7c5-420d-91ba-0cc9b2db0ea0 | |
relation.isOrgUnitOfPublication | 3fc31c89-e803-4eb1-af6b-6258bc42c3d8 | |
relation.isOrgUnitOfPublication | 77d67233-829b-4c3a-a28f-bd97ab5c12c7 | |
relation.isOrgUnitOfPublication.latestForDiscovery | 21598063-a7c5-420d-91ba-0cc9b2db0ea0 | |
relation.isParentOrgUnitOfPublication | 8e756b23-2d4a-4ce8-b1b3-62c794a8c164 | |
relation.isParentOrgUnitOfPublication | 434c9663-2b11-4e66-9399-c863e2ebae43 | |
relation.isParentOrgUnitOfPublication | d437580f-9309-4ecb-864a-4af58309d287 | |
relation.isParentOrgUnitOfPublication.latestForDiscovery | 8e756b23-2d4a-4ce8-b1b3-62c794a8c164 |