Publication: Training transformer models by wavelet losses improves quantitative and visual performance in single image super-resolution
dc.contributor.department | Department of Electrical and Electronics Engineering | |
dc.contributor.department | Graduate School of Sciences and Engineering | |
dc.contributor.kuauthor | Korkmaz, Cansu | |
dc.contributor.kuauthor | Tekalp, Ahmet Murat | |
dc.contributor.schoolcollegeinstitute | College of Engineering | |
dc.contributor.schoolcollegeinstitute | GRADUATE SCHOOL OF SCIENCES AND ENGINEERING | |
dc.date.accessioned | 2025-03-06T20:58:31Z | |
dc.date.issued | 2024 | |
dc.description.abstract | Transformer-based models have achieved remarkable results in low-level vision tasks including image super-resolution (SR). However, early Transformer-based approaches that rely on self-attention within non-overlapping windows encounter challenges in acquiring global information. To activate more input pixels globally, hybrid attention models have been proposed. Moreover, training by solely minimizing pixel-wise RGB losses, such as l1, has been found inadequate for capturing essential high-frequency details. This paper presents two contributions: i) We introduce convolutional non-local sparse attention (NLSA) blocks to extend the hybrid transformer architecture in order to further enhance its receptive field. ii) We employ wavelet losses to train Transformer models to improve quantitative and subjective performance. While wavelet losses have been explored previously, showing their power in training Transformer-based SR models is novel. Our experimental results demonstrate that the proposed model provides state-of-the-art PSNR results as well as superior visual performance across various benchmark datasets. © 2024 IEEE. | |
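The abstract describes training with wavelet losses rather than plain pixel-wise l1 alone. The sketch below illustrates the general idea only, not the authors' exact formulation: a single-level 2D Haar decomposition with per-subband l1 terms, where the `haar_dwt` and `wavelet_l1_loss` helpers and the `hf_weight` parameter are hypothetical names introduced here for illustration.

```python
import torch
import torch.nn.functional as F


def haar_dwt(x: torch.Tensor):
    """One level of the 2D Haar wavelet transform via fixed conv filters.

    x: (N, C, H, W) with even H and W. Returns the (LL, LH, HL, HH)
    subbands, each of shape (N, C, H/2, W/2).
    """
    ll = torch.tensor([[0.5, 0.5], [0.5, 0.5]])
    lh = torch.tensor([[-0.5, -0.5], [0.5, 0.5]])
    hl = torch.tensor([[-0.5, 0.5], [-0.5, 0.5]])
    hh = torch.tensor([[0.5, -0.5], [-0.5, 0.5]])
    kernels = torch.stack([ll, lh, hl, hh]).unsqueeze(1).to(x)  # (4, 1, 2, 2)
    n, c, h, w = x.shape
    # Apply all four analysis filters channel-by-channel with stride 2.
    out = F.conv2d(x.reshape(n * c, 1, h, w), kernels, stride=2)
    out = out.reshape(n, c, 4, h // 2, w // 2)
    return out[:, :, 0], out[:, :, 1], out[:, :, 2], out[:, :, 3]


def wavelet_l1_loss(sr: torch.Tensor, hr: torch.Tensor,
                    hf_weight: float = 1.0) -> torch.Tensor:
    """l1 loss between wavelet subbands of the SR output and the HR target.

    hf_weight (hypothetical) up-weights the high-frequency LH/HL/HH bands,
    which carry the detail that a plain pixel-wise RGB loss under-penalizes.
    """
    sr_bands = haar_dwt(sr)
    hr_bands = haar_dwt(hr)
    loss = F.l1_loss(sr_bands[0], hr_bands[0])  # low-frequency LL band
    for s, h in zip(sr_bands[1:], hr_bands[1:]):
        loss = loss + hf_weight * F.l1_loss(s, h)
    return loss
```

In a training loop, such a term would typically be combined with (or substituted for) the usual pixel-wise l1 objective, e.g. `loss = wavelet_l1_loss(model(lr), hr)`; the weighting between subbands is a design choice, and the paper itself should be consulted for the actual loss definition.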
dc.description.indexedby | Scopus | |
dc.description.publisherscope | International | |
dc.description.sponsoredbyTubitakEu | N/A | |
dc.identifier.doi | 10.1109/CVPRW63382.2024.00660 | |
dc.identifier.isbn | 9798350365474 | |
dc.identifier.issn | 2160-7508 | |
dc.identifier.quartile | N/A | |
dc.identifier.scopus | 2-s2.0-85192828976 | |
dc.identifier.uri | https://doi.org/10.1109/CVPRW63382.2024.00660 | |
dc.identifier.uri | https://hdl.handle.net/20.500.14288/27483 | |
dc.keywords | Super-resolution | |
dc.keywords | Transformer-based SR | |
dc.keywords | Wavelet loss | |
dc.language.iso | eng | |
dc.publisher | IEEE Computer Society | |
dc.relation.ispartof | IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops | |
dc.subject | Electrical and electronics engineering | |
dc.subject | Computer engineering | |
dc.title | Training transformer models by wavelet losses improves quantitative and visual performance in single image super-resolution | |
dc.type | Conference Proceeding | |
dspace.entity.type | Publication | |
local.contributor.kuauthor | Korkmaz, Cansu | |
local.contributor.kuauthor | Tekalp, Ahmet Murat | |
local.publication.orgunit1 | GRADUATE SCHOOL OF SCIENCES AND ENGINEERING | |
local.publication.orgunit1 | College of Engineering | |
local.publication.orgunit2 | Department of Electrical and Electronics Engineering | |
local.publication.orgunit2 | Graduate School of Sciences and Engineering | |
relation.isOrgUnitOfPublication | 21598063-a7c5-420d-91ba-0cc9b2db0ea0 | |
relation.isOrgUnitOfPublication | 3fc31c89-e803-4eb1-af6b-6258bc42c3d8 | |
relation.isOrgUnitOfPublication.latestForDiscovery | 21598063-a7c5-420d-91ba-0cc9b2db0ea0 | |
relation.isParentOrgUnitOfPublication | 8e756b23-2d4a-4ce8-b1b3-62c794a8c164 | |
relation.isParentOrgUnitOfPublication | 434c9663-2b11-4e66-9399-c863e2ebae43 | |
relation.isParentOrgUnitOfPublication.latestForDiscovery | 8e756b23-2d4a-4ce8-b1b3-62c794a8c164 |