Publication:
Training transformer models by wavelet losses improves quantitative and visual performance in single image super-resolution

dc.contributor.departmentDepartment of Electrical and Electronics Engineering
dc.contributor.departmentGraduate School of Sciences and Engineering
dc.contributor.kuauthorKorkmaz, Cansu
dc.contributor.kuauthorTekalp, Ahmet Murat
dc.contributor.schoolcollegeinstituteCollege of Engineering
dc.contributor.schoolcollegeinstituteGRADUATE SCHOOL OF SCIENCES AND ENGINEERING
dc.date.accessioned2025-03-06T20:58:31Z
dc.date.issued2024
dc.description.abstractTransformer-based models have achieved remarkable results in low-level vision tasks, including image super-resolution (SR). However, early Transformer-based approaches that rely on self-attention within non-overlapping windows encounter challenges in acquiring global information. To activate more input pixels globally, hybrid attention models have been proposed. Moreover, training by solely minimizing pixel-wise RGB losses, such as l1, has been found inadequate for capturing essential high-frequency details. This paper presents two contributions: i) we introduce convolutional non-local sparse attention (NLSA) blocks to extend the hybrid transformer architecture in order to further enhance its receptive field; ii) we employ wavelet losses to train Transformer models to improve quantitative and subjective performance. While wavelet losses have been explored previously, demonstrating their effectiveness for training Transformer-based SR models is novel. Our experimental results demonstrate that the proposed model provides state-of-the-art PSNR results as well as superior visual performance across various benchmark datasets. © 2024 IEEE.
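
A minimal sketch of the kind of wavelet loss the abstract describes, assuming a single-level Haar decomposition in PyTorch. The record does not specify the paper's wavelet family, decomposition depth, or subband weighting, so the names below (haar_dwt, wavelet_l1_loss) and all parameter choices are illustrative, not the authors' implementation.

```python
import torch
import torch.nn.functional as F


def haar_dwt(x: torch.Tensor) -> torch.Tensor:
    """Single-level 2D Haar DWT of a (N, C, H, W) batch.

    Returns the four subbands (LL, LH, HL, HH) stacked along the
    channel axis, each at half the spatial resolution.
    """
    ll = torch.tensor([[0.5, 0.5], [0.5, 0.5]])
    lh = torch.tensor([[-0.5, -0.5], [0.5, 0.5]])
    hl = torch.tensor([[-0.5, 0.5], [-0.5, 0.5]])
    hh = torch.tensor([[0.5, -0.5], [-0.5, 0.5]])
    # (4, 1, 2, 2) filter bank, moved to the input's device/dtype.
    kernels = torch.stack([ll, lh, hl, hh]).unsqueeze(1).to(x)
    n, c, h, w = x.shape
    # Apply the same filter bank to every channel independently.
    x = x.reshape(n * c, 1, h, w)
    bands = F.conv2d(x, kernels, stride=2)  # (N*C, 4, H/2, W/2)
    return bands.reshape(n, c * 4, *bands.shape[-2:])


def wavelet_l1_loss(sr: torch.Tensor, hr: torch.Tensor) -> torch.Tensor:
    """l1 distance between the wavelet subbands of SR output and HR target."""
    return F.l1_loss(haar_dwt(sr), haar_dwt(hr))
```

In training, such a term would typically be added to (or weighted against) the plain pixel-wise RGB loss, e.g. `loss = F.l1_loss(sr, hr) + lam * wavelet_l1_loss(sr, hr)`, so that errors in high-frequency subbands are penalized explicitly.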
dc.description.indexedbyScopus
dc.description.publisherscopeInternational
dc.description.sponsoredbyTubitakEuN/A
dc.identifier.doi10.1109/CVPRW63382.2024.00660
dc.identifier.isbn9798350365474
dc.identifier.issn2160-7508
dc.identifier.quartileN/A
dc.identifier.scopus2-s2.0-85192828976
dc.identifier.urihttps://doi.org/10.1109/CVPRW63382.2024.00660
dc.identifier.urihttps://hdl.handle.net/20.500.14288/27483
dc.keywordsSuper-resolution
dc.keywordsTransformer-based SR
dc.keywordsWavelet loss
dc.language.isoeng
dc.publisherIEEE Computer Society
dc.relation.ispartofIEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops
dc.subjectElectrical and electronics engineering
dc.subjectComputer engineering
dc.titleTraining transformer models by wavelet losses improves quantitative and visual performance in single image super-resolution
dc.typeConference Proceeding
dspace.entity.typePublication
local.contributor.kuauthorKorkmaz, Cansu
local.contributor.kuauthorTekalp, Ahmet Murat
local.publication.orgunit1GRADUATE SCHOOL OF SCIENCES AND ENGINEERING
local.publication.orgunit1College of Engineering
local.publication.orgunit2Department of Electrical and Electronics Engineering
local.publication.orgunit2Graduate School of Sciences and Engineering
relation.isOrgUnitOfPublication21598063-a7c5-420d-91ba-0cc9b2db0ea0
relation.isOrgUnitOfPublication3fc31c89-e803-4eb1-af6b-6258bc42c3d8
relation.isOrgUnitOfPublication.latestForDiscovery21598063-a7c5-420d-91ba-0cc9b2db0ea0
relation.isParentOrgUnitOfPublication8e756b23-2d4a-4ce8-b1b3-62c794a8c164
relation.isParentOrgUnitOfPublication434c9663-2b11-4e66-9399-c863e2ebae43
relation.isParentOrgUnitOfPublication.latestForDiscovery8e756b23-2d4a-4ce8-b1b3-62c794a8c164

Files

Original bundle

Name: IR05733.pdf
Size: 695.08 KB
Format: Adobe Portable Document Format