Publication:
End to end rate distortion optimized learned hierarchical bi-directional video compression

dc.contributor.department: Department of Electrical and Electronics Engineering
dc.contributor.kuauthor: Tekalp, Ahmet Murat
dc.contributor.kuauthor: Yılmaz, Mustafa Akın
dc.contributor.kuprofile: Faculty Member
dc.contributor.other: Department of Electrical and Electronics Engineering
dc.contributor.schoolcollegeinstitute: College of Engineering
dc.contributor.yokid: 26207
dc.contributor.yokid: N/A
dc.date.accessioned: 2024-11-09T13:25:51Z
dc.date.issued: 2022
dc.description.abstract: Conventional video compression (VC) methods are based on motion compensated transform coding, and the steps of motion estimation, mode and quantization parameter selection, and entropy coding are optimized individually due to the combinatorial nature of the end-to-end optimization problem. Learned VC allows end-to-end rate-distortion (R-D) optimized training of the nonlinear transform, motion, and entropy models simultaneously. Most works on learned VC consider end-to-end optimization of a sequential video codec based on R-D loss averaged over pairs of successive frames. It is well-known in conventional VC that hierarchical, bi-directional coding outperforms sequential compression because of its ability to use both past and future reference frames. This paper proposes a learned hierarchical bi-directional video codec (LHBDC) that combines the benefits of hierarchical motion-compensated prediction and end-to-end optimization. Experimental results show that we achieve the best R-D results reported to date for learned VC schemes in both PSNR and MS-SSIM. Compared to conventional video codecs, our end-to-end optimized codec outperforms both the x265 and SVT-HEVC encoders ("veryslow" preset) in PSNR and MS-SSIM, as well as the HM 16.23 reference software in MS-SSIM. We present ablation studies showing performance gains due to the proposed novel tools such as learned masking, flow-field subsampling, and temporal flow vector prediction. The models and instructions to reproduce our results can be found at https://github.com/makinyilmaz/LHBDC/.
dc.description.fulltext: YES
dc.description.indexedby: WoS
dc.description.indexedby: Scopus
dc.description.indexedby: PubMed
dc.description.openaccess: YES
dc.description.publisherscope: International
dc.description.sponsoredbyTubitakEu: TÜBİTAK
dc.description.sponsorship: Scientific and Technological Research Council of Turkey (TÜBİTAK)
dc.description.sponsorship: Turkish Academy of Sciences (TÜBA)
dc.description.version: Author's final manuscript
dc.description.volume: 31
dc.format: pdf
dc.identifier.doi: 10.1109/TIP.2021.3138300
dc.identifier.eissn: 1941-0042
dc.identifier.embargo: NO
dc.identifier.filenameinventoryno: IR03476
dc.identifier.issn: 1057-7149
dc.identifier.link: https://doi.org/10.1109/TIP.2021.3138300
dc.identifier.quartile: Q1
dc.identifier.scopus: 2-s2.0-85122586990
dc.identifier.uri: https://hdl.handle.net/20.500.14288/3463
dc.identifier.wos: 739998500007
dc.keywords: Bidirectional control
dc.keywords: Image coding
dc.keywords: Video compression
dc.keywords: Motion compensation
dc.keywords: Optimization
dc.keywords: Entropy
dc.keywords: Video codecs
dc.keywords: Learned video compression
dc.keywords: Learned bi-directional motion compensation
dc.keywords: Flow field sub-sampling
dc.keywords: Flow vector prediction
dc.keywords: End-to-end optimization
dc.language: English
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE)
dc.relation.grantno: 2.17E+35
dc.relation.uri: http://cdm21054.contentdm.oclc.org/cdm/ref/collection/IR/id/10269
dc.source: IEEE Transactions on Image Processing
dc.subject: Computer science
dc.subject: Artificial intelligence
dc.subject: Engineering, electrical and electronic
dc.title: End to end rate distortion optimized learned hierarchical bi-directional video compression
dc.type: Journal Article
dspace.entity.type: Publication
local.contributor.authorid: 0000-0003-1465-8121
local.contributor.authorid: N/A
local.contributor.kuauthor: Tekalp, Ahmet Murat
local.contributor.kuauthor: Yılmaz, Mustafa Akın
relation.isOrgUnitOfPublication: 21598063-a7c5-420d-91ba-0cc9b2db0ea0
relation.isOrgUnitOfPublication.latestForDiscovery: 21598063-a7c5-420d-91ba-0cc9b2db0ea0

Files

Original bundle

Name:
10269.pdf
Size:
1.55 MB
Format:
Adobe Portable Document Format