Publication: Effect of architectures and training methods on the performance of learned video frame prediction
dc.contributor.department | N/A | |
dc.contributor.department | Department of Electrical and Electronics Engineering | |
dc.contributor.kuauthor | Yılmaz, Mustafa Akın | |
dc.contributor.kuauthor | Tekalp, Ahmet Murat | |
dc.contributor.kuprofile | PhD Student | |
dc.contributor.kuprofile | Faculty Member | |
dc.contributor.other | Department of Electrical and Electronics Engineering | |
dc.contributor.schoolcollegeinstitute | Graduate School of Sciences and Engineering | |
dc.contributor.schoolcollegeinstitute | College of Engineering | |
dc.contributor.yokid | N/A | |
dc.contributor.yokid | 26207 | |
dc.date.accessioned | 2024-11-09T23:14:39Z | |
dc.date.issued | 2019 | |
dc.description.abstract | We analyze the performance of feedforward vs. recurrent neural network (RNN) architectures and associated training methods for learned frame prediction. To this effect, we trained a residual fully convolutional neural network (FCNN), a convolutional RNN (CRNN), and a convolutional long short-term memory (CLSTM) network for next frame prediction using the mean square loss. We performed both stateless and stateful training for recurrent networks. Experimental results show that the residual FCNN architecture performs the best in terms of peak signal-to-noise ratio (PSNR) at the expense of higher training and test (inference) computational complexity. The CRNN can be trained stably and very efficiently using the stateful truncated backpropagation through time procedure, and it requires an order of magnitude less inference runtime to achieve near real-time frame prediction with an acceptable performance. | |
dc.description.indexedby | WoS | |
dc.description.indexedby | Scopus | |
dc.description.openaccess | YES | |
dc.description.publisherscope | International | |
dc.description.sponsoredbyTubitakEu | TÜBİTAK | |
dc.description.sponsorship | TUBITAK[217E033] | |
dc.description.sponsorship | Turkish Academy of Sciences (TUBA). This work was supported by TUBITAK project 217E033. A. Murat Tekalp also acknowledges support from the Turkish Academy of Sciences (TUBA). | |
dc.identifier.doi | N/A | |
dc.identifier.isbn | 978-1-5386-6249-6 | |
dc.identifier.issn | 1522-4880 | |
dc.identifier.quartile | N/A | |
dc.identifier.scopus | 2-s2.0-85076821510 | |
dc.identifier.uri | https://hdl.handle.net/20.500.14288/10180 | |
dc.identifier.wos | 521828604061 | |
dc.keywords | Frame prediction | |
dc.keywords | Deep learning | |
dc.keywords | Recurrent neural networks | |
dc.keywords | Stateful training | |
dc.keywords | Convolutional neural networks | |
dc.language | English | |
dc.publisher | IEEE | |
dc.source | 2019 IEEE International Conference on Image Processing (ICIP) | |
dc.subject | Diagnostic imaging | |
dc.subject | Photography | |
dc.title | Effect of architectures and training methods on the performance of learned video frame prediction | |
dc.type | Conference proceeding | |
dspace.entity.type | Publication | |
local.contributor.authorid | 0000-0002-0795-8970 | |
local.contributor.authorid | 0000-0003-1465-8121 | |
local.contributor.kuauthor | Yılmaz, Mustafa Akın | |
local.contributor.kuauthor | Tekalp, Ahmet Murat | |
relation.isOrgUnitOfPublication | 21598063-a7c5-420d-91ba-0cc9b2db0ea0 | |
relation.isOrgUnitOfPublication.latestForDiscovery | 21598063-a7c5-420d-91ba-0cc9b2db0ea0 |