Leveraging auxiliary image descriptions for dense video captioning

Departments

Department of Computer Engineering

School / College / Institute

College of Engineering

KU-Authors

Erdem, Aykut

Co-Authors

Boran, Emre

İkizler-Cinbiş, Nazlı

Erdem, Erkut

Madhyastha, Pranava

Specia, Lucia

Publication Date

2021

Type

Journal Article

Abstract

Collecting textual descriptions is an especially costly task for dense video captioning, since each event in the video needs to be annotated separately and a long descriptive paragraph needs to be provided. In this paper, we investigate a way to mitigate this heavy burden and propose to leverage captions of visually similar images as auxiliary context. Our model successfully fetches visually relevant images and combines noun and verb phrases from their captions to generate coherent descriptions. To this end, we use a generator and discriminator design, together with an attention-based fusion technique, to incorporate image captions as context in the video caption generation process. The experiments on the challenging ActivityNet Captions dataset demonstrate that our proposed approach achieves more accurate and more diverse video descriptions compared to the strong baseline using METEOR, BLEU and CIDEr-D metrics and qualitative evaluations. (c) 2021 Published by Elsevier B.V.

Publisher

Elsevier

Subject

Computer science, Artificial intelligence

Source

Pattern Recognition Letters

DOI

10.1016/j.patrec.2021.02.009

URI

https://doi.org/10.1016/j.patrec.2021.02.009
https://hdl.handle.net/20.500.14288/6881
