Publication:
Visually grounded language learning for robot navigation

Embargo Status

NO

Abstract

We present an end-to-end deep learning model for robot navigation from raw visual pixel input and natural-language text instructions. The proposed model is an LSTM-based sequence-to-sequence neural network architecture with attention, trained on instruction-perception data samples collected in a synthetic environment. We conduct experiments on the SAIL dataset, which we reconstruct in 3D in order to generate the 2D images associated with the data. Our experiments show that the performance of our model is on a par with the state of the art, even though it learns navigational language end to end from raw visual data.
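
The abstract describes the architecture only at a high level. As a rough, hypothetical sketch in PyTorch (not the authors' implementation), the code below shows one way an instruction encoder, a CNN over raw pixels, and an attentive LSTM action decoder could be wired together; all module names, layer sizes, and the four-way action space are assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class InstructionEncoder(nn.Module):
    """Encodes a tokenized natural-language instruction with an LSTM."""
    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)

    def forward(self, tokens):                # tokens: (B, T) integer ids
        embedded = self.embed(tokens)         # (B, T, E)
        outputs, (h, c) = self.lstm(embedded)
        return outputs, (h, c)                # outputs: (B, T, H)

class VisualEncoder(nn.Module):
    """A small CNN mapping raw RGB pixel observations to a feature vector."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(32, feat_dim)

    def forward(self, images):                # images: (B, 3, H, W)
        x = self.conv(images).flatten(1)      # (B, 32)
        return self.fc(x)                     # (B, feat_dim)

class NavigationDecoder(nn.Module):
    """LSTM decoder that attends over the encoded instruction at each step
    and combines the attended context with the current visual feature to
    predict the next navigation action."""
    def __init__(self, hidden_dim=128, visual_dim=128, num_actions=4):
        super().__init__()
        self.cell = nn.LSTMCell(hidden_dim + visual_dim, hidden_dim)
        self.attn = nn.Linear(hidden_dim, hidden_dim)
        self.action_head = nn.Linear(hidden_dim, num_actions)

    def forward(self, enc_outputs, state, visual_feat):
        h, c = state
        # Dot-product attention over the instruction tokens.
        scores = torch.bmm(enc_outputs, self.attn(h).unsqueeze(2))  # (B, T, 1)
        weights = F.softmax(scores, dim=1)
        context = (weights * enc_outputs).sum(dim=1)                # (B, H)
        h, c = self.cell(torch.cat([context, visual_feat], dim=1), (h, c))
        return self.action_head(h), (h, c)                          # action logits

# Hypothetical single decoding step:
#   enc_out, (h, c) = encoder(tokens)           # tokens: (B, T) long tensor
#   state = (h.squeeze(0), c.squeeze(0))         # LSTMCell expects (B, H)
#   logits, state = decoder(enc_out, state, visual_encoder(images))

In such a setup the decoder would be unrolled over the navigation episode, re-attending to the instruction and re-encoding the current pixel observation at every step; the exact training details in the paper may differ.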

Publisher

Association for Computing Machinery (ACM)

Subject

Computer engineering

Source

MULEA '19: 1st International Workshop on Multimodal Understanding and Learning for Embodied Applications

DOI

10.1145/3347450.3357655
