Abstract: In this talk, I will first discuss deep learning models that can find semantically meaningful representations of words, learn to read documents, and answer questions about their content. I will introduce methods that augment neural representations of text with structured data from Knowledge Bases (KBs) for question answering, and show how we can answer complex compositional questions over long structured documents by using a text corpus as a virtual KB. In the second part of the talk, I will show how we can design modular hierarchical reinforcement learning agents for visual navigation. These agents can handle multi-modal inputs, perform tasks specified by natural language instructions, explore efficiently, plan over the long term, and build and use 3D semantic maps to learn both action and perception models in a self-supervised manner, while generalizing across domains and tasks.