Hasso-Plattner-Institut
Prof. Dr. Tilmann Rabl
 

Julian Eisenschlos

Affiliation: Google DeepMind
Title: Visual language: How generating drives understanding

 

Abstract

Large amounts of content, both online and offline, rely on structure to organize and communicate information more effectively. While natural image understanding and generation have been studied extensively, visually situated language, such as tables, charts, plots, and infographics, continues to be a challenge for models large and small. In this talk, we will show how teaching models to generate visually situated language can improve downstream reading and reasoning on this data modality for tasks such as question answering, entailment, and summarization.