Hasso-Plattner-Institut
Prof. Dr. Tilmann Rabl
 

Besmira Nushi

Affiliation: Microsoft
Title: A Constraint-Satisfaction Lens on Factual Errors of Language Models

 

Abstract

Many real-world information retrieval (IR) queries consist of specific requirements and constraints that users articulate in natural language (e.g., 'a list of ice cream shops in San Diego'). In the past, constraint satisfaction queries in IR were considered tasks that could only be solved via web search or knowledge bases. More recently, large language models (LLMs) have demonstrated initial emergent abilities in this task, but major concerns remain about information fabrication and factual errors. In this talk, we will discuss how a constraint satisfaction lens for evaluating and understanding the capabilities of LLMs can help with measuring and debugging model failures in this space.

The first part of this talk will describe KITAB, a new dataset for measuring the constraint satisfaction abilities of language models in the literature domain. Evaluation of state-of-the-art models on this dataset shows that current models still have major gaps in understanding and following constraints, regardless of the availability of context. The talk will then take a deep dive into recent efforts to mechanistically understand the factual errors models make when they fail to satisfy constraints. Specifically, we discover a strong positive relation between the model's attention to constraint tokens and the factual accuracy of its responses. The work also proposes SAT Probe, a method that probes self-attention patterns to predict constraint satisfaction and factual errors and allows early error identification. The talk will conclude by summarizing further relevant work in this space and future avenues.

Short CV

Besmira Nushi is a Principal Research Manager at Microsoft Research in the AI Frontiers lab, where she leads model evaluation and understanding initiatives. Her research focuses on responsible AI, interpretability in Machine Learning, and advancing rigorous model evaluation practices. Passionate about scaling these practices, she develops practitioner-oriented tools to drive innovation in AI responsibly and effectively. She has significantly contributed to the implementation and deployment of cutting-edge Machine Learning tools for evaluating, debugging, and comparing models and systems. Notable examples of her work include Eureka, Responsible AI Toolbox, Error Analysis, and BackwardCompatibilityML.