Many real-world information retrieval (IR) queries consist of specific requirements and constraints that users articulate in natural language (e.g., 'a list of ice cream shops in San Diego'). In the past, such constraint satisfaction queries were considered solvable only via web search or knowledge bases. More recently, large language models (LLMs) have demonstrated initial emergent abilities in this task, but major concerns remain about information fabrication and factual errors. In this talk, we will discuss how evaluating and understanding the capabilities of LLMs through a constraint satisfaction lens can help measure and debug model failures in this space.
The first part of this talk will describe KITAB, a new dataset for measuring the constraint satisfaction abilities of language models in the literature domain. Evaluation of state-of-the-art models on this dataset shows that current models still have major gaps in understanding and following constraints, regardless of the availability of context. The talk will then turn to recent efforts to mechanistically understand the factual errors models make when they fail to satisfy constraints. Specifically, we discover a strong positive relationship between the model's attention to constraint tokens and the factual accuracy of its responses. The work also proposes SAT Probe, a method that probes self-attention patterns to predict constraint satisfaction and factual errors, allowing early error identification. The talk will conclude by summarizing other relevant work in this space and future avenues.
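To make the probing idea concrete, the following is a minimal sketch of a linear probe over attention-to-constraint features. It is not the authors' SAT Probe implementation. The function names, the choice of feature (total attention mass that the final position places on constraint tokens, one value per layer and head), and the synthetic data are all assumptions made for illustration. In practice the attention weights would come from a real model rather than from the toy generator used here.

```python
import math
import random

def constraint_attention_features(attn, constraint_positions):
    """attn[layer][head] is a list of attention weights from the final
    position to each input token. Returns one feature per (layer, head):
    the attention mass placed on the constraint tokens.
    (Hypothetical feature choice, for illustration only.)"""
    return [
        sum(head[i] for i in constraint_positions)
        for layer in attn
        for head in layer
    ]

def train_logistic_probe(X, y, lr=0.5, epochs=300):
    """Plain stochastic gradient descent on the logistic loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for x, label in zip(X, y):
            p = predict(w, b, x)
            g = p - label  # gradient of the loss w.r.t. the logit
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def predict(w, b, x):
    """Probability that the response satisfies the constraint."""
    z = b + sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

def toy_attention(rng, low, high, layers=2, heads=2, n_tokens=5,
                  constraint_positions=(1, 2)):
    """Synthetic attention (an assumption, not real model output): each head
    puts a random mass in [low, high] on the constraint tokens and spreads
    the remainder evenly over the other tokens."""
    attn = []
    for _ in range(layers):
        layer = []
        for _ in range(heads):
            mass = rng.uniform(low, high)
            n_c = len(constraint_positions)
            row = [(1.0 - mass) / (n_tokens - n_c)] * n_tokens
            for i in constraint_positions:
                row[i] = mass / n_c
            layer.append(row)
        attn.append(layer)
    return attn

rng = random.Random(0)
positions = (1, 2)
X, y = [], []
for _ in range(20):
    # Label 1: factually correct response, simulated with high constraint attention.
    X.append(constraint_attention_features(toy_attention(rng, 0.6, 0.9), positions))
    y.append(1)
    # Label 0: factual error, simulated with low constraint attention.
    X.append(constraint_attention_features(toy_attention(rng, 0.05, 0.3), positions))
    y.append(0)

w, b = train_logistic_probe(X, y)
print("P(correct | high attention):", predict(w, b, [0.8] * 4))
print("P(correct | low attention): ", predict(w, b, [0.1] * 4))
```

Because the features depend only on attention patterns computed while the prompt is processed, a probe of this kind could in principle flag a likely error before the full response is generated, which is the sense of "early error identification" above.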