Information retrieval (IR) has been a dominant force in the last 20 years of computing. Even within the domain of natural language processing (NLP), tasks such as summarization and question answering have come a long way based on IR-like approaches designed to surface information that already exists.

In this talk, I argue that the field of NLP is seeing a shift towards information synthesis: methods that can combine existing pieces of information to produce new conclusions. Systems built with such methods promise to produce greater insights for their users than pure retrievers, but many challenges still need to be addressed.

I will first discuss our work on models that can take combinations of premise statements and deduce conclusions from them to construct natural language “proofs” of hypotheses, paving the way for explainable textual reasoning. I will then describe some shortcomings of doing this kind of reasoning with large language models and suggest how explanations can help calibrate the inferences they make. Finally, I will discuss the recent impact of ChatGPT and GPT-4 on text summarization, showing how the incredible new synthesis capabilities of these models will need to be fleshed out and benchmarked in the coming years.

Presenter

Greg Durrett
University of Texas at Austin

Greg Durrett is an assistant professor of Computer Science at UT Austin. His research focuses on techniques for accessing and reasoning about knowledge in text. Large language models (LLMs) like ChatGPT and GPT-4 have dramatically advanced the frontiers in this area; currently his team is looking at where these systems succeed and fail and how to enhance their capabilities, particularly via systems that use LLMs as primitives. He is a 2023 Sloan Research Fellow and a recipient of a 2022 NSF CAREER award, among other grants from agencies including the NSF, Open Philanthropy, DARPA, Salesforce, and Amazon. He completed his Ph.D. at UC Berkeley where he was advised by Dan Klein, and he was previously a research scientist at Semantic Machines.