CLunch is the weekly Computational Linguistics lunch run by the NLP group. We invite external and internal speakers to come and present their research on natural language processing, computational linguistics, and machine learning.

Interested in attending CLunch? Sign up for our mailing list here.

View older talks at the CLunch archive.

Upcoming Talks

Spring 2023






Past Talks

Past talks from the current and previous semesters are shown below.

Hao Wang

Rutgers University

April 19th, 2023

Bayesian Deep Learning: From Single-Domain Reasoning to Infinite-Domain Adaptation

While perception tasks such as visual object recognition and text understanding play an important role in human intelligence, the subsequent tasks that involve inference, reasoning, and planning require an even higher level of intelligence. The past few years have seen major advances in many perception tasks using deep learning models. In terms of higher-level inference, however, probabilistic graphical models, with their ability to expressively describe properties of variables and various probabilistic relations among variables, are still more powerful and flexible. To achieve integrated intelligence that involves both perception and inference, we have been exploring along a research direction, which we call Bayesian deep learning, to tightly integrate deep learning and Bayesian models within a principled probabilistic framework. In this talk, I will present the proposed unified framework and some of our recent work on Bayesian deep learning with various applications including recommendation, social network analysis, interpretable healthcare, domain adaptation, and representation learning.

Sunny Rai

University of Pennsylvania

April 12th, 2023

Investigating Racial Heterogeneity in Language Markers of Depression

The racial and ethnic differences in the manifestation of depression are well documented. However, the effect of these differences on computational models for mental disorders trained on online language is relatively unexplored. This work analyzes the interaction between race and linguistic features correlated with PHQ-9 scores. Our experiments reveal that the pronoun "I", widely used as an indicator of depression, has a significant interaction with race, correlating with PHQ-9 scores for White but not for Black individuals. Various open-vocabulary topics correlated with PHQ-9 demonstrate a contradictory trend for their usage by White and Black individuals when depressed. A linear regression machine learning model trained on White individuals predicts depression in White individuals with a Pearson r of 0.39 (p < 0.05) but returns an insignificant correlation for depression scores in Black individuals, indicating its inefficacy in diagnosing depression for the Black population. Interestingly, a model trained on Black individuals predicts depression in both racial groups, albeit with different performance (r = 0.355 for Black and r = 0.338 for White). The results advocate the urgent need to validate computational mental health models on minority populations before deployment.

Nanyun (Violet) Peng

University of California, Los Angeles

April 7th, 2023

Controllable Text Generation For Open-World Creativity

Recent advances in large auto-regressive language models have demonstrated strong results in generating natural language and significantly improved performance for applications such as machine translation and summarization. However, when the generation tasks are open-ended and the content is under-specified, or there are format or cross-modal association constraints, existing techniques struggle to generate long-term coherent and creative content that follows format constraints. This happens because auto-regressive language models are only trained to predict the next word, and it is hard to impose structural or content controls/constraints on the model. In this talk, I will present our recent work on creative generation, including poetry and melody-to-lyrics generation, which highlights the importance of controllable text generation beyond the prevalent auto-regressive formulation. We propose a novel insertion-based generation model and a controllable decoding-time algorithm to steer models to better conform to constraints.

Roy Schwartz

Hebrew University of Jerusalem

March 29th, 2023

Spurious Correlations: Challenges, Solutions, and Opportunities

Recent work has shown that deep learning models in NLP are highly sensitive to low-level correlations between simple features and specific output labels, leading to overfitting and lack of generalization. To mitigate this problem, a common practice is to balance datasets by adding new instances or by filtering out "easy" instances, culminating in a recent proposal to eliminate single-word correlations altogether. In this talk, I will identify that despite these efforts, increasingly-powerful models keep exploiting ever-smaller spurious correlations, and as a result even balancing all single-word features is insufficient for mitigating all of these correlations. In parallel, a truly balanced dataset may be bound to "throw the baby out with the bathwater" and miss important signals encoding common sense and world knowledge. I will highlight several alternatives to dataset balancing, focusing on a surprising proposal: in order to mitigate biases in models, one needs to amplify them in our training sets.

Yoav Artzi

Cornell University (Cornell Tech)

March 22nd, 2023

Learning and Reasoning in Natural Language Interaction

Natural language is first and foremost an instrument of interaction, where interlocutors produce and comprehend language to relay information to accomplish their intents. This talk focuses on challenges and opportunities that arise from this interactive nature of language. The response of participants to the language they comprehend can form a strong learning signal for the party that produced the language. Did I achieve my intent? In the first part, I will show how to use this signal to learn to produce natural language instructions. I will then discuss the problem of language-conditioned reinforcement learning, where benchmark development has been hindered because computing rewards requires resolving language semantics. I will describe a new approach to address this challenge. Finally, core to linguistic interaction is the use of abstraction to communicate concepts in a generalizable way. I will describe a new resource to study this phenomenon, and show how it sheds light on the generalization abilities of language-and-vision pre-trained models.

Daniel Fried

Carnegie Mellon University (Language Technologies Institute)

March 15th, 2023

Using Language Strategically in Context

As NLP systems interact with people in a widening range of world contexts, it is increasingly important to model pragmatic aspects of language: the goals that underlie language use, and the effects that language has on people. Across a diverse range of task-oriented settings, we've found that reasoning about language as a strategic action allows NLP models to interact more successfully with human partners. First, I'll describe a procedure for pragmatically generating and interpreting instructions. We train listener and speaker models that imitate how people interpret and produce language in grounded contexts. We use these models to (1) predict how a person might interpret language from the system and (2) resolve ambiguity by reasoning about what goal might have made a person say what they did. These procedures make interaction with human partners more successful in settings including visually-grounded instruction following and interactive preference learning. I'll also give an overview of work with the FAIR Diplomacy team on CICERO, an agent that achieves human-level performance in the dialogue and strategy board game Diplomacy. CICERO integrates LLMs with a strategic planner: choosing mutually beneficial plans for itself and its partners, and generating dialogue in pursuit of these plans. When deployed in an anonymous online Diplomacy league with human partners, CICERO ranked in the top 10% of participants who played more than one game.

Niranjan Balasubramanian

Stony Brook University

March 6th, 2023

What ails multi-step reasoning and how to fix it.

Multi-step reasoning has seen much empirical progress on many datasets recently, especially in Question Answering. However, training and evaluating on typical crowdsourced datasets is problematic because of the potential for shortcut reasoning based on artifacts. What can we do about this? In this three-part talk, I will first show how we can formalize and measure disconnected reasoning, a type of bad multihop reasoning. I will then discuss how we can construct new datasets using a bottom-up construction process, which allows us to better control for desired properties in the resulting dataset. In the third part, I will briefly present how synthetically generated data can be used to teach a broad range of multihop skills in a reliable manner and how to improve reliable multi-step reasoning in open-domain QA settings.

Graham Neubig

Carnegie Mellon University (Language Technologies Institute)

March 1st, 2023

Is My NLP Model Working? The Answer is Harder Than You Think

As natural language processing now permeates many different applications, its practical use is unquestionable. However, at the same time NLP is still imperfect, and errors cause everything from minor inconveniences to major PR disasters. Better understanding when our NLP models work and when they fail is critical to the efficient and reliable use of NLP in real-world scenarios. So how can we do so? In this talk I will discuss two issues: automatic evaluation of generated text, and automatic fine-grained analysis of NLP system results, which are some first steps towards a science of NLP model evaluation.

Alan Ritter

Georgia Tech

February 24th, 2023

Towards Cost Efficient Use of Pre-Trained Language Models

Large language models are leading to breakthroughs in a variety of applications, from information extraction systems that are accurate and robust, to human-like conversational assistants. In this talk I will analyze when the benefits of training a new model outweigh the computational costs, in the context of domain adaptation. Conventional wisdom holds that data annotation is expensive, so computational methods that leverage freely available unlabeled data can present an economical alternative when adapting to a new domain. The talk will examine this assumption in the context of pretraining-based domain adaptation, which requires significant GPU/TPU resources for each new domain. We frame domain adaptation as a consumer choice problem: given a fixed budget, what combination of annotation and pre-training leads to maximum utility? In the second part of the talk, I will discuss recent work on in-context learning for anaphora resolution. I will show that resolving anaphora in scientific protocols is a challenging task for in-context learning, then present a new method, MICE (Mixtures of In-Context Experts), and demonstrate how it can accurately resolve multiple-antecedent anaphora in paragraphs describing chemical synthesis procedures. MICE enables accurate few-shot anaphora resolution by ensembling hundreds of prompts that are created from only a handful of training examples. Finally, I will discuss applications of NLP on chemical synthesis protocols and show a demo of a system that can help chemists more efficiently find experimental details described in the literature.

Julian Michael


February 15th, 2023

What Do NLP Researchers Believe? Results of the NLP Community Metasurvey

I will present the results of the NLP Community Metasurvey. This was a questionnaire that we ran from May to June 2022 which elicited the opinions of NLP researchers on controversial issues, including industry influence in the field, concerns about AGI, and ethics. Our results put concrete numbers to several controversies. For example, respondents are split almost exactly in half on questions about: the importance of artificial general intelligence, whether language models understand language, and the necessity of linguistic structure and inductive bias for solving NLP problems. In addition, the survey posed "meta-questions," asking respondents to predict the distribution of survey responses. This allows us not only to gain insight on the spectrum of beliefs held by NLP researchers, but also to uncover false sociological beliefs where the community’s predictions don’t match reality. We find such mismatches on a wide range of issues. Among other results, the community greatly overestimates its own belief in the usefulness of benchmarks and the potential for scaling to solve real-world problems, while underestimating its own belief in the importance of linguistic structure, inductive bias, and interdisciplinary science. Our hope is that this can provide context for the NLP research community to have more informed and self-aware discussions of these complex issues. In this talk, I will walk through our results and open the floor for such a discussion.

Karl Stratos

Rutgers CS

February 1st, 2023

Retrieval-Augmented Models for Natural Language Processing

Prompting large pretrained language models has been enormously successful in solving a wide class of natural language tasks. In this approach, a task is formatted in some natural language template to "prompt" the model to generate the correct answer (e.g., "Q: Why is the sky blue? A: "). While surprisingly effective, this approach often generates false and unverifiable claims, limiting real-world applications. In this talk, I will advocate an alternative approach based on retrieval. Instead of naively generating answers, the model must first retrieve a piece of evidence from a knowledge base (e.g., Wikipedia). By having an explicit knowledge retrieval step, the model is forced to return factually accurate and verifiable claims. It can also use a new knowledge base at test time, and is thus capable of zero-shot learning. I will focus on the task of entity retrieval and linking. I will first present a technique based on hard negative mining to make entity retrieval more robust (NAACL 2021). I will then build on the retrieval framework to present a novel paradigm for entity linking (ICLR 2022).

Gail Weiss


January 25th, 2023

Thinking Like Transformers

Transformers - the purely attention-based NN architecture - have emerged as a powerful tool in sequence processing. But how does a transformer think? When we discuss the computational power of RNNs, or consider a problem that they have solved, it is easy for us to think in terms of automata and their variants (such as counter machines and pushdown automata). But when it comes to transformers, no such intuitive model is available. In this talk I will present a programming language, RASP (Restricted Access Sequence Processing), which we hope will serve the same purpose for transformers as finite state machines do for RNNs. In particular, we will identify the base computations of a transformer and abstract them into a small number of primitives, which are composed into a small programming language. We will go through some example programs in the language, and discuss how a given RASP program relates to the transformer architecture.