Kevin Gimpel

Publications are listed by year below; also see my profiles on Google Scholar and Semantic Scholar.

2024

MAP's not dead yet: Uncovering true language model modes by conditioning away degeneracy
Davis Yoshida, Kartik Goyal, Kevin Gimpel
ACL 2024 (outstanding paper and SAC award)
[arxiv] [bib]

Structured Tree Alignment for Evaluation of (Speech) Constituency Parsing
Freda Shi, Kevin Gimpel, Karen Livescu
ACL 2024
[arxiv] [bib]

GEE! Grammar Error Explanation with Large Language Models
Yixiao Song, Kalpesh Krishna, Rajesh Bhatt, Kevin Gimpel, Mohit Iyyer
Findings of NAACL 2024
[arxiv] [bib]

2023

Audio-Visual Neural Syntax Acquisition
Cheng-I Lai*, Freda Shi*, Puyuan Peng*, Yoon Kim, Kevin Gimpel, Shiyu Chang, Yung-Sung Chuang, Saurabhchand Bhati, David Cox, David Harwath, Yang Zhang, Karen Livescu, James Glass
ASRU 2023
[arxiv] [code]

The Benefits of Label-Description Training for Zero-Shot Text Classification
Lingyu Gao, Debanjan Ghosh, Kevin Gimpel
EMNLP 2023
[arxiv]

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
Srivastava et al.
TMLR
[arxiv] [bib]

2022

Moment Distributionally Robust Tree Structured Prediction
Yeshu Li, Danyal Saeed, Xinhua Zhang, Brian D. Ziebart, Kevin Gimpel
NeurIPS 2022
[paper] [bib]

Baked-in State Probing
Shubham Toshniwal, Sam Wiseman, Karen Livescu, Kevin Gimpel
Findings of EMNLP 2022
[paper] [bib]

Paraphrastic Representations at Scale
John Wieting, Kevin Gimpel, Graham Neubig, Taylor Berg-Kirkpatrick
EMNLP 2022 Demos
[arxiv] [code]

"What makes a question inquisitive?" A Study on Type-Controlled Inquisitive Question Generation
Lingyu Gao, Debanjan Ghosh, Kevin Gimpel
*SEM 2022
[arxiv] [data] [bib]

Substructure Distribution Projection for Zero-Shot Cross-Lingual Dependency Parsing
Freda Shi, Kevin Gimpel, Karen Livescu
ACL 2022
[paper] [bib]

SummScreen: A Dataset for Abstractive Screenplay Summarization
Mingda Chen, Zewei Chu, Sam Wiseman, Kevin Gimpel
ACL 2022
[arxiv] [data] [bib]

Chess as a Testbed for Language Model State Tracking
Shubham Toshniwal, Sam Wiseman, Karen Livescu, Kevin Gimpel
AAAI 2022
[arxiv] [code/data] [bib]

Deep Clustering of Text Representations for Supervision-free Probing of Syntax
Vikram Gupta, Haoyue Shi, Kevin Gimpel, Mrinmaya Sachan
AAAI 2022
[arxiv] [bib]

2021

On Generalization in Coreference Resolution
Shubham Toshniwal, Patrick Xia, Sam Wiseman, Karen Livescu, Kevin Gimpel
Fourth Workshop on Computational Models of Reference, Anaphora, and Coreference (best short paper award)
[arxiv] [code] [bib]

TVRecap: A Dataset for Generating Stories with Character Descriptions
Mingda Chen, Kevin Gimpel
[arxiv]

Reconsidering the Past: Optimizing Hidden States in Language Models
Davis Yoshida, Kevin Gimpel
Findings of EMNLP 2021
[arxiv] [bib]

Exemplar-Controllable Paraphrasing and Translation using Bitext
Mingda Chen, Sam Wiseman, Kevin Gimpel
[arxiv]

NatCat: Weakly Supervised Text Classification with Naturally Annotated Resources
Zewei Chu, Karl Stratos, Kevin Gimpel
AKBC 2021
[paper] [data]

Substructure Substitution: Structured Data Augmentation for NLP
Haoyue Shi, Karen Livescu, Kevin Gimpel
Findings of ACL 2021
[paper] [bib]

Unsupervised Label Refinement Improves Dataless Text Classification
Zewei Chu, Karl Stratos, Kevin Gimpel
Findings of ACL 2021
[paper] [code] [bib]

WikiTableT: A Large-Scale Data-to-Text Dataset for Generating Wikipedia Article Sections
Mingda Chen, Sam Wiseman, Kevin Gimpel
Findings of ACL 2021
[arxiv] [data] [bib]

FlowPrior: Learning Expressive Priors for Latent Variable Sentence Models
Xiaoan Ding, Kevin Gimpel
NAACL 2021
[paper] [bib]

2020

Improving Joint Training of Inference Networks and Structured Prediction Energy Networks
Lifu Tu, Richard Yuanzhe Pang, Kevin Gimpel
4th Workshop on Structured Prediction for NLP
[arxiv] [bib]

An Exploration of Arbitrary-Order Sequence Labeling via Energy-Based Inference Networks
Lifu Tu*, Tianyu Liu*, Kevin Gimpel
EMNLP 2020
[arxiv] [bib]

Discriminatively-Tuned Generative Classifiers for Robust Natural Language Inference
Xiaoan Ding*, Tianyu Liu*, Baobao Chang, Zhifang Sui, Kevin Gimpel
EMNLP 2020
[arxiv] [bib]

Learning to Ignore: Long Document Coreference with Bounded Memory Neural Networks
Shubham Toshniwal, Sam Wiseman, Allyson Ettinger, Karen Livescu, Kevin Gimpel
EMNLP 2020
[arxiv] [code] [bib]

On the Role of Supervision in Unsupervised Constituency Parsing
Haoyue Shi, Karen Livescu, Kevin Gimpel
EMNLP 2020
[arxiv] [bib]

Mining Knowledge for Natural Language Inference from Wikipedia Categories
Mingda Chen*, Zewei Chu*, Karl Stratos, Kevin Gimpel
Findings of EMNLP 2020
[arxiv] [bib]

Adding Recurrence to Pretrained Transformers for Improved Efficiency and Context Size
Davis Yoshida, Allyson Ettinger, Kevin Gimpel
[arxiv]

A Cross-Task Analysis of Text Span Representations
Shubham Toshniwal, Haoyue Shi, Bowen Shi, Lingyu Gao, Karen Livescu, Kevin Gimpel
5th Workshop on Representation Learning for NLP
[arxiv] [bib]

Learning Probabilistic Sentence Representations from Paraphrases
Mingda Chen, Kevin Gimpel
5th Workshop on Representation Learning for NLP
[arxiv] [bib]

Distractor Analysis and Selection for Multiple-Choice Cloze Questions for Second-Language Learners
Lingyu Gao, Kevin Gimpel, Arnar Jensson
15th Workshop on Innovative Use of NLP for Building Educational Applications
[paper] [bib]

ENGINE: Energy-Based Inference Networks for Non-Autoregressive Machine Translation
Lifu Tu, Richard Yuanzhe Pang, Sam Wiseman, Kevin Gimpel
ACL 2020
[arxiv] [code] [bib]

PeTra: A Sparsely Supervised Memory Model for People Tracking
Shubham Toshniwal, Allyson Ettinger, Kevin Gimpel, Karen Livescu
ACL 2020
[arxiv] [code] [colab] [bib]

ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut
ICLR 2020
[arxiv] [code/models] [bib]

How to Ask Better Questions? A Large-Scale Multi-Domain Dataset for Rewriting Ill-Formed Questions
Zewei Chu, Mingda Chen*, Jing Chen*, Miaosen Wang*, Kevin Gimpel, Manaal Faruqui, Xiance Si
AAAI 2020
[arxiv] [data] [bib]

2019

Unsupervised Evaluation Metrics and Learning Criteria for Non-Parallel Textual Transfer
Richard Yuanzhe Pang, Kevin Gimpel
3rd Workshop on Neural Generation and Translation (WNGT 2019)
[arxiv] [bib]

Generating Diverse Story Continuations with Controllable Semantics
Lifu Tu, Xiaoan Ding, Dong Yu, Kevin Gimpel
3rd Workshop on Neural Generation and Translation (WNGT 2019)
[arxiv] [bib]

EntEval: A Holistic Evaluation Benchmark for Entity Representations
Mingda Chen*, Zewei Chu*, Yang Chen, Karl Stratos, Kevin Gimpel
EMNLP 2019
[arxiv] [code] [bib]

Evaluation Benchmarks and Learning Criteria for Discourse-Aware Sentence Representations
Mingda Chen*, Zewei Chu*, Kevin Gimpel
EMNLP 2019
[arxiv] [code] [bib]

Latent-Variable Generative Models for Data-Efficient Text Classification
Xiaoan Ding, Kevin Gimpel
EMNLP 2019
[arxiv] [bib]

Sequence-to-Sequence Modeling for Graph Representation Learning
Aynaz Taheri, Kevin Gimpel, Tanya Berger-Wolf
Applied Network Science
[paper] [bib]

Beyond BLEU: Training Neural Machine Translation with Semantic Similarity
John Wieting, Taylor Berg-Kirkpatrick, Kevin Gimpel, Graham Neubig
ACL 2019
[arxiv] [code] [bib]

Controllable Paraphrase Generation with a Syntactic Exemplar
Mingda Chen, Qingming Tang, Sam Wiseman, Kevin Gimpel
ACL 2019
[arxiv] [code] [data] [bib]

Simple and Effective Paraphrastic Similarity from Parallel Translations
John Wieting, Kevin Gimpel, Graham Neubig, Taylor Berg-Kirkpatrick
ACL 2019
[arxiv] [code] [bib]

Visually Grounded Neural Syntax Acquisition
Haoyue Shi*, Jiayuan Mao*, Kevin Gimpel, Karen Livescu
ACL 2019
[arxiv] [project page] [code] [bib]

Learning to Represent the Evolution of Dynamic Graphs with Recurrent Models
Aynaz Taheri, Kevin Gimpel, Tanya Berger-Wolf
WWW 2019 (Companion Volume)
[paper] [bib]

A Multi-Task Approach for Disentangling Syntax and Semantics in Sentence Representations
Mingda Chen, Qingming Tang, Sam Wiseman, Kevin Gimpel
NAACL 2019
[arxiv] [code] [bib]

Benchmarking Approximate Inference Methods for Neural Structured Prediction
Lifu Tu, Kevin Gimpel
NAACL 2019
[arxiv] [bib]

PoMo: Generating Entity-Specific Post-Modifiers in Context
Jun Seok Kang, Robert L. Logan IV, Zewei Chu, Yang Chen, Dheeru Dua, Kevin Gimpel, Sameer Singh, Niranjan Balasubramanian
NAACL 2019
[arxiv] [data] [bib]

2018

Using Trusted Data to Train Deep Networks on Labels Corrupted by Severe Noise
Dan Hendrycks*, Mantas Mazeika*, Duncan Wilson, Kevin Gimpel
NeurIPS 2018
[arxiv] [code] [bib]

Variational Sequential Labelers for Semi-Supervised Learning
Mingda Chen, Qingming Tang, Karen Livescu, Kevin Gimpel
EMNLP 2018
[paper] [arxiv] [supplementary material] [code] [bib]

ParaNMT-50M: Pushing the Limits of Paraphrastic Sentence Embeddings with Millions of Machine Translations
John Wieting, Kevin Gimpel
ACL 2018
[paper] [supplementary material] [data] [code] [bib]

Adversarial Example Generation with Syntactically Controlled Paraphrase Networks
Mohit Iyyer*, John Wieting*, Kevin Gimpel, Luke Zettlemoyer
NAACL 2018
[arxiv] [code] [bib]

Parsing Speech: A Neural Approach to Integrating Lexical and Acoustic-Prosodic Information
Trang Tran*, Shubham Toshniwal*, Mohit Bansal, Kevin Gimpel, Karen Livescu, Mari Ostendorf
NAACL 2018
[arxiv] [code] [bib]

Smaller Text Classifiers with Discriminative Cluster Embeddings
Mingda Chen, Kevin Gimpel
NAACL 2018
[paper] [code] [bib]

Quality Signals in Generated Stories
Manasvi Sagarkar, John Wieting, Lifu Tu, Kevin Gimpel
*SEM 2018
[paper] [data] [scorer] [bib]

Learning Approximate Inference Networks for Structured Prediction
Lifu Tu, Kevin Gimpel
ICLR 2018
[arxiv] [code] [another codebase] [bib]

A Study of All-Convolutional Encoders for Connectionist Temporal Classification
Kalpesh Krishna, Liang Lu, Kevin Gimpel, Karen Livescu
ICASSP 2018
[arxiv] [bib]

2017

End-to-End Neural Segmental Models for Speech Recognition
Hao Tang, Liang Lu, Lingpeng Kong, Kevin Gimpel, Karen Livescu, Chris Dyer, Noah A. Smith, Steve Renals
IEEE Journal of Selected Topics in Signal Processing
[arxiv] [bib]

Learning Paraphrastic Sentence Embeddings from Back-Translated Bitext
John Wieting, Jonathan Mallinson, Kevin Gimpel
EMNLP 2017
[arxiv] [bib]

Learning to Embed Words in Context for Syntactic Tasks
Lifu Tu, Kevin Gimpel, Karen Livescu
2nd Workshop on Representation Learning for NLP (best paper award)
[arxiv] [bib]

Emergent Predication Structure in Hidden State Vectors of Neural Readers
Hai Wang, Takeshi Onishi, Kevin Gimpel, David McAllester
2nd Workshop on Representation Learning for NLP (best paper award)
[arxiv] [bib]

Revisiting Recurrent Networks for Paraphrastic Sentence Embeddings
John Wieting, Kevin Gimpel
ACL 2017
[arxiv] [code] [bib]

Pay Attention to the Ending: Strong Neural Baselines for the ROC Story Cloze Task
Zheng Cai, Lifu Tu, Kevin Gimpel
ACL 2017
[paper] [bib]

A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks
Dan Hendrycks, Kevin Gimpel
ICLR 2017
[arxiv] [bib]

Early Methods for Detecting Adversarial Images
Dan Hendrycks, Kevin Gimpel
ICLR 2017 (workshop contribution)
[arxiv] [bib]

Broad Context Language Modeling as Reading Comprehension
Zewei Chu, Hai Wang, Kevin Gimpel, David McAllester
EACL 2017
[arxiv] [slides] [training data (330MB)] [manual analysis] [bib]

2016

Constraints Based Convex Belief Propagation
Yaniv Tenzer, Alexander Schwing, Kevin Gimpel, Tamir Hazan
NIPS 2016
[paper] [bib]

End-to-End Training Approaches for Discriminative Segmental Models
Hao Tang, Weiran Wang, Kevin Gimpel, Karen Livescu
SLT 2016
[arxiv] [bib]

Charagram: Embedding Words and Sentences via Character n-grams
John Wieting, Mohit Bansal, Kevin Gimpel, Karen Livescu
EMNLP 2016
[arxiv] [code and models] [bib]

Who did What: A Large-Scale Person-Centered Cloze Dataset
Takeshi Onishi, Hai Wang, Mohit Bansal, Kevin Gimpel, David McAllester
EMNLP 2016
[arxiv] [data] [bib]

Adjusting for Dropout Variance in Batch Normalization and Weight Initialization
Dan Hendrycks, Kevin Gimpel
[arxiv]

Gaussian Error Linear Units (GELUs)
Dan Hendrycks, Kevin Gimpel
[arxiv] [bib]

Efficient Segmental Cascades for Speech Recognition
Hao Tang, Weiran Wang, Kevin Gimpel, Karen Livescu
Interspeech 2016
[arxiv] [bib]

Mapping Unseen Words to Task-Trained Embedding Spaces
Pranava Swaroop Madhyastha, Mohit Bansal, Kevin Gimpel, Karen Livescu
1st Workshop on Representation Learning for NLP (best paper award)
[arxiv] [bib]

Commonsense Knowledge Base Completion
Xiang Li, Aynaz Taheri, Lifu Tu, Kevin Gimpel
ACL 2016
[paper] [resources] [bib]

UMD-TTIC-UW at SemEval-2016 Task 1: Attention-Based Multi-Perspective Convolutional Neural Networks for Textual Similarity Measurement
Hua He, John Wieting, Kevin Gimpel, Jinfeng Rao, Jimmy Lin
SemEval 2016
[paper] [bib]

Towards Universal Paraphrastic Sentence Embeddings
John Wieting, Mohit Bansal, Kevin Gimpel, Karen Livescu
ICLR 2016
[arxiv] [code] [embeddings] [bib]

2015

Discriminative Segmental Cascades for Feature-Rich Phone Recognition
Hao Tang, Weiran Wang, Kevin Gimpel, Karen Livescu
ASRU 2015 (best paper nominee)
[arxiv] [bib]

Multi-Perspective Sentence Similarity Modeling with Convolutional Neural Networks
Hua He, Kevin Gimpel, Jimmy Lin
EMNLP 2015
[paper] [poster] [code] [bib]

Machine Comprehension with Syntax, Frames, and Semantics
Hai Wang, Mohit Bansal, Kevin Gimpel, David McAllester
ACL 2015
[paper] [bib]

From Paraphrase Database to Compositional Paraphrase Model and Back
John Wieting, Mohit Bansal, Kevin Gimpel, Karen Livescu, Dan Roth
TACL 2015 (presented at EMNLP 2015)
[paper] [embeddings/data/code] [bib]

Deep Multilingual Correlation for Improved Word Embeddings
Ang Lu, Weiran Wang, Mohit Bansal, Kevin Gimpel, Karen Livescu
NAACL 2015
[paper] [code] [bib]

A Sense-Topic Model for Word Sense Induction with Unsupervised Data Enrichment
Jing Wang, Mohit Bansal, Kevin Gimpel, Brian D. Ziebart, Clement T. Yu
TACL 2015 (presented at NAACL 2015)
[paper] [poster] [one-minute madness slide] [bib]

2014

Weakly-Supervised Learning with Cost-Augmented Contrastive Estimation
Kevin Gimpel, Mohit Bansal
EMNLP 2014
[paper] [supplementary material] [slides] [talk] [bib]

A Comparison of Training Approaches for Discriminative Segmental Models
Hao Tang, Kevin Gimpel, Karen Livescu
Interspeech 2014
[paper] [code] [bib]

Tailoring Continuous Word Representations for Dependency Parsing
Mohit Bansal, Kevin Gimpel, Karen Livescu
ACL 2014
[paper] [slides] [data] [bib]

Phrase Dependency Machine Translation with Quasi-Synchronous Tree-to-Tree Features
Kevin Gimpel, Noah A. Smith
Computational Linguistics
[paper] [bib]

2013

A Systematic Exploration of Diversity in Machine Translation
Kevin Gimpel, Dhruv Batra, Chris Dyer, Gregory Shakhnarovich
EMNLP 2013
[paper] [supplementary material] [poster] [bib]

Predicting the NFL using Twitter
Shiladitya Sinha, Chris Dyer, Kevin Gimpel, Noah A. Smith
ECML/PKDD 2013 Workshop on Machine Learning and Data Mining for Sports Analytics
[paper] [data] [bib]

Improved Part-of-Speech Tagging for Online Conversational Text with Word Clusters
Olutobi Owoputi, Brendan O'Connor, Chris Dyer, Kevin Gimpel, Nathan Schneider, Noah A. Smith
NAACL 2013
[paper] [poster] [data and software] [bib]

2012

Discriminative Feature-Rich Modeling for Syntax-Based Machine Translation
Kevin Gimpel
Ph.D. Thesis, Language Technologies Institute, Carnegie Mellon University
[thesis] [bib]

Part-of-Speech Tagging for Twitter: Word Clusters and Other Advances
Olutobi Owoputi, Brendan O'Connor, Chris Dyer, Kevin Gimpel, Nathan Schneider
Technical report CMU-ML-12-107
[paper] [data and software] [bib]

Word Salad: Relating Food Prices and Descriptions
Victor Chahuneau, Kevin Gimpel, Bryan R. Routledge, Lily Scherlis, Noah A. Smith
EMNLP 2012
[paper] [supplementary material] [bib]

Concavity and Initialization for Unsupervised Dependency Parsing
Kevin Gimpel, Noah A. Smith
NAACL 2012
[paper] [slides] [bib]

Structured Ramp Loss Minimization for Machine Translation
Kevin Gimpel, Noah A. Smith
NAACL 2012
[paper] [addendum] [poster] [code] [bib]

2011

Generative Models of Monolingual and Bilingual Gappy Patterns
Kevin Gimpel, Noah A. Smith
WMT 2011
[paper] [slides] [code] [sample patterns] [bib]

The CMU-ARK German-English Translation System
Chris Dyer, Kevin Gimpel, Jonathan H. Clark, Noah A. Smith
WMT 2011
[paper] [bib]

Quasi-Synchronous Phrase Dependency Grammars for Machine Translation
Kevin Gimpel, Noah A. Smith
EMNLP 2011
[paper] [slides] [bib]

Part-of-Speech Tagging for Twitter: Annotation, Features, and Experiments
Kevin Gimpel, Nathan Schneider, Brendan O'Connor, Dipanjan Das, Daniel Mills, Jacob Eisenstein, Michael Heilman, Dani Yogatama, Jeffrey Flanigan, Noah A. Smith
ACL 2011
[paper] [slides] [data and software] [bib]

2010

Learning Structured Classifiers with Dual Coordinate Ascent
André F. T. Martins, Kevin Gimpel, Noah A. Smith, Eric P. Xing, Pedro M. Q. Aguiar, Mário A. T. Figueiredo
Technical report CMU-ML-10-109
[paper] [bib]

Distributed Asynchronous Online Learning for Natural Language Processing
Kevin Gimpel, Dipanjan Das, Noah A. Smith
CoNLL 2010
[paper] [slides] [bib]

Softmax-Margin CRFs: Training Log-Linear Models with Cost Functions
Kevin Gimpel, Noah A. Smith
NAACL 2010
[paper] [extended version] [slides] [bib]

Movie Reviews and Revenues: An Experiment in Text Regression
Mahesh Joshi, Dipanjan Das, Kevin Gimpel, Noah A. Smith
NAACL 2010
[paper] [poster] [bib]

2009

Feature-Rich Translation by Quasi-Synchronous Lattice Parsing
Kevin Gimpel, Noah A. Smith
EMNLP 2009
[paper] [slides] [bib]

Cube Summing, Approximate Inference with Non-Local Features, and Dynamic Programming without Semirings
Kevin Gimpel, Noah A. Smith
EACL 2009
[paper] [slides] [bib]

2008

Logistic Normal Priors for Unsupervised Probabilistic Grammar Induction
Shay B. Cohen, Kevin Gimpel, Noah A. Smith
NIPS 2008
[paper] [code] [bib]

Rich Source-Side Context for Statistical Machine Translation
Kevin Gimpel, Noah A. Smith
WMT 2008 (5-year retrospective best paper award)
[paper] [code for significance testing] [bib]

Other Papers/Presentations (unpublished):

Beating the NFL Football Point Spread. If you're interested in the data I used, check out my brother's company.