Freda Shi

Greetings! I am an Assistant Professor in the David R. Cheriton School of Computer Science at the University of Waterloo and a Faculty Member at the Vector Institute. I received my Ph.D. in Computer Science from the Toyota Technological Institute at Chicago in 2024, where I was advised by Professors Karen Livescu and Kevin Gimpel, and was supported by a Google Ph.D. Fellowship. I completed my Bachelor's degree in Intelligence Science and Technology (Computer Science Track) in 2018 at Peking University, with a minor in Sociology.


My research interests are in computational linguistics and natural language processing. I work towards a deeper understanding of natural language and the human language processing mechanism, and I study how these insights can inform the design of more efficient and effective NLP systems. Among these topics, I am particularly interested in learning language through grounding, computational multilingualism, and related machine learning aspects. For more details, check out my publications and the CompLING Lab at the University of Waterloo.

Prospective students and visitors: please read this.


Publications

(*: Equal Contribution)

LogogramNLP: Comparing Visual and Textual Representations of Ancient Logographic Writing Systems for NLP
Danlu Chen, Freda Shi, Aditi Agarwal, Jacobo Myerston, Taylor Berg-Kirkpatrick

ACL 2024 Paper / Project Page

Structured Tree Alignment for Evaluation of (Speech) Constituency Parsing
Freda Shi, Kevin Gimpel, Karen Livescu

ACL 2024 Paper / Code / arXiv

Audio-Visual Neural Syntax Acquisition
Cheng-I Jeff Lai*, Freda Shi*, Puyuan Peng*, Yoon Kim, Kevin Gimpel, Shiyu Chang, Yung-Sung Chuang, Saurabhchand Bhati, David Cox, David Harwath, Yang Zhang, Karen Livescu, James Glass

ASRU 2023 Paper / Code / arXiv

NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation
Kaustubh D. Dhole et al.

NEJLT 2023 Paper / Code / arXiv

Large Language Models Can Be Easily Distracted by Irrelevant Context
Freda Shi*, Xinyun Chen*, Kanishka Misra, Nathan Scales, David Dohan, Ed H. Chi, Nathanael Schärli, Denny Zhou

ICML 2023 Paper / arXiv / Data

Language Models are Multilingual Chain-of-Thought Reasoners
Freda Shi*, Mirac Suzgun*, Markus Freitag, Xuezhi Wang, Suraj Srivats, Soroush Vosoughi, Hyung Won Chung, Yi Tay, Sebastian Ruder, Denny Zhou, Dipanjan Das, Jason Wei

ICLR 2023 Paper / arXiv / Data

InCoder: A Generative Model for Code Infilling and Synthesis
Daniel Fried*, Armen Aghajanyan*, Jessy Lin, Sida Wang, Eric Wallace, Freda Shi, Ruiqi Zhong, Wen-tau Yih, Luke Zettlemoyer, Mike Lewis

ICLR 2023 Paper / Code / arXiv

Natural Language to Code Translation with Execution
Freda Shi, Daniel Fried, Marjan Ghazvininejad, Luke Zettlemoyer, Sida I. Wang

EMNLP 2022 Paper / Code / arXiv

Substructure Distribution Projection for Zero-Shot Cross-Lingual Dependency Parsing
Freda Shi, Kevin Gimpel, Karen Livescu

ACL 2022 Paper / Code / arXiv

Deep Clustering of Text Representations for Supervision-Free Probing of Syntax
Vikram Gupta, Haoyue Shi, Kevin Gimpel, Mrinmaya Sachan

AAAI 2022 Paper / arXiv

Grammar-Based Grounded Lexicon Learning
Jiayuan Mao, Haoyue Shi, Jiajun Wu, Roger P. Levy, Joshua B. Tenenbaum

NeurIPS 2021 Paper / arXiv / Project Page

Bilingual Lexicon Induction via Unsupervised Bitext Construction and Word Alignment
Haoyue Shi, Luke Zettlemoyer, Sida I. Wang

ACL-IJCNLP 2021 (Best Paper Nominee) Paper / Code / arXiv

Substructure Substitution: Structured Data Augmentation for NLP
Haoyue Shi, Karen Livescu, Kevin Gimpel

Findings of ACL-IJCNLP 2021 Paper / Code / arXiv

On the Role of Supervision in Unsupervised Constituency Parsing
Haoyue Shi, Karen Livescu, Kevin Gimpel

EMNLP 2020 Paper / arXiv

A Cross-Task Analysis of Text Span Representations
Shubham Toshniwal, Haoyue Shi, Bowen Shi, Lingyu Gao, Karen Livescu, Kevin Gimpel

RepL4NLP 2020 Paper / Code / arXiv

Visually Grounded Neural Syntax Acquisition
Haoyue Shi*, Jiayuan Mao*, Kevin Gimpel, Karen Livescu

ACL 2019 (Best Paper Nominee) Paper / Code / arXiv

On Tree-Based Neural Sentence Modeling
Haoyue Shi, Hao Zhou, Jiaze Chen, Lei Li

EMNLP 2018 Paper / Code / arXiv

On Multi-Sense Word Embeddings via Matrix Factorization and Matrix Transformation
Haoyue Shi

B.S. Thesis (in Simplified Chinese), Peking University School of EECS, May 2018 Paper
Best Undergraduate Dissertation Award, PKU School of EECS

Learning Visually-Grounded Semantics from Contrastive Adversarial Samples
Haoyue Shi*, Jiayuan Mao*, Tete Xiao*, Yuning Jiang, Jian Sun

COLING 2018 Paper / Code / arXiv

Constructing High Quality Sense-Specific Corpus and Word Embedding via Unsupervised Elimination of Pseudo Multi-Sense
Haoyue Shi, Xihao Wang, Yuqi Sun, Junfeng Hu

LREC 2018 Paper / Code

Joint Saliency Estimation and Matching using Image Regions for Geo-Localization of Online Video
Haoyue Shi, Jia Chen, Alexander G. Hauptmann

ICMR 2017 Paper

Real Multi-Sense or Pseudo Multi-Sense: An Approach to Improve Word Representation
Haoyue Shi, Caihua Li, Junfeng Hu

COLING Workshop CL4LC 2016 Paper / arXiv