Freda Shi

Greetings! I am a final-year Ph.D. student at the Toyota Technological Institute at Chicago. I am grateful to be advised by Professors Karen Livescu and Kevin Gimpel, and to have been supported by a Google Ph.D. Fellowship since Autumn 2021. I received a B.S. in 2018 from the School of EECS at Peking University, where I minored in Sociology.


Research Interests

My research interests are in computational linguistics and natural language processing, with a particular focus on learning language through grounding, multilingual NLP, and related topics. Representative work includes the grounded syntax and semantics learners, the contextualized bilingual lexicon inducer, and the substructure-based zero-shot cross-lingual dependency parser. Recently, I have also searched for evidence of semantics encoded in large language models, finding results both in support of and in opposition to Bender & Koller (2020). For more details, see my research topics and academic CV.

Publications

(*: Equal Contribution)

Audio-Visual Neural Syntax Acquisition
Cheng-I Jeff Lai*, Freda Shi*, Puyuan Peng*, Yoon Kim, Kevin Gimpel, Shiyu Chang, Yung-Sung Chuang, Saurabhchand Bhati, David Cox, David Harwath, Yang Zhang, Karen Livescu, James Glass

ASRU 2023 Paper / Code / arXiv

NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation
Kaustubh D. Dhole et al.

NEJLT 2023 Paper / Code / arXiv

Large Language Models Can Be Easily Distracted by Irrelevant Context
Freda Shi*, Xinyun Chen*, Kanishka Misra, Nathan Scales, David Dohan, Ed H. Chi, Nathanael Schärli, Denny Zhou

ICML 2023 Paper / arXiv / Data

Language Models are Multilingual Chain-of-Thought Reasoners
Freda Shi*, Mirac Suzgun*, Markus Freitag, Xuezhi Wang, Suraj Srivats, Soroush Vosoughi, Hyung Won Chung, Yi Tay, Sebastian Ruder, Denny Zhou, Dipanjan Das, Jason Wei

ICLR 2023 Paper / arXiv / Data

InCoder: A Generative Model for Code Infilling and Synthesis
Daniel Fried*, Armen Aghajanyan*, Jessy Lin, Sida Wang, Eric Wallace, Freda Shi, Ruiqi Zhong, Wen-tau Yih, Luke Zettlemoyer, Mike Lewis

ICLR 2023 Paper / Code / arXiv

Natural Language to Code Translation with Execution
Freda Shi, Daniel Fried, Marjan Ghazvininejad, Luke Zettlemoyer, Sida I. Wang

EMNLP 2022 Paper / Code / arXiv

Substructure Distribution Projection for Zero-Shot Cross-Lingual Dependency Parsing
Freda Shi, Kevin Gimpel, Karen Livescu

ACL 2022 Paper / Code / arXiv

Deep Clustering of Text Representations for Supervision-Free Probing of Syntax
Vikram Gupta, Haoyue Shi, Kevin Gimpel, Mrinmaya Sachan

AAAI 2022 Paper / arXiv

Grammar-Based Grounded Lexicon Learning
Jiayuan Mao, Haoyue Shi, Jiajun Wu, Roger P. Levy, Joshua B. Tenenbaum

NeurIPS 2021 Paper / arXiv / Project Page

Bilingual Lexicon Induction via Unsupervised Bitext Construction and Word Alignment
Haoyue Shi, Luke Zettlemoyer, Sida I. Wang

ACL-IJCNLP 2021 (Best Paper Nominee) Paper / Code / arXiv

Substructure Substitution: Structured Data Augmentation for NLP
Haoyue Shi, Karen Livescu, Kevin Gimpel

Findings of ACL-IJCNLP 2021 Paper / Code / arXiv

On the Role of Supervision in Unsupervised Constituency Parsing
Haoyue Shi, Karen Livescu, Kevin Gimpel

EMNLP 2020 Paper / arXiv

A Cross-Task Analysis of Text Span Representations
Shubham Toshniwal, Haoyue Shi, Bowen Shi, Lingyu Gao, Karen Livescu, Kevin Gimpel

RepL4NLP 2020 Paper / Code / arXiv

Visually Grounded Neural Syntax Acquisition
Haoyue Shi*, Jiayuan Mao*, Kevin Gimpel, Karen Livescu

ACL 2019 (Best Paper Nominee) Paper / Code / arXiv

On Tree-Based Neural Sentence Modeling
Haoyue Shi, Hao Zhou, Jiaze Chen, Lei Li

EMNLP 2018 Paper / Code / arXiv

On Multi-Sense Word Embeddings via Matrix Factorization and Matrix Transformation
Haoyue Shi

B.S. Thesis (in Simplified Chinese), Peking University School of EECS, May 2018 Paper
Best Undergraduate Dissertation Award, PKU School of EECS

Learning Visually-Grounded Semantics from Contrastive Adversarial Samples
Haoyue Shi*, Jiayuan Mao*, Tete Xiao*, Yuning Jiang, Jian Sun

COLING 2018 Paper / Code / arXiv

Constructing High Quality Sense-Specific Corpus and Word Embedding via Unsupervised Elimination of Pseudo Multi-Sense
Haoyue Shi, Xihao Wang, Yuqi Sun, Junfeng Hu

LREC 2018 Paper / Code

Joint Saliency Estimation and Matching using Image Regions for Geo-Localization of Online Video
Haoyue Shi, Jia Chen, Alexander G. Hauptmann

ICMR 2017 Paper

Real Multi-Sense or Pseudo Multi-Sense: An Approach to Improve Word Representation
Haoyue Shi, Caihua Li, Junfeng Hu

COLING Workshop CL4LC 2016 Paper / arXiv

Working Papers

Structured Tree Alignment for Evaluation of Constituency Parsing
Freda Shi, Kevin Gimpel, Karen Livescu