Bowen Shi
PhD student
I am a PhD student at TTI-Chicago, advised by Prof. Karen Livescu. I also work with Prof. Greg Shakhnarovich and Prof. Diane Brentari. Before coming to TTIC, I spent three wonderful years at ENSTA ParisTech and UPMC as a master's student in computer science. I obtained my bachelor's degree from Shanghai Jiao Tong University in 2013.
My research interests are in the applications of machine learning to speech recognition and computer vision. In particular, I am interested in sign language and audio-visual speech processing.
A Single Self-Supervised Model for Many Speech Modalities Enables Zero-Shot Modality Transfer
Wei-Ning Hsu and Bowen Shi
arXiv:2207.07036
Open-Domain Sign Language Translation Learned from Online Video
Bowen Shi, Diane Brentari, Greg Shakhnarovich, and Karen Livescu
arXiv:2205.12870
Learning Lip-Based Audio-Visual Speaker Embeddings with AV-HuBERT
Bowen Shi, Abdelrahman Mohamed, and Wei-Ning Hsu
Interspeech 2022
Robust Self-Supervised Audio-Visual Speech Recognition
Bowen Shi, Wei-Ning Hsu, and Abdelrahman Mohamed
Interspeech 2022
Searching for fingerspelled content in American Sign Language
Bowen Shi, Diane Brentari, Greg Shakhnarovich, and Karen Livescu
ACL 2022
Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction
Bowen Shi, Wei-Ning Hsu, Kushal Lakhotia, and Abdelrahman Mohamed
ICLR 2022
Fingerspelling Detection in American Sign Language
Bowen Shi, Diane Brentari, Greg Shakhnarovich, and Karen Livescu
CVPR 2021
Whole-word segmental speech recognition with acoustic word embeddings
Bowen Shi, Shane Settle, and Karen Livescu
SLT 2021
A Joint Framework for Audio Tagging and Weakly Supervised Acoustic Event Detection Using DenseNet with Global Average Pooling
Chieh-Chi Kao, Bowen Shi, Ming Sun, and Chao Wang
Interspeech 2020
Few-shot acoustic event detection via meta-learning
Bowen Shi, Ming Sun, Krishna C. Puvvada, Chieh-Chi Kao, Spyros Matsoukas, and Chao Wang
ICASSP 2020
Fingerspelling recognition in the wild with iterative visual attention
Bowen Shi, Aurora Martinez Del Rio, Jonathan Keane, Diane Brentari, Greg Shakhnarovich, and Karen Livescu
ICCV 2019
Compression of acoustic event detection models with quantized distillation
Bowen Shi, Ming Sun, Chieh-Chi Kao, Viktor Rozgic, Spyros Matsoukas, and Chao Wang
Interspeech 2019
On the contributions of visual and textual supervision in low-resource semantic speech retrieval
Ankita Pasad, Bowen Shi, Herman Kamper, and Karen Livescu
Interspeech 2019
Semi-supervised acoustic event detection based on tri-training
Bowen Shi, Ming Sun, Chieh-Chi Kao, Viktor Rozgic, Spyros Matsoukas, and Chao Wang
ICASSP 2019
Compression of acoustic event detection models with low-rank matrix factorization and quantization Training
Bowen Shi, Ming Sun, Chieh-Chi Kao, Viktor Rozgic, Spyros Matsoukas, and Chao Wang
NeurIPS 2018 CDNNRIA workshop
American Sign Language fingerspelling recognition in the wild
Bowen Shi, Aurora Martinez Del Rio, Jonathan Keane, Jonathan Michaux, Diane Brentari, Greg Shakhnarovich, and Karen Livescu
SLT 2018
Multitask training with unlabeled data for end-to-end sign language fingerspelling recognition
Bowen Shi and Karen Livescu
ASRU 2017
Offloading guidelines for augmented reality applications on wearable devices
Bowen Shi, Ji Yang, Zhanpeng Huang, and Pan Hui
ACM Multimedia 2015
TTIC 31220: Unsupervised learning and data analysis, Winter 2019