Ruotian Luo

Software Engineer at Waymo Perception



I am now a Software Engineer at Waymo Perception.

I received my Ph.D. degree from Toyota Technological Institute at Chicago (TTIC) in 2021, advised by Prof. Greg Shakhnarovich. Before that, I received my Bachelor's degree from the Shanghai Jiao Tong University IEEE Honor Class, where I worked with Prof. Yuncai Liu and Prof. Xinbing Wang.

My research interests were in Computer Vision and its combination with Language.

My CV is here.

  • Ph.D. in Computer Science, 2015-2021

    Toyota Technological Institute at Chicago

  • B.Eng. in Computer Science, 2011-2015

    Shanghai Jiao Tong University


2021/9: I am joining Waymo Perception as a Software Engineer.

2021/8: I passed my thesis defense. My thesis can be found here.

2021/5: I am recognized as an “Outstanding Reviewer” for CVPR 2021!

2020/6/1: I will present our project “Controlling Length in Image Captioning” at the VQA workshop this year. The model used is a little behind the times because the work was mostly done a year ago. I picked it up again because it could fit in my thesis.

2020/3/27: Our paper “Detection and Description of Change in Visual Streams” is on arXiv now.

2020/3/22: My technical report “A Better Variant of Self-Critical Sequence Training” is on arXiv now. It is a simple yet effective improvement upon SCST.

2020/1/24: Our paper “Pixel Consensus Voting for Panoptic Segmentation” is accepted by CVPR 2020. The arXiv link is here.

2019/11/10: Our paper “Context-Aware Zero-Shot Recognition” is accepted by AAAI 2020.

2019/10/27: Our PCV team wins the Innovation Award on the COCO panoptic segmentation track of the COCO + Mapillary Challenge Workshop at ICCV 2019.

2019/10/23: I am collaborating with Hang Chu (main contributor) on a podcast, Daily Arxiv Radiostation. Every day, this podcast uses TTS to read the papers (titles and abstracts) that an algorithm picks. Subscriptions are welcome. A Chinese introduction is here.

2019/10/02: Our paper “Analysis of diversity-accuracy tradeoff in image captioning” will be presented at the ICCV 2019 CLVL workshop.

2019/08/01: Our high-resolution RGB-D dataset is released. More details can be found at DIODE. (Unrealistically accurate depth maps and surface normals.)

2019/06/14: I am invited to give a talk at the Conceptual Captions Challenge Workshop at CVPR 2019, representing the challenge-winning team. The slides can be found here.

2019/04/24: Our paper “Context-Aware Zero-Shot Recognition” is on arXiv now. Code is available at [link].

2018/06/11: This year I am doing an internship at Snap Research, working with Linjie Yang, Ning Zhang and Bohyung Han.

2018/02/20: Paper “Discriminability objective for training descriptive captions” is accepted to CVPR 2018. [link]

2017/06/12: I start my internship at Adobe Research, working with Scott Cohen and Brian Price. First time in the Bay Area.

2017/02/27: My recent work “Comprehension-guided referring expressions” is accepted by CVPR 2017. [link]

2015/12/14: I build a blog on GitHub. Link to the blog.

2015/09/21: I start my new life at TTI-C and in Chicago.

2015/04/04: I have accepted the Ph.D. admission offer from Toyota Technological Institute at Chicago. See you in Chicago!


This is how to pronounce my name in Mandarin (in the order of last name + first name). People usually call me RT (the initials of my first name).

My Zhihu account.