Ruotian Luo

Software Engineer at Waymo Perception



I am now a Software Engineer at Waymo Perception.

I received my Ph.D. degree from Toyota Technological Institute at Chicago (TTIC) in 2021, advised by Prof. Greg Shakhnarovich. Before that, I received my Bachelor's degree from the IEEE Honor Class at Shanghai Jiao Tong University, where I worked with Prof. Yuncai Liu and Prof. Xinbing Wang.

My research interests involved Computer Vision and its combination with Language.

My CV is here.

  • Ph.D. in Computer Science, 2015-2021

    Toyota Technological Institute at Chicago

  • B.Eng. in Computer Science, 2011-2015

    Shanghai Jiao Tong University


2021/9: I am joining Waymo Perception as a Software Engineer.

2021/8: I passed my thesis defense. My thesis is available here.

2021/5: I was recognized as an “Outstanding Reviewer” for CVPR 2021!

2020/6/1: I will present our project “Controlling Length in Image Captioning” at the VQA workshop this year. The model used is a little behind the times because the work was mostly done a year ago. I picked it up again because it fits into my thesis.

2020/3/27: Our paper “Detection and Description of Change in Visual Streams” is on arxiv now.

2020/3/22: My technical report “A Better Variant of Self-Critical Sequence Training” is on arxiv now. It is a simple yet effective improvement upon SCST.

2020/1/24: Our paper “Pixel Consensus Voting for Panoptic Segmentation” is accepted by CVPR 2020. Arxiv link is here.

2019/11/10: Our paper “Context-Aware Zero-Shot Recognition” is accepted by AAAI 2020.

2019/10/27: Our PCV team won the Innovation Award in the COCO panoptic segmentation track of the COCO + Mapillary Challenge Workshop at ICCV 2019.

2019/10/23: I am collaborating with Hang Chu (main contributor) on a podcast, Daily Arxiv Radiostation. The podcast uses TTS to read the papers (titles and abstracts) that an algorithm picks every day. Subscriptions are welcome. Chinese introduction here.

2019/10/02: Our paper “Analysis of diversity-accuracy tradeoff in image captioning” will be presented at the ICCV 2019 CLVL workshop.

2019/08/01: Our high-resolution RGB-D dataset is released. More details can be found at DIODE. (Unrealistically accurate depth maps and surface normals.)

2019/06/14: I am invited to give a talk at the Conceptual Captions Challenge Workshop at CVPR 2019 on behalf of the challenge-winning team. The slides can be found here.

2019/04/24: Our paper “Context-Aware Zero-Shot Recognition” is on arxiv now. Code is available at [link].

2018/06/11: This year I am interning at Snap Research, working with Linjie Yang, Ning Zhang and Bohyung Han.

2018/02/20: Paper “Discriminability objective for training descriptive captions” is accepted to CVPR 2018. [link]

2017/06/12: I am starting my internship at Adobe Research, working with Scott Cohen and Brian Price. My first time in the Bay Area.

2017/02/27: My recent work “Comprehension-guided referring expressions” is accepted by CVPR 2017. [link]

2015/12/14: I built a blog on GitHub. Link to the blog.

2015/09/21: I am starting my new life at TTI-C and in Chicago.

2015/04/04: I have accepted the Ph.D. admission offer from Toyota Technological Institute at Chicago. See you in Chicago!


(2021). Goal-driven text descriptions for images. Thesis.


(2020). Controlling Length in Image Captioning. CVPRW.

PDF Code Slides Video

(2020). Detection and Description of Change in Visual Streams. arxiv.


(2020). Pixel Consensus Voting for Panoptic Segmentation. CVPR.

PDF Code Project Slides

(2020). Context-Aware Zero-Shot Recognition. AAAI.

PDF Code Poster

(2019). DIODE: A Dense Indoor and Outdoor DEpth Dataset. arxiv preprint.

PDF Dataset Project

(2018). Discriminability objective for training descriptive captions. CVPR.

PDF Code Poster Slides

(2018). A Multi-task Learning Approach for Image Captioning. IJCAI.

PDF Code

(2017). Comprehension-guided referring expressions. CVPR.

PDF Poster

(2016). Person Re-identification by encoding free energy feature maps. Multimedia Tools and Applications.


(2014). Are we still friends: Kernel multivariate survival analysis. GLOBECOM.




This is how to pronounce my name in Mandarin (in the order of last name + first name). People usually call me RT (the initials of my first name).

My Zhihu account.