A Better Variant of Self-Critical Sequence Training

Ruotian Luo

March 2020

Abstract

In this work, we present a simple variant of Self-Critical Sequence Training (SCST) that outperforms the original. The only change is the choice of baseline function in the REINFORCE algorithm. Compared with the greedy-decoding baseline, the new baseline yields better performance at no extra cost.
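The abstract does not say what the new baseline is, so the following is only a minimal sketch under a stated assumption: that the baseline for each sampled sequence is the mean reward of the other sequences sampled for the same input (a leave-one-out mean). The function names and the toy rewards are hypothetical and chosen for illustration; they are not taken from the report.

```python
from typing import List


def greedy_baseline_advantages(sample_rewards: List[float],
                               greedy_reward: float) -> List[float]:
    """Self-critical baseline: subtract the reward of the greedy-decoded
    sequence from the reward of every sampled sequence."""
    return [r - greedy_reward for r in sample_rewards]


def leave_one_out_advantages(sample_rewards: List[float]) -> List[float]:
    """ASSUMED variant: the baseline for sample i is the mean reward of
    the other samples drawn for the same input, so no greedy decoding
    pass is needed."""
    k = len(sample_rewards)
    if k < 2:
        raise ValueError("need at least two samples per input")
    total = sum(sample_rewards)
    return [r - (total - r) / (k - 1) for r in sample_rewards]


# Toy rewards for three sampled sequences of one input (made-up values).
rewards = [1.0, 0.5, 0.0]
scst_adv = greedy_baseline_advantages(rewards, greedy_reward=0.5)
loo_adv = leave_one_out_advantages(rewards)
```

In a REINFORCE update, each advantage would weight the gradient of the log-probability of its sampled sequence. Under this assumption the baseline reuses rewards that are already computed for the samples, which is one way a baseline could come at no extra cost relative to running a separate greedy decode.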

Type

Report

Publication

Technical Report

Ruotian Luo

Software Engineer at Waymo Perception

My research interests include computer vision, natural language processing, and artificial intelligence.