Workshop on Machine Learning in Speech and Language Processing

September 13, 2016
San Francisco, CA, USA
Speaker: Jan Chorowski (University of Wrocław)

Title: End-to-end approaches to speech recognition and language processing

Abstract:
End-to-end techniques solve complex machine learning tasks by building models that are trained by optimizing a single joint loss criterion, so that all of a model's components collaborate to solve the task at hand. In this talk I will present attention-based recurrent neural networks that directly transcribe speech features into sequences of phonemes or characters. The networks learn the alignment between the speech and its transcription and are trained directly to maximize the probability of the correct transcription. I will show the advantages of this family of neural networks and the challenges, such as language model integration, involved in applying it successfully. I will conclude the talk with a review of other applications of attention-based recurrent networks in NLP, such as parsing.