"An alternative approach to training Sequence-to-Sequence model for Mac" by Vivek Sah

Author

Vivek Sah, Colby College

Date of Award

2017

Document Type

Honors Thesis (Open Access)

Department

Colby College. Computer Science Dept.

Advisor(s)

Stephanie Taylor

Abstract

Machine translation is a widely researched topic in the field of Natural Language Processing, and recently neural network models have been shown to be very effective at this task. The sequence-to-sequence model learns to map an input sequence in one language to a vector of fixed dimensionality and then to map that vector to an output sequence in another language, without any human intervention, provided there is enough training data. Focusing on English-French translation, in this paper I present a way to simplify the learning process by replacing each English input sentence with a word-by-word translation of that sentence. I found that this approach improves the performance of a 3-layer-deep sequence-to-sequence model with a bidirectional LSTM encoder by more than 30% on the same dataset.
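The preprocessing idea described in the abstract can be sketched as a simple dictionary lookup: each English word is replaced by a French gloss, and the sequence-to-sequence model then only has to learn to reorder and correct the glossed sequence rather than translate from scratch. This is a minimal illustration, not the thesis implementation; the tiny `en_fr` dictionary and the `word_by_word` helper are hypothetical stand-ins for the bilingual lexicon the thesis would use.

```python
# Hypothetical stand-in for a bilingual lexicon (not from the thesis).
en_fr = {
    "the": "le", "cat": "chat", "eats": "mange", "fish": "poisson",
    "i": "je", "love": "aime", "music": "musique",
}

def word_by_word(sentence: str) -> str:
    """Replace each English word with its dictionary gloss.

    Unknown words are passed through unchanged, so the downstream
    sequence-to-sequence model still sees a complete input sequence.
    """
    return " ".join(en_fr.get(word, word) for word in sentence.lower().split())

print(word_by_word("The cat eats fish"))   # le chat mange poisson
print(word_by_word("I love music"))        # je aime musique
```

The second example shows why the neural model is still needed: the correct French is "j'aime la musique", so the glossed input is only a rough draft that the model learns to repair.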

Keywords

deep learning, translation, sequence to sequence, neural networks, recurrent neural networks, machine translation
