Project results – Implementation of Factored Translation

Code and documentation for using Marian with factors have been published. With this improvement, users can now train a model with source-side and/or target-side factors. Factors carry side information about a token in a sentence and can be used for a variety of purposes, including forced translation. In neural machine translation, factors have been used successfully to annotate linguistic features such as morphology, part of speech, and syntax on both the source and target sides. Factors can also encode positional information in the Transformer model. Training with factors requires the data to be formatted in a specific way, along with a special vocabulary file format.
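As an illustration of what attaching a factor to each token means, the following minimal Python sketch annotates every token of a sentence with a capitalization factor. The function name add_case_factor, the factor labels (ci, ca, cn), and the "|" separator are assumptions made for this example only; the exact data and vocabulary formats required for training are specified in the published Marian documentation.

```python
def add_case_factor(sentence: str, sep: str = "|") -> str:
    """Lowercase each token and append a capitalization factor.

    Hypothetical factor labels used in this sketch:
      ca = token was all caps, ci = initial capital, cn = no capitals.
    """
    annotated = []
    for token in sentence.split():
        if len(token) > 1 and token.isupper():
            factor = "ca"
        elif token[:1].isupper():
            factor = "ci"
        else:
            factor = "cn"
        # The surface form is normalized; the factor keeps the lost information.
        annotated.append(f"{token.lower()}{sep}{factor}")
    return " ".join(annotated)


if __name__ == "__main__":
    print(add_case_factor("Trump tested positive for COVID-19"))
```

In this sketch the vocabulary only needs the lowercased forms, while the factor carries the casing, which is the kind of side information the paragraph above describes.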