Neural Machine Translation of Rare Words with Subword Units

AI-generated keywords: Neural Machine Translation, Subword Units, Word Segmentation, Rare Words, Unknown Words

AI-generated Key Points

  • Authors propose a new approach for translating rare and unknown words in NMT models
  • Certain word classes can be translated more effectively through smaller units rather than whole words
  • Different word segmentation techniques are discussed
  • Subword models outperform a back-off dictionary baseline in English-German and English-Russian translation tasks
  • Analysis of 100 rare tokens in German training data shows that the majority can potentially be translated using smaller units
  • Segmenting rare words into appropriate subword units is sufficient for the NMT model to learn transparent translations and generalize this knowledge to translate unseen words
  • Related work on handling unknown words in statistical machine translation is discussed
  • Proposed approach offers a simpler and more effective solution for translating rare and unknown words in NMT models

Authors: Rico Sennrich, Barry Haddow, Alexandra Birch

accepted at ACL 2016; new in this version: figure 3
License: CC BY 4.0

Abstract: Neural machine translation (NMT) models typically operate with a fixed vocabulary, but translation is an open-vocabulary problem. Previous work addresses the translation of out-of-vocabulary words by backing off to a dictionary. In this paper, we introduce a simpler and more effective approach, making the NMT model capable of open-vocabulary translation by encoding rare and unknown words as sequences of subword units. This is based on the intuition that various word classes are translatable via smaller units than words, for instance names (via character copying or transliteration), compounds (via compositional translation), and cognates and loanwords (via phonological and morphological transformations). We discuss the suitability of different word segmentation techniques, including simple character n-gram models and a segmentation based on the byte pair encoding compression algorithm, and empirically show that subword models improve over a back-off dictionary baseline for the WMT 15 translation tasks English-German and English-Russian by 1.1 and 1.3 BLEU, respectively.
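
To make the byte pair encoding idea mentioned in the abstract more concrete, the following is a minimal sketch of how merge operations could be learned from a word-frequency table and then used to segment a word. It is an illustration under stated assumptions, not the authors' implementation: the function names (learn_bpe, merge_pair, segment), the "</w>" end-of-word marker, and the toy word counts are all hypothetical choices made for this example.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(word_freqs, num_merges):
    """Learn up to `num_merges` merge operations from a {word: count} table.

    Each word starts as a sequence of characters plus an end-of-word
    marker ("</w>", an assumption of this sketch). At every step the most
    frequent adjacent symbol pair is merged into a new symbol.
    """
    vocab = {tuple(word) + ("</w>",): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single symbol
        # Ties are broken by first occurrence; other tie-breaking rules are possible.
        best = max(pairs, key=pairs.get)
        merges.append(best)
        vocab = {merge_pair(symbols, best): freq for symbols, freq in vocab.items()}
    return merges


def segment(word, merges):
    """Segment a (possibly unseen) word by replaying the learned merges in order."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


# Toy counts, invented for illustration only.
toy_counts = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
toy_merges = learn_bpe(toy_counts, num_merges=10)
```

Calling segment("lowest", toy_merges) splits a word that never appeared in the toy counts into symbols learned from the other words, and joining those symbols always gives back the original word plus the end-of-word marker. This is the property that lets a fixed symbol vocabulary cover an open vocabulary of words. The number of merges acts as the main knob: fewer merges give segmentations closer to characters, more merges give segmentations closer to whole words.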

Submitted to arXiv on 31 Aug. 2015


Results of the summarizing process for the arXiv paper: 1508.07909v5

In this paper, the authors propose a new approach for translating rare and unknown words in neural machine translation (NMT) models. They argue that certain word classes, such as names, compounds, cognates, and loanwords, can be translated more effectively through units smaller than whole words, and they discuss different word segmentation techniques, including simple character n-gram models and a segmentation based on byte pair encoding. They show empirically that subword models outperform a back-off dictionary baseline on the WMT 15 English-German and English-Russian translation tasks, by 1.1 and 1.3 BLEU respectively. The authors also analyze 100 rare tokens in their German training data and find that the majority of these tokens can potentially be translated from English using smaller units. The paper provides empirical support for the hypothesis that segmenting rare words into appropriate subword units is sufficient for the NMT model to learn transparent translations and to generalize this knowledge to unseen words. Related work on handling unknown words in statistical machine translation is also discussed. Overall, the proposed approach offers a simpler and more effective solution for translating rare and unknown words in NMT models.
Created on 20 Nov. 2023
