Steering Large Language Models for Machine Translation with Finetuning and In-Context Learning

AI-generated keywords: Large language models (LLMs)

AI-generated Key Points

The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.

  • Large language models (LLMs) are a promising avenue for machine translation (MT), but current LLM-based MT systems face challenges
  • These systems are brittle: their effectiveness depends heavily on the choice of few-shot examples
  • They often require extra post-processing because of overgeneration
  • Finetuning on translation instructions is computationally expensive and may weaken in-context learning through overspecialization
  • The paper shows that adapter-based finetuning with LoRA matches the performance of traditional finetuning
  • LoRA reduces the number of training parameters by a factor of 50
  • It outperforms few-shot prompting and eliminates the need for post-processing or in-context examples
  • However, finetuning generally degrades few-shot performance, which hinders adaptation capabilities
  • The authors propose incorporating few-shot examples during finetuning to overcome this limitation
  • Experiments on 10 language pairs show that this approach recovers the original few-shot capabilities while keeping the benefits of finetuning
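
To make the parameter saving of LoRA concrete, here is a minimal sketch in plain Python of a low-rank adapter on a single weight matrix. The dimensions, rank, and scaling value are hypothetical and are not taken from the paper; the per-matrix ratio computed here is only illustrative and is not the factor of 50 reported for the full setup.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def matvec(m: Matrix, v: List[float]) -> List[float]:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def lora_forward(W: Matrix, A: Matrix, B: Matrix, x: List[float],
                 alpha: float, rank: int) -> List[float]:
    """Compute W x + (alpha / rank) * B (A x).

    W (d_out x d_in) stays frozen; only A (rank x d_in) and
    B (d_out x rank) would be trained.
    """
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]


def count_params(d_out: int, d_in: int, rank: int) -> Tuple[int, int]:
    """Return (trainable params for full finetuning, for LoRA) of one matrix."""
    full = d_out * d_in
    lora = rank * (d_in + d_out)
    return full, lora


# Hypothetical sizes, chosen only for illustration.
full, lora = count_params(d_out=4096, d_in=4096, rank=8)
print(full, lora, full // lora)
```

With B initialized to zeros, as is standard for LoRA, the adapted layer starts out identical to the frozen base layer, so training begins from the pretrained model's behavior.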

Authors: Duarte M. Alves, Nuno M. Guerreiro, João Alves, José Pombal, Ricardo Rei, José G. C. de Souza, Pierre Colombo, André F. T. Martins

Accepted at EMNLP 2023 - Findings

Abstract: Large language models (LLMs) are a promising avenue for machine translation (MT). However, current LLM-based MT systems are brittle: their effectiveness highly depends on the choice of few-shot examples and they often require extra post-processing due to overgeneration. Alternatives such as finetuning on translation instructions are computationally expensive and may weaken in-context learning capabilities, due to overspecialization. In this paper, we provide a closer look at this problem. We start by showing that adapter-based finetuning with LoRA matches the performance of traditional finetuning while reducing the number of training parameters by a factor of 50. This method also outperforms few-shot prompting and eliminates the need for post-processing or in-context examples. However, we show that finetuning generally degrades few-shot performance, hindering adaptation capabilities. Finally, to obtain the best of both worlds, we propose a simple approach that incorporates few-shot examples during finetuning. Experiments on 10 language pairs show that our proposed approach recovers the original few-shot capabilities while keeping the added benefits of finetuning.

Submitted to arXiv on 20 Oct. 2023

Results of the summarizing process for the arXiv paper: 2310.13448v1

The paper's license does not allow us to build upon its content, so this summary was generated from the paper's metadata rather than the full article.

Large language models (LLMs) have shown promise in machine translation (MT), but current LLM-based MT systems are brittle. Their effectiveness depends heavily on the choice of few-shot examples, and they often require additional post-processing because of overgeneration. Finetuning on translation instructions is an alternative, but it is computationally expensive and may weaken in-context learning capabilities through overspecialization.

The paper takes a closer look at this problem. The authors first demonstrate that adapter-based finetuning with LoRA matches the performance of traditional finetuning while reducing the number of training parameters by a factor of 50. This method also outperforms few-shot prompting and eliminates the need for post-processing or in-context examples. However, they find that finetuning generally degrades few-shot performance, which hinders adaptation capabilities.

To obtain the best of both worlds, the authors propose a simple approach that incorporates few-shot examples during finetuning. Experiments on 10 language pairs show that this approach recovers the original few-shot capabilities while retaining the benefits of finetuning. Taken together, these findings improve both the effectiveness and the efficiency of LLM-based MT systems.
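
The idea of incorporating few-shot examples during finetuning can be sketched as a data-preparation step. The following is an illustrative Python sketch, not the authors' recipe: the prompt template, the `max_shots` parameter, the uniform sampling of the number of examples, and all function names are assumptions introduced here.

```python
import random
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # (source sentence, reference translation)


def build_prompt(src_lang: str, tgt_lang: str, source: str,
                 examples: List[Pair]) -> str:
    """Format optional in-context examples followed by the sentence to translate.

    The template is hypothetical and only serves the illustration.
    """
    lines: List[str] = []
    for ex_src, ex_tgt in examples:
        lines.append(f"{src_lang}: {ex_src}")
        lines.append(f"{tgt_lang}: {ex_tgt}")
    lines.append(f"{src_lang}: {source}")
    lines.append(f"{tgt_lang}:")
    return "\n".join(lines)


def make_training_records(pairs: List[Pair], src_lang: str, tgt_lang: str,
                          max_shots: int = 2, seed: int = 0) -> List[Dict]:
    """Create finetuning records that mix zero-shot and few-shot prompts.

    Each record draws between 0 and max_shots examples from the other pairs,
    so the model sees prompts both with and without in-context examples.
    """
    rng = random.Random(seed)
    records = []
    for i, (src, tgt) in enumerate(pairs):
        pool = pairs[:i] + pairs[i + 1:]  # never use the target pair as its own example
        n_shots = rng.randint(0, min(max_shots, len(pool)))
        shots = rng.sample(pool, n_shots)
        records.append({
            "prompt": build_prompt(src_lang, tgt_lang, src, shots),
            "completion": " " + tgt,
            "n_shots": n_shots,
        })
    return records


pairs = [("Hello", "Hallo"), ("Thank you", "Danke"), ("Good night", "Gute Nacht")]
for record in make_training_records(pairs, "English", "German"):
    print(record["n_shots"], repr(record["prompt"]))
```

Because some records carry in-context examples and others do not, a model finetuned on such data is exposed to both prompt styles, which is the intuition behind recovering few-shot behavior while keeping the zero-shot benefits of finetuning.
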
Created on 07 Feb. 2024
