Linearizing Transformer with Key-Value Memory Bank

AI-generated keywords: MemSizer

AI-generated Key Points

The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.

  • The paper introduces MemSizer, an approach to reducing the computational overhead of the vanilla transformer in NLP tasks.
  • The vanilla transformer's computational cost scales quadratically with sequence length.
  • Earlier work such as Linformer achieves linear time complexity but is poorly suited to text generation, because the sequence length must be specified in advance.
  • MemSizer takes a different perspective on the attention mechanism and projects the source sequence into a lower-dimensional representation.
  • MemSizer accepts input sequences of dynamic length, which makes it better suited to text generation tasks.
  • MemSizer achieves linear time complexity and supports efficient recurrent-style autoregressive generation.
  • Recurrent-style generation yields constant memory complexity and reduced computation at inference.
  • In language modeling and machine translation tasks, MemSizer offers an improved tradeoff between efficiency and accuracy over the vanilla transformer and other linear variants.
  • MemSizer is presented as an efficient alternative to the vanilla transformer that relies on a key-value memory bank and supports dynamic lengths for text generation.
  • The experimental results are reported to show a better efficiency-accuracy tradeoff than existing approaches.
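
The contrast between quadratic attention and a Linformer-style projection in the key points above can be illustrated with a small NumPy sketch. This is an illustration under stated assumptions, not code from the paper: the function names, the random projection matrices E and F, and the sizes n, d, k are all chosen here for the example. It shows that the full score matrix is n x n, while the projected one is n x k, and that the projection matrices have the sequence length n built into their shape.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def full_attention(Q, K, V):
    # Scores are (n, n): cost grows quadratically with sequence length n.
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    return softmax(scores) @ V, scores.shape

def linformer_style_attention(Q, K, V, E, F):
    # E and F have shape (k, n): they compress the n keys/values to k rows.
    # Because n is part of their shape, n must be known in advance.
    K_proj = E @ K                                  # (k, d)
    V_proj = F @ V                                  # (k, d)
    scores = Q @ K_proj.T / np.sqrt(Q.shape[-1])    # (n, k)
    return softmax(scores) @ V_proj, scores.shape

rng = np.random.default_rng(0)
n, d, k = 128, 16, 8          # illustrative sizes only
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
E, F = rng.standard_normal((k, n)), rng.standard_normal((k, n))

out_full, shape_full = full_attention(Q, K, V)
out_lin, shape_lin = linformer_style_attention(Q, K, V, E, F)
print(shape_full, shape_lin)  # (128, 128) (128, 8)
```

Passing a sequence of length n + 1 to `linformer_style_attention` with the same E and F raises a shape error, which is the fixed-length limitation the key points describe.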

Authors: Yizhe Zhang, Deng Cai

Work in progress

Abstract: The Transformer has brought great success to a wide range of natural language processing tasks. Nevertheless, the computational overhead of the vanilla transformer scales quadratically with sequence length. Many efforts have been made to develop more efficient transformer variants. A line of work (e.g., Linformer) projects the input sequence into a low-rank space, achieving linear time complexity. However, Linformer is not well suited to text generation tasks, as the sequence length must be pre-specified. We propose MemSizer, an approach that also projects the source sequence into a lower-dimensional representation but can take input of dynamic length, based on a different perspective of the attention mechanism. MemSizer not only achieves the same linear time complexity but also enjoys efficient recurrent-style autoregressive generation, which yields constant memory complexity and reduced computation at inference. We demonstrate that MemSizer provides an improved tradeoff between efficiency and accuracy over the vanilla transformer and other linear variants in language modeling and machine translation tasks, revealing a viable direction towards further inference efficiency improvement.
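
To make "recurrent-style autoregressive generation" with constant memory concrete, here is a minimal sketch of generic kernelized linear attention computed as a recurrence. It is an assumption-laden illustration and not MemSizer's own formulation: the feature map `phi` (elu(x) + 1), the function names, and the sizes are choices made for this example. The point is only that the running state (S, z) has a size independent of the number of tokens processed so far.

```python
import numpy as np

def phi(x):
    # Positive feature map elu(x) + 1 (an assumed, commonly used choice).
    return np.where(x > 0, x + 1.0, np.exp(x))

def causal_linear_attention_recurrent(Q, K, V):
    n, d = Q.shape
    S = np.zeros((d, V.shape[1]))  # running sum of phi(k_i) v_i^T
    z = np.zeros(d)                # running sum of phi(k_i)
    out = np.empty_like(V)
    for t in range(n):
        q, k = phi(Q[t]), phi(K[t])
        S += np.outer(k, V[t])     # state update costs O(d * d_v) per token
        z += k
        out[t] = (q @ S) / (q @ z) # read-out uses only the fixed-size state
    return out

def causal_linear_attention_parallel(Q, K, V):
    # Reference: the same result via an explicit (n, n) causal weight matrix.
    A = np.tril(phi(Q) @ phi(K).T)
    return (A / A.sum(axis=1, keepdims=True)) @ V

rng = np.random.default_rng(1)
Q, K, V = (rng.standard_normal((10, 4)) for _ in range(3))
rec = causal_linear_attention_recurrent(Q, K, V)
par = causal_linear_attention_parallel(Q, K, V)
print(np.allclose(rec, par))
```

The recurrent form never materializes an n x n matrix, so memory at inference stays constant in the sequence length, while the parallel form is kept here only as a correctness check.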

Submitted to arXiv on 23 Mar. 2022

Results of the summarizing process for the arXiv paper: 2203.12644v1

The paper titled "Linearizing Transformer with Key-Value Memory Bank" by Yizhe Zhang and Deng Cai introduces MemSizer, a new approach for addressing the computational overhead of the vanilla transformer in natural language processing tasks. The vanilla transformer has achieved great success, but its complexity scales quadratically with sequence length. Previous work such as Linformer has attempted to overcome this limitation by projecting the input sequence into a low-rank space, achieving linear time complexity. However, Linformer is not suitable for text generation tasks as it requires pre-specification of the sequence length.

In contrast, MemSizer proposes a different perspective on the attention mechanism and projects the source sequence into a lower-dimensional representation. What sets MemSizer apart is its ability to handle input sequences with dynamic lengths, making it more suitable for text generation tasks. Similar to Linformer, MemSizer achieves linear time complexity, but it also offers efficient recurrent-style autoregressive generation. This results in constant memory complexity and reduced computation during inference.

The authors demonstrate that MemSizer strikes an improved balance between efficiency and accuracy compared to both the vanilla transformer and other linear variants in language modeling and machine translation tasks. This highlights MemSizer as a viable direction for further improving inference efficiency in natural language processing. Overall, this paper presents MemSizer as an efficient alternative to the vanilla transformer by leveraging key-value memory banks and offering dynamic length support for text generation tasks. The experimental results showcase its effectiveness in achieving better tradeoffs between efficiency and accuracy compared to existing approaches.
Created on 10 Feb. 2024
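
The summary's point about dynamic lengths can be pictured with a generic fixed-size memory sketch. This is not the paper's equations, which are not available from the metadata; it is an assumed illustration in the style of inducing-point or latent-bottleneck attention, with hypothetical names (`memory_attention`, `K_mem`) and sizes. A fixed set of m slot keys first summarizes a source of any length into m slot values, and queries then read from those m slots, so cost grows linearly with sequence length.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def memory_attention(X_src, Q, K_mem):
    # K_mem: (m, d) hypothetical slot keys whose shape does not depend
    # on the source length, unlike a (k, n) projection matrix.
    d = K_mem.shape[1]
    # Step 1: each slot pools the source, (m, n_src) weights -> (m, d) values.
    V_mem = softmax(K_mem @ X_src.T / np.sqrt(d)) @ X_src
    # Step 2: each query reads from the m slots, (n_q, m) weights.
    return softmax(Q @ K_mem.T / np.sqrt(d)) @ V_mem

rng = np.random.default_rng(3)
m, d = 8, 16                          # illustrative sizes only
K_mem = rng.standard_normal((m, d))   # stands in for learned parameters
Q = rng.standard_normal((5, d))

# The same K_mem handles sources of different lengths.
out_short = memory_attention(rng.standard_normal((20, d)), Q, K_mem)
out_long = memory_attention(rng.standard_normal((300, d)), Q, K_mem)
print(out_short.shape, out_long.shape)  # (5, 16) (5, 16)
```

Both score matrices have one dimension fixed at m, so the work is O((n_src + n_q) * m * d) rather than O(n_src * n_q * d).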
