Transformers are Sample Efficient World Models

AI-generated keywords: IRIS, Transformers, Reinforcement Learning, Autoencoder, Autoregressive

AI-generated Key Points

The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.

  • Deep reinforcement learning agents have been limited in their application to real-world problems due to sample inefficiency
  • Model-based methods have been designed to address this issue, with learning in the imagination of a world model being one of the most prominent approaches
  • Ensuring that the world model is accurate over extended periods of time has been a challenge
  • IRIS is a data-efficient agent that learns in a world model composed of a discrete autoencoder and an autoregressive Transformer
  • With the equivalent of only two hours of gameplay on the Atari 100k benchmark, IRIS achieves a mean human normalized score of 1.046, outperforms humans on 10 out of 26 games, sets a new state of the art for methods without lookahead search, and even surpasses MuZero
  • The design is motivated by the success of Transformers in sequence modeling tasks: the discrete autoencoder and the autoregressive Transformer together form the world model in which the agent learns
  • The authors release their codebase at https://github.com/eloialonso/iris to foster future research on Transformers and world models for sample-efficient reinforcement learning.
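The "two hours of gameplay" figure in the key points can be checked with simple arithmetic. The sketch below assumes the usual Atari 100k convention of 100,000 agent steps with a frame skip of 4, and an emulator running at 60 frames per second; these constants are not stated on this page and are assumptions here.

```python
# Back-of-the-envelope check of the "two hours of gameplay" figure.
# Assumed constants (not stated in the source): frame skip of 4, 60 fps.
AGENT_STEPS = 100_000
FRAME_SKIP = 4
FRAMES_PER_SECOND = 60


def gameplay_hours(agent_steps: int, frame_skip: int, fps: int) -> float:
    """Convert an agent-step budget into hours of real-time gameplay."""
    frames = agent_steps * frame_skip
    return frames / fps / 3600


hours = gameplay_hours(AGENT_STEPS, FRAME_SKIP, FRAMES_PER_SECOND)
print(f"{hours:.2f} hours")
```

Under these assumptions the budget comes to roughly 1.85 hours, which is consistent with the "two hours" quoted above.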

Authors: Vincent Micheli, Eloi Alonso, François Fleuret

Abstract: Deep reinforcement learning agents are notoriously sample inefficient, which considerably limits their application to real-world problems. Recently, many model-based methods have been designed to address this issue, with learning in the imagination of a world model being one of the most prominent approaches. However, while virtually unlimited interaction with a simulated environment sounds appealing, the world model has to be accurate over extended periods of time. Motivated by the success of Transformers in sequence modeling tasks, we introduce IRIS, a data-efficient agent that learns in a world model composed of a discrete autoencoder and an autoregressive Transformer. With the equivalent of only two hours of gameplay in the Atari 100k benchmark, IRIS achieves a mean human normalized score of 1.046, and outperforms humans on 10 out of 26 games. Our approach sets a new state of the art for methods without lookahead search, and even surpasses MuZero. To foster future research on Transformers and world models for sample-efficient reinforcement learning, we release our codebase at https://github.com/eloialonso/iris.

Submitted to arXiv on 01 Sep. 2022
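The abstract reports a mean human normalized score. A minimal sketch of how such a score is commonly computed follows, assuming the standard definition (agent score minus random score, divided by human score minus random score). The game names and raw scores below are hypothetical placeholders chosen for easy arithmetic, not results from the paper.

```python
from statistics import mean


def human_normalized_score(agent: float, random_score: float, human: float) -> float:
    """0.0 matches a random policy, 1.0 matches the human reference."""
    return (agent - random_score) / (human - random_score)


# Hypothetical raw scores per game: (agent, random, human). Not from the paper.
scores = {
    "game_a": (30.0, 10.0, 20.0),
    "game_b": (15.0, 5.0, 25.0),
    "game_c": (8.0, 0.0, 16.0),
}

hns = {game: human_normalized_score(*raw) for game, raw in scores.items()}
mean_hns = mean(hns.values())

# A game counts as "outperforming humans" when its normalized score exceeds 1.
superhuman = [game for game, value in hns.items() if value > 1.0]

print(hns, mean_hns, superhuman)
```

In this toy example the mean reaches 1.0 although only one of three games is above human level, which illustrates why a mean above 1 and "10 out of 26 games" can both hold: the mean is sensitive to a few large per-game scores.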


Results of the summarizing process for the arXiv paper: 2209.00588v1

Deep reinforcement learning agents are sample inefficient, which limits their application to real-world problems. Many model-based methods have been designed to address this issue, and learning in the imagination of a world model is one of the most prominent approaches. The difficulty is that the world model must remain accurate over extended periods of time.

In this context, the authors introduce IRIS, a data-efficient agent that learns in a world model composed of a discrete autoencoder and an autoregressive Transformer. The design is motivated by the success of Transformers in sequence modeling tasks.

With the equivalent of only two hours of gameplay on the Atari 100k benchmark, IRIS achieves a mean human normalized score of 1.046 and outperforms humans on 10 out of 26 games. This sets a new state of the art for methods without lookahead search and even surpasses MuZero.

To foster future research on Transformers and world models for sample-efficient reinforcement learning, the authors release their codebase at https://github.com/eloialonso/iris. Overall, the work shows that Transformers can serve as effective world models for sample-efficient deep reinforcement learning.
Created on 04 Apr. 2023
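To make the idea of "learning in the imagination of a world model" more concrete, here is a toy sketch of the two components named in the summary: a discrete encoder that maps observations to tokens, and an autoregressive model that extends the token sequence one token at a time. Everything in it is hypothetical and chosen for illustration (the four-entry `CODEBOOK`, `TOKENS_PER_FRAME`, the `next_token` placeholder, and the `policy` callback); it is not the authors' implementation, and the real model is a trained Transformer rather than a hand-written sampling rule.

```python
import random

# Hypothetical codebook: token index -> 2-d vector. Not taken from the paper.
CODEBOOK = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
TOKENS_PER_FRAME = 2  # assumed toy value


def encode(frame):
    """Map each patch (a 2-d vector) to the index of its nearest codebook entry."""
    def nearest(patch):
        return min(
            range(len(CODEBOOK)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(patch, CODEBOOK[i])),
        )
    return [nearest(patch) for patch in frame]


def decode(tokens):
    """Map token indices back to their codebook vectors."""
    return [CODEBOOK[t] for t in tokens]


def next_token(sequence, rng):
    """Placeholder for the autoregressive model: sample the next token from a
    distribution conditioned on the sequence (here, only on its last element)."""
    weights = [1.0] * len(CODEBOOK)
    weights[sequence[-1] % len(CODEBOOK)] += 2.0  # favor repeating the last item
    return rng.choices(range(len(CODEBOOK)), weights=weights, k=1)[0]


def imagine(first_frame, policy, horizon, seed=0):
    """Roll out `horizon` imagined frames starting from one real frame."""
    rng = random.Random(seed)
    frame_tokens = encode(first_frame)
    sequence = list(frame_tokens)
    trajectory = [decode(frame_tokens)]
    for _ in range(horizon):
        sequence.append(policy(frame_tokens))  # the agent acts on the imagined frame
        frame_tokens = []
        for _ in range(TOKENS_PER_FRAME):  # generate the next frame token by token
            token = next_token(sequence, rng)
            sequence.append(token)
            frame_tokens.append(token)
        trajectory.append(decode(frame_tokens))
    return trajectory


frame = [(0.1, 0.2), (0.9, 0.8)]
trajectory = imagine(frame, policy=lambda tokens: sum(tokens) % 2, horizon=3)
print(len(trajectory))
```

The point of the sketch is the control flow: after the first real frame, every subsequent frame comes from the model rather than the environment, so any error in `next_token` compounds over the horizon. This is the accuracy concern the summary raises.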

