Deep Reinforcement Learning for Active High Frequency Trading

AI-generated keywords: Deep Reinforcement Learning, High Frequency Trading, Limit Order Book, Proximal Policy Optimization, Sequential Model Based Optimization

AI-generated Key Points

  • End-to-end deep reinforcement learning (DRL) framework introduced for active high frequency trading
  • DRL agents trained to trade one unit of Intel Corporation stocks using Proximal Policy Optimization algorithm
  • Training conducted on three contiguous months of high-frequency Limit Order Book (LOB) data, test carried out on following month of data
  • Only samples with largest price changes selected to maximize signal-to-noise ratio in training data
  • Hyperparameters tuned using Sequential Model Based Optimization technique
  • Three different state characterizations considered that differ in their LOB-based meta-features
  • Agents learn trading strategies that produce stable positive returns despite highly stochastic and non-stationary environment
  • Agents create dynamic representation of underlying environment by highlighting occasional regularities present in data and exploiting them to create long-term profitable trading strategies
  • Difference in return profiles between state characterizations driven by number of trades (as reflected in histogram plots), not by individual trade quality or profitability
  • DRL can be effectively applied to high-frequency trading tasks using LOB-based meta-features as input states for agents' policies
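
The sample-selection point above (keeping only the training samples with the largest price changes) can be illustrated with a minimal sketch. This is not the authors' procedure: the function name `select_largest_moves`, the use of mid-prices, the one-step horizon, and the `keep_fraction` parameter are all assumptions made for illustration.

```python
import numpy as np

def select_largest_moves(mid_prices, keep_fraction=0.1):
    """Return indices of the samples followed by the largest absolute price change.

    Hypothetical helper: the selection rule used in the paper is not given here.
    """
    mid_prices = np.asarray(mid_prices, dtype=float)
    # Sample i is scored by |p[i+1] - p[i]|, so the last price has no score.
    abs_change = np.abs(np.diff(mid_prices))
    n_keep = max(1, int(round(keep_fraction * abs_change.size)))
    # Take the n_keep largest scores, then restore chronological order.
    top = np.argsort(abs_change, kind="stable")[-n_keep:]
    return np.sort(top)

# Toy mid-price series (assumed values, not real LOB data).
prices = [100.00, 100.01, 100.01, 100.05, 100.04, 99.98, 99.99]
selected = select_largest_moves(prices, keep_fraction=0.34)
print(selected)
```

Discarding small moves removes samples where the price barely changes, which is one plausible reading of "maximizing the signal-to-noise ratio"; the threshold or fraction actually used would have to come from the paper itself.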

Authors: Antonio Briola, Jeremy Turiel, Riccardo Marcaccioli, Tomaso Aste

8 pages, 4 figures
License: CC BY-NC-SA 4.0

Abstract: We introduce the first end-to-end Deep Reinforcement Learning based framework for active high frequency trading. We train DRL agents to trade one unit of Intel Corporation stocks by employing the Proximal Policy Optimization algorithm. The training is performed on three contiguous months of high frequency Limit Order Book data. In order to maximise the signal to noise ratio in the training data, we compose the latter by only selecting training samples with largest price changes. The test is then carried out on the following month of data. Hyperparameters are tuned using the Sequential Model Based Optimization technique. We consider three different state characterizations, which differ in the LOB-based meta-features they include. Agents learn trading strategies able to produce stable positive returns in spite of the highly stochastic and non-stationary environment, which is remarkable itself. Analysing the agents' performances on the test data, we argue that the agents are able to create a dynamic representation of the underlying environment highlighting the occasional regularities present in the data and exploiting them to create long-term profitable trading strategies.
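
For readers unfamiliar with Proximal Policy Optimization, the sketch below computes its standard clipped surrogate objective with NumPy. It illustrates the general algorithm only, not the authors' training code; the function name `ppo_clipped_objective` and the default `clip_eps=0.2` are assumptions, and the abstract does not state which settings were used.

```python
import numpy as np

def ppo_clipped_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective (to be maximised) over a batch.

    clip_eps=0.2 is an assumed, commonly used value.
    """
    new_lp = np.asarray(new_log_probs, dtype=float)
    old_lp = np.asarray(old_log_probs, dtype=float)
    adv = np.asarray(advantages, dtype=float)
    # Probability ratio between the updated policy and the data-collecting policy.
    ratio = np.exp(new_lp - old_lp)
    unclipped = ratio * adv
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * adv
    # The element-wise minimum makes the objective a pessimistic bound,
    # removing the incentive to move the ratio far outside the clip range.
    return float(np.mean(np.minimum(unclipped, clipped)))

# Toy batch: ratios of 1.5 and 0.5 with advantages +1 and -1.
value = ppo_clipped_objective(np.log([1.5, 0.5]), np.log([1.0, 1.0]), [1.0, -1.0])
print(value)
```

In a trading setting, the advantage estimates would be derived from realised profit and loss, but how rewards are defined is specific to the paper and is not reproduced here.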

Submitted to arXiv on 18 Jan. 2021

Results of the summarizing process for the arXiv paper: 2101.07107v1

This paper introduces an end-to-end deep reinforcement learning (DRL) framework for active high frequency trading. The authors train DRL agents to trade one unit of Intel Corporation stock using the Proximal Policy Optimization algorithm. Training is conducted on three contiguous months of high-frequency Limit Order Book (LOB) data, and testing is carried out on the following month of data. To maximize the signal-to-noise ratio in the training data, only samples with the largest price changes are selected. Hyperparameters are tuned using the Sequential Model Based Optimization technique. The authors consider three state characterizations that differ in their LOB-based meta-features. Despite the highly stochastic and non-stationary environment, agents learn trading strategies that produce stable positive returns, which the authors consider remarkable in itself. Analyzing the agents' performance on test data, they argue that agents create a dynamic representation of the underlying environment by highlighting occasional regularities present in the data and exploiting them to create long-term profitable trading strategies. Looking at both the body of the distributions and their tails, there is no strong evidence to justify widely different return profiles between the state characterizations. What seems to drive the difference in return profiles is instead the number of trades, as reflected in the histogram plots. Agents with knowledge of their mark-to-market have exact information about their potential reward and only need to evaluate whether further upside potential is worth keeping a position open. This allows for more trading opportunities, leading to an increase in profit and loss with no apparent effect on individual trade quality or profitability. This suggests a characteristic trade and return horizon across strategies, perhaps characteristic of the asset's price dynamics.
In conclusion, this work demonstrates that DRL can be effectively applied to high-frequency trading tasks using LOB-based meta-features as input states for agents' policies.
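
The summary refers to agents that know their mark-to-market, that is, the profit they would realise by closing the open position immediately. A minimal sketch of that quantity for a one-unit position follows. The function name `mark_to_market`, the +1/-1/0 position encoding, and closing at the best bid or best ask are assumptions made for illustration, and the paper's exact definition may differ.

```python
def mark_to_market(position, entry_price, best_bid, best_ask):
    """Unrealised profit of a one-unit position if it were closed now.

    position: +1 for long, -1 for short, 0 for flat (assumed encoding).
    """
    if position > 0:
        # A long position is closed by selling at the best bid.
        return best_bid - entry_price
    if position < 0:
        # A short position is closed by buying at the best ask.
        return entry_price - best_ask
    return 0.0

# Toy quotes (assumed values): long entered at 50.00, market now 50.25 / 50.50.
long_pnl = mark_to_market(+1, 50.00, best_bid=50.25, best_ask=50.50)
short_pnl = mark_to_market(-1, 50.00, best_bid=49.50, best_ask=49.75)
print(long_pnl, short_pnl)
```

Including this value in the state tells the agent exactly what reward closing would yield, which matches the summary's remark that such agents only need to judge whether further upside is worth keeping the position open.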
Created on 20 Jun. 2023

