Deep Reinforcement Learning for Active High Frequency Trading

AI-generated keywords: Deep Reinforcement Learning, High Frequency Trading, Limit Order Book, Proximal Policy Optimization, Sequential Model Based Optimization

AI-generated Key Points

- End-to-end deep reinforcement learning (DRL) framework introduced for active high frequency trading
- DRL agents trained to trade one unit of Intel Corporation stocks using Proximal Policy Optimization algorithm
- Training conducted on three contiguous months of high-frequency Limit Order Book (LOB) data, test carried out on following month of data
- Only samples with largest price changes selected to maximize signal-to-noise ratio in training data
- Hyperparameters tuned using Sequential Model Based Optimization technique
- Three different state characterizations considered that differ in their LOB-based meta-features
- Agents learn trading strategies that produce stable positive returns despite highly stochastic and non-stationary environment
- Agents create dynamic representation of underlying environment by highlighting occasional regularities present in data and exploiting them to create long-term profitable trading strategies
- Number of trades reflected in histogram plots drives difference in return profiles between different trading state characterizations, not individual trade quality or profitability
- DRL can be effectively applied to high-frequency trading tasks using LOB-based meta-features as input states for agents' policies

Authors: Antonio Briola, Jeremy Turiel, Riccardo Marcaccioli, Tomaso Aste

arXiv: 2101.07107v1 (cs.LG)

8 pages, 4 figures

License: CC BY-NC-SA 4.0

Abstract: We introduce the first end-to-end Deep Reinforcement Learning based framework for active high frequency trading. We train DRL agents to trade one unit of Intel Corporation stocks by employing the Proximal Policy Optimization algorithm. The training is performed on three contiguous months of high frequency Limit Order Book data. In order to maximise the signal to noise ratio in the training data, we compose the latter by only selecting training samples with largest price changes. The test is then carried out on the following month of data. Hyperparameters are tuned using the Sequential Model Based Optimization technique. We consider three different state characterizations, which differ in the LOB-based meta-features they include. Agents learn trading strategies able to produce stable positive returns in spite of the highly stochastic and non-stationary environment, which is remarkable itself. Analysing the agents' performances on the test data, we argue that the agents are able to create a dynamic representation of the underlying environment highlighting the occasional regularities present in the data and exploiting them to create long-term profitable trading strategies.
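
The abstract says the training set is built by keeping only the samples with the largest price changes, but it does not give the exact rule. The following is a minimal sketch of one way this could be done, not the authors' procedure: the function name `select_largest_moves`, the `horizon` over which a change is measured, and the `keep_fraction` threshold are all hypothetical choices made for illustration.

```python
import numpy as np

def select_largest_moves(mid_prices, horizon, keep_fraction):
    """Return start indices of the windows with the largest absolute price change.

    mid_prices:    1-D sequence of mid-prices, one per LOB update.
    horizon:       number of updates over which the change is measured (assumed).
    keep_fraction: fraction of candidate windows to keep (assumed).
    """
    mid_prices = np.asarray(mid_prices, dtype=float)
    # Absolute change between each update and the one `horizon` steps later.
    changes = np.abs(mid_prices[horizon:] - mid_prices[:-horizon])
    n_keep = max(1, int(len(changes) * keep_fraction))
    # Take the n_keep largest changes, then restore chronological order.
    top = np.argsort(changes)[-n_keep:]
    return np.sort(top)

# Toy mid-price series, not real LOB data.
prices = [100.0, 100.0, 100.5, 100.5, 99.0, 99.0, 99.1]
selected = select_largest_moves(prices, horizon=1, keep_fraction=0.5)
print(selected)
```

In this toy series the flat stretches are discarded and only the windows that start just before a price move are kept, which is the sense in which such filtering raises the signal-to-noise ratio of the training data.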

Submitted to arXiv on 18 Jan. 2021

Results of the summarizing process for the arXiv paper: 2101.07107v1

Comprehensive Summary

This paper introduces an end-to-end deep reinforcement learning (DRL) framework for active high frequency trading. The authors train DRL agents to trade one unit of Intel Corporation stock using the Proximal Policy Optimization algorithm. Training uses three contiguous months of high-frequency Limit Order Book (LOB) data, and testing uses the following month. To maximize the signal-to-noise ratio in the training data, only the samples with the largest price changes are selected. Hyperparameters are tuned using the Sequential Model Based Optimization technique. The authors consider three state characterizations that differ in the LOB-based meta-features they include. Despite the highly stochastic and non-stationary environment, the agents learn trading strategies that produce stable positive returns, which the authors consider remarkable in itself. From the agents' performance on the test data, the authors argue that the agents build a dynamic representation of the underlying environment, highlighting the occasional regularities present in the data and exploiting them to create long-term profitable trading strategies.
Neither the body nor the tails of the distributions provide strong evidence that would justify widely different return profiles across the state characterizations. What seems to drive the differences is instead the number of trades, as reflected in the histogram plots. Agents that know their mark-to-market have exact information about their potential reward, so they only need to judge whether the remaining upside is worth keeping a position open. This creates more trading opportunities and raises overall profit and loss, with no apparent effect on the quality or profitability of individual trades. The result suggests a characteristic trade and return horizon shared across strategies, perhaps reflecting the dynamics of the asset price itself.
In conclusion, this work demonstrates that DRL can be effectively applied to high-frequency trading tasks using LOB-based meta-features as input states for the agents' policies.
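
The summary refers to agents that know their mark-to-market without saying how that value is computed. As a rough illustration only, the sketch below assumes a one-unit position valued at the price at which it could be closed immediately: the best bid for a long position and the best ask for a short one. The function name and its arguments are hypothetical and are not taken from the paper.

```python
def mark_to_market(position, entry_price, best_bid, best_ask):
    """Unrealised profit of a one-unit position if it were closed right now.

    position: +1 for long, -1 for short, 0 for flat.
    Assumption: a long is closed by selling at the best bid,
    a short is closed by buying at the best ask.
    """
    if position == 1:
        return best_bid - entry_price
    if position == -1:
        return entry_price - best_ask
    return 0.0

# Toy quotes, not real LOB data.
long_mtm = mark_to_market(1, entry_price=50.0, best_bid=50.25, best_ask=50.5)
short_mtm = mark_to_market(-1, entry_price=50.0, best_bid=50.25, best_ask=50.5)
flat_mtm = mark_to_market(0, entry_price=50.0, best_bid=50.25, best_ask=50.5)
print(long_mtm, short_mtm, flat_mtm)
```

An agent whose state includes a value like this sees at each step the reward it would realise by closing, so its decision reduces to whether waiting is likely to improve on that number.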

Summary: A computer program was made to help people trade stocks. The program learned how to make good trades by practicing on data from the stock market. It only used the most important information to learn and tried different ways of trading until it found a good one. The program was able to make money even though the stock market is always changing.

Definitions:
- End-to-end deep reinforcement learning (DRL) framework: A type of computer program that learns how to do something by trying different things and getting feedback on what works best.
- Proximal Policy Optimization algorithm: A specific way of training a DRL agent.
- High-frequency Limit Order Book (LOB) data: Information about stock prices and orders that is updated very quickly.
- Hyperparameters: Settings in the computer program that can be adjusted to improve its performance.
- Sequential Model Based Optimization technique: A way of finding the best hyperparameter settings for a computer program.
- State characterizations: Different ways of describing the current state or situation in the stock market.
- Stochastic and non-stationary environment: The stock market is always changing, so it's hard to predict what will happen next.
- Histogram plots: Graphs that show how many times something happened at different levels or values.

Created on 20 Jun. 2023
