Deep Learning for Trading Part 1: Can it Work?

Posted on Jan 01, 2018 by Kris Longmore

This is the first in a multi-part series in which we explore and compare various deep learning tools and techniques for market forecasting using Keras and TensorFlow. In this post, we introduce Keras and discuss some of the major obstacles to using deep learning techniques in trading systems, including a warning about attempting to extract meaningful signals from historical market data.
Part 2 provides a walk-through of setting up Keras and TensorFlow for R under Windows, using either the default CPU-based configuration or the more involved (but well worth it) GPU-based configuration.

In the last few years, deep learning has gone from being an interesting but impractical academic pursuit to a ubiquitous technology that touches many aspects of our lives on a daily basis – including in the world of trading. This meteoric rise has been fuelled by a perfect storm of:

  • Frequent breakthroughs in deep learning research which regularly provide better tools for training deep neural networks
  • An explosion in the quantity and availability of data
  • The availability of cheap and plentiful compute power
  • The rise of open source deep learning tools that facilitate both the practical application of the technology and innovative research that drives the field ever forward

Deep learning excels at discovering complex and abstract patterns in data and has proven itself on tasks that have traditionally required the intuitive thinking of the human brain to solve. That is, deep learning is solving problems that have thus far proven beyond the ability of machines.
Therefore, it is incredibly tempting to apply deep learning to the problem of forecasting the financial markets. And indeed, certain research indicates that this approach has potential. For example, the Financial Hacker found an edge in predicting the EUR/USD exchange rate using a deep architecture stacked with an autoencoder. Here at Robot Wealth, we compared the performance of numerous machine learning algorithms on a financial prediction task, and deep learning was the clear outperformer.

Not so fast…

However, as anyone who has used deep learning in a trading application can attest, the problem is not nearly as simple as just feeding some market data to an algorithm and using the predictions to make trading decisions. Some of the common issues that need to be solved include:

  1. Working out a sensible way to frame the forecasting problem, for example as a classification or regression problem.
  2. Scaling data in a way that facilitates training of the deep network.
  3. Deciding on an appropriate network architecture.
  4. Tuning the hyperparameters of the network and optimization algorithm such that the network converges sensibly and efficiently. Depending on the architecture chosen, there might be a couple of dozen hyperparameters that affect the model, which can be a significant headache.
  5. Coming up with a cost function that is applicable to the problem.
  6. Dealing with the problem of an ever-changing market. Market data tends to be non-stationary, which means that a network trained on historical data might very well prove useless when used with future data.
  7. There may be very little signal in historical market data with respect to the future direction of the market. This makes sense intuitively if you consider that the market is impacted by more than just its historical price and volume. Further, pretty much everyone who trades a particular market will be looking at its historical data and using it in some way to inform their trading decisions. That means that market data alone may not give an individual much of a unique edge.
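To make issues 1 and 2 above concrete, here is a minimal sketch of one common way to frame the forecasting problem: binary classification of the next bar's direction, with z-scored return features. This is plain Python with illustrative function names (not from any particular library), and the scaling scheme shown is just one of many reasonable choices:

```python
import statistics

def make_dataset(prices, lookback=5):
    """Frame forecasting as binary classification: features are the last
    `lookback` returns, z-scored using statistics from the feature window
    only; the target is 1 if the next return is positive, else 0."""
    # Simple percentage returns from the price series
    returns = [(b - a) / a for a, b in zip(prices, prices[1:])]
    X, y = [], []
    for i in range(lookback, len(returns)):
        window = returns[i - lookback:i]
        mu = statistics.mean(window)
        sd = statistics.pstdev(window) or 1.0  # guard against zero variance
        X.append([(r - mu) / sd for r in window])
        y.append(1 if returns[i] > 0 else 0)
    return X, y

prices = [100, 101, 100.5, 102, 101.5, 103, 102.5, 104]
X, y = make_dataset(prices, lookback=3)
```

Note that scaling statistics are computed only from data available at prediction time; scaling with statistics from the full dataset would leak future information into the features.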

The first five issues listed above are common to most machine learning problems and their resolution represents a big part of what applied data science is all about. The implication is that while these problems are not trivial, they are by no means deal breakers.
On the other hand, problems 6 and 7 may very well prove to thwart the best attempts at using deep learning to turn past market data into profitable trading signals. No machine learning algorithm or artificial intelligence can make good future predictions if its training data has no relationship to the target being predicted, or if that relationship changes significantly over time.1

Said differently, feeding market data to a machine learning algorithm is only useful to the extent that the past is a predictor of the future. And we all know what they say about past performance and future returns.
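One common response to the non-stationarity problem is walk-forward analysis: refit the model on a rolling window of recent data so that it tracks a drifting market rather than assuming a fixed relationship. A minimal sketch of generating such splits (illustrative plain Python, not from any particular library):

```python
def walk_forward_splits(n, train_len, test_len):
    """Yield (train_indices, test_indices) pairs for a walk-forward scheme.
    The model is refit on each training window, then evaluated only on the
    data immediately following it, so no test bar ever precedes its
    training data."""
    start = 0
    while start + train_len + test_len <= n:
        train = list(range(start, start + train_len))
        test = list(range(start + train_len, start + train_len + test_len))
        yield train, test
        start += test_len  # roll the window forward by one test period

splits = list(walk_forward_splits(100, train_len=60, test_len=10))
```

Retraining a deep network at every step is expensive, but it is one of the few defences against a model quietly going stale as the market changes.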

In the deep learning trading systems that I’ve taken to market, I’ve always used additional data beyond historical, regularly sampled price and volume data and transformations thereof. While there does appear to be a slim edge in using deep learning to extract signals from past market data, that edge may not be significant enough to overcome transaction costs. And even if it does, it may not be significant enough to justify the risk and effort required to take it to market. On the other hand, supplementing historical market data with innovative, uncommon data sets has proven more effective – at least in my experience.2
In this series of posts, we explore and compare various deep learning tools and techniques in relation to market forecasting using the Keras package. We will do so using only historical market data, so the results need to be interpreted considering the discussion above.
We expect deep learning to uncover a slim edge using historical market data, but the purpose of this analysis is to compare different deep learning tools in relation to market forecasting, not necessarily to build a market-beating trading system. That I leave to you – perhaps you can supplement the models we explore here with some creative or uncommon data or other tools to find a real edge.

What is Keras?

Keras is a high-level API for building and training neural networks. Its strength lies in its ability to facilitate fast and efficient research, which of course is very important for systematic traders, particularly those of the DIY persuasion for whom time is often the limiting factor to success. Keras is easy to learn and its syntax is particularly friendly. Keras also plays nicely with CPUs and GPUs and can integrate with the TensorFlow, Theano and CNTK backends – without limiting the flexibility of those tools. For example, pretty much anything you can implement in raw TensorFlow, you can also implement in Keras, likely at a fraction of the development effort.
Keras is also implemented in R, which means that we can use it directly in any trading algorithm developed on the Zorro Automated Trading Platform, since Zorro has seamless integration with an R session.3
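For a flavour of the Keras workflow, here is a minimal sketch of defining and compiling a small feedforward classifier of the kind explored later in this series. It is shown in Python; the R interface we use in Part 2 mirrors this syntax closely. The layer sizes and input are illustrative only, not a recommended architecture:

```python
from tensorflow import keras
from tensorflow.keras import layers

# A tiny fully connected network: a few lagged, scaled returns in,
# a probability that the next bar closes up out.
model = keras.Sequential([
    keras.Input(shape=(5,)),                # e.g. five lagged, scaled returns
    layers.Dense(16, activation="relu"),    # one small hidden layer
    layers.Dense(1, activation="sigmoid"),  # P(next bar closes up)
])

model.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    metrics=["accuracy"],
)
```

Defining, compiling, and fitting a model each take a single call, which is precisely the fast-iteration quality that makes Keras attractive for research.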

What’s next?

In the deep learning experiments that follow in Part 2 and beyond, we’ll use the R implementation of Keras with TensorFlow backend. We’ll be exploring fully connected feedforward networks, various recurrent architectures including the Gated Recurrent Unit (GRU) and Long Short-Term Memory (LSTM), and even convolutional neural networks which normally find application in computer vision and image classification.
Stay tuned.


(3) Comments


xyz
January 19, 2018 at 8:51 pm

Hi,
Yann LeCun said “generative adversarial networks (GANs) are the most interesting idea in the last 10 years”. Ian Goodfellow, the creator of GANs, said something like GANs are better than neural networks trained by Markov chain Monte Carlo methods. Are they suitable for trading? Do you have any experience with them?

January 20, 2018 at 1:35 pm

I do have experience with GANs, but not in relation to trading. A GAN essentially consists of two networks: a generator that produces artificial samples with the goal of tricking the other network, the discriminator, into classifying a fake sample as a real one. The theory is that as training progresses, the generator learns to produce more and more realistic samples, while the discriminator gets better and better at spotting fakes. There might be an application to market forecasting, but it doesn’t immediately jump out at me. Further, GANs are notoriously difficult to tune, and I shudder to think of the effort required to get them to behave nicely on market data!
Also, I believe LeCun’s comments are three or four years old now, and the field has come a very long way since then!
Finally, a question for you – what makes a neural network trained using MCMC “good”? That seems a long and convoluted way to find a good set of weights when gradient descent and its variants usually work just fine!

Leave a Comment