# R

This is the second in a multi-part series in which we explore and compare various deep learning tools and techniques for market forecasting using Keras and TensorFlow. In Part 1, we introduced Keras and discussed some of the major obstacles to using deep learning techniques in trading systems, including a warning about attempting to extract meaningful signals from historical market data. If you haven't read that article, it is highly recommended that you do so before proceeding, as the context it provides is important. Read Part 1 here. Part 2 provides a walk-through of setting up Keras and TensorFlow for R under Windows, using either the default CPU-based configuration or the more complex and involved (but well worth it) GPU-based configuration. Stay tuned for Part 3 of this series, which will be published next week.

**CPU vs GPU for Deep Learning**

No doubt you know that a computer's Central Processing Unit (CPU) is its primary computation module. CPUs are designed and optimized for rapid computation on small amounts of data and, as such, elementary arithmetic operations on a few numbers...

This article is adapted from one of the units of Advanced Algorithmic Trading. If you like what you see, check out the entire curriculum here. Find out what Robot Wealth is all about here. If you're interested in using artificial neural networks (ANNs) for algorithmic trading, but don't know where to start, then this article is for you. Normally if you want to learn about neural networks, you need to be reasonably well versed in matrix and vector operations - the world of linear algebra. This article is different. I've attempted to provide a starting point that doesn't involve any linear algebra and have deliberately left out all references to vectors and matrices. If you're not strong on linear algebra, but are curious about neural networks, then I think you'll enjoy this introduction. In addition, if you decide to take your study of neural networks further, when you do inevitably start using linear algebra, it will probably make a lot more sense, as you'll have something of a head start. The best place to start learning about neural networks is the...

Earlier this year, I attended the Google Next conference in San Francisco and gained some first-hand perspective on what's possible with Google's cloud infrastructure. Since then, I've been leaning on Google Cloud Platform (GCP) to run my trading algorithms (and much more) and it has quickly become an important tool in my workflow! In this post, I'm going to show you how to set up a Google Cloud Platform compute instance to act as a server for hosting a trading algorithm. We'll also see why such a setup can be a good option and when it might pay to consider alternatives. Cloud compute instances are just a tiny fraction of the whole GCP ecosystem, so before we go any further, let's take a high-level overview of the various components that make up Google Cloud Platform.

**What is Google Cloud Platform?**

GCP consists of a suite of cloud storage, compute, analytics and development infrastructure and services. Google says that GCP runs on the very same infrastructure that Google uses for its own products, such as Google Search. This suite of services...

Recently, Yahoo Finance - a popular source of free end-of-day price data - made some changes to their server that wreaked a little havoc on anyone relying on it for their algos or simulations. Specifically, Yahoo Finance switched from HTTP to HTTPS and changed the data download URLs. No doubt this is a huge source of frustration, as many backtesting and trading scripts that relied on such data will no longer work. Users of the excellent R package quantmod, however, are in luck! The package's author, Joshua Ulrich, has already addressed the change in a development version of quantmod. Of course, you need the devtools package installed, so run `install.packages("devtools")` first if you don't already have it. Then you can update your quantmod package to the development version that addresses this issue using this command in R:

```r
devtools::install_github("joshuaulrich/quantmod", ref = "157_yahoo_502")
```

Once the package updates, `quantmod::getSymbols(src = "yahoo")` should work just as it did prior to the updates on the Yahoo Finance server. I can verify that this worked for me. Of course, if you don't want to update quantmod to a version that lives on...

Recently, I wrote about fitting mean-reversion time series analysis models to financial data and using the models' predictions as the basis of a trading strategy. Continuing our exploration of time series modelling, let's research the autoregressive and conditionally heteroskedastic family of time series models. In particular, we want to understand the autoregressive integrated moving average (ARIMA) and generalized autoregressive conditional heteroskedasticity (GARCH) models. Why? Well, they are both referenced frequently in the quantitative finance literature, and it's about time I got up to speed, so why not join me! What follows is a summary of what I learned about these models, a general fitting procedure and a simple trading strategy based on the forecasts of a fitted model. Let's get started!

**What are these time series analysis models?**

Several definitions are necessary to set the scene. I don't want to reproduce the theory I've been wading through; rather, here is my very high-level summary of what I've learned about time series modelling, in particular the ARIMA and GARCH models and how they are related to their component models: At its...
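As a small taste of what fitting one of these models looks like, here's a minimal sketch in base R (this is illustrative only, not the full ARIMA+GARCH pipeline from the post, and the data is simulated rather than real market returns): we simulate an AR(1) series with a known coefficient and recover it with `arima()`.

```r
set.seed(42)

# Simulate 500 observations of an AR(1) process with true coefficient 0.5
x <- arima.sim(model = list(ar = 0.5), n = 500)

# Fit an ARIMA(1,0,0) model - an AR(1) with no differencing and no MA terms
fit <- arima(x, order = c(1, 0, 0))

# The estimated AR coefficient should land close to the true value of 0.5
coef(fit)["ar1"]
```

In practice you'd select the (p, d, q) order by an information criterion rather than fixing it in advance, and then model the conditional variance of the residuals with a GARCH specification.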

In the first Mean Reversion and Cointegration post, I explored mean reversion of individual financial time series using techniques such as the Augmented Dickey-Fuller test, the Hurst exponent and the Ornstein-Uhlenbeck equation for a mean-reverting stochastic process. I also presented a simple linear mean reversion strategy as a proof of concept. In this post, I'll explore artificial stationary time series and present a more practical trading strategy for exploiting mean reversion. Again, this work is based on Ernie Chan's Algorithmic Trading, which I highly recommend and have used as inspiration for a great deal of my own research. In presenting my results, I have purposefully shown equity curves from mean reversion strategies that go through periods of stellar performance as well as periods so bad that they would send most traders broke. Rather than cherry-pick the good performance, I want to demonstrate what I think is of utmost importance in this type of trading, namely that the nature of mean reversion for any financial time series is constantly changing. At times this dynamism can...
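To give a flavour of what an "artificial stationary time series" is, here's a toy sketch in base R (entirely synthetic data and a made-up hedge ratio, not the strategy from the post): two random-walk "prices" share a common stochastic trend, and combining them with the right hedge ratio cancels the trend, leaving a stationary spread that oscillates around a mean.

```r
set.seed(1)
n <- 1000

w  <- cumsum(rnorm(n))    # shared random-walk (non-stationary) component
y1 <- 2.0 * w + rnorm(n)  # "price" series 1: common trend plus noise
y2 <- 1.0 * w + rnorm(n)  # "price" series 2: same trend, different loading

spread <- y1 - 2.0 * y2   # hedge ratio of 2 removes the common trend

# The spread is stationary noise around zero, while y1 wanders freely
c(sd_spread = sd(spread), sd_y1 = sd(y1))
```

In a real application the hedge ratio would be estimated (for example by regressing one series on the other), and the stationarity of the resulting spread would be verified with a test such as the Augmented Dickey-Fuller test mentioned above.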

This series of posts is inspired by several chapters from Ernie Chan's highly recommended book Algorithmic Trading. The book follows Ernie's first contribution, Quantitative Trading, and focuses on testing and implementing a number of strategies that exploit measurable market inefficiencies. I'm a big fan of Ernie's work and have used his material as inspiration for a great deal of my own research. My earlier posts about accounting for randomness (here and here) were inspired by the first chapter of Algorithmic Trading. Ernie works in MATLAB, but I'll be using R and Zorro. Ernie cites Daniel Kahneman's interesting example of mean reversion in the world around us: the Sports Illustrated jinx, namely that "an athlete whose picture appears on the cover of the magazine is doomed to perform poorly the following season" (Kahneman, 2011). Performance can be thought of as being randomly distributed around a mean, so exceptionally good performance one year (resulting in the appearance on the cover of Sports Illustrated) is likely to be followed by performances that are closer to the average. Mean reversion also exists in, or can be constructed from, financial time series...

Important preface: This post is in no way intended to showcase a particular trading strategy. It is purely to share and demonstrate the use of the framework I've put together to speed up the research and development process for a particular type of trading strategy. Comments and critiques regarding the framework and the methodology used are most welcome. Backtest results are presented only to illustrate the methodology and describe the outputs. That done, on to the interesting stuff. My last two posts (Part 1 here and Part 2 here) explored applying the k-means clustering algorithm for unsupervised discovery of candlestick patterns. The results were interesting enough (to me at least) to justify further research in this domain, but nothing presented thus far would be of much use in a standalone trading system. There are many possible directions in which this research could go. Some ideas that could be worth pursuing include:

- Providing the clustering algorithm with other data, such as trend or volatility information
- Extending the search to include two- and three-day patterns
- Varying the number of clusters
- Searching across markets and asset...

In the last article, I described an application of the k-means clustering algorithm for classifying candlesticks based on the relative position of their open, high, low and close. This was a simple enough exercise, but now I tackle something more challenging: isolating information that is both useful and practical for real trading. I'll initially try two approaches:

1. Investigate whether there are any statistically significant patterns of certain clusters following others
2. Investigate the distribution of next-day returns following the appearance of a candle from each cluster

The insights gained from this analysis will hopefully inform the next direction of this research.

**Data preliminaries**

In the last article, I classified twelve months of daily candles (June 2014 - July 2015) into eight clusters. To simplify the analysis and ensure that enough instances of each cluster are observed, I'll reduce the number of clusters to four and extend the history to cover 2008-2015. I'll exclude my 2015 data for now in case I need a final, unseen test set at some point in the future. Here's a subset of the candles over the entire price history (2008-2014, 2015...
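As a rough sketch of the classification step (using synthetic candles, not the article's data or its exact feature set), each candle can be summarised by where its open and close sit within the high-low range, which gives scale-free features suitable for `kmeans()`:

```r
set.seed(7)
n <- 500

# Synthetic daily candles: random open/close, high/low padded beyond them
open  <- runif(n, 99, 101)
close <- runif(n, 99, 101)
high  <- pmax(open, close) + runif(n, 0, 0.5)
low   <- pmin(open, close) - runif(n, 0, 0.5)

# Scale-free features: position of the open and close within the day's range
rng   <- high - low
feats <- cbind(open_pos  = (open - low) / rng,
               close_pos = (close - low) / rng)

# Cluster the candles into four groups, as in the reduced-cluster analysis
km <- kmeans(feats, centers = 4, nstart = 20)
table(km$cluster)  # instances observed per cluster
```

Once each candle carries a cluster label, the two analyses above reduce to counting cluster-to-cluster transitions and grouping next-day returns by the label of the preceding candle.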

In the first part of this article, I described a procedure for empirically testing whether a trading strategy has predictive power by comparing its performance to the distribution of the performance of a large number of random strategies with similar trade distributions. In this post, I will present the results of the simple example described by the code in the previous post, in order to illustrate how susceptible trading strategies are to the vagaries of randomness. I will also illustrate by way of example my thought process when it comes to deciding whether to include a particular component in my live portfolio or discard it. I tested one particular trading system on a number of markets separately in both directions. I picked out three instances where the out-of-sample performance was good as candidates for live trading. The markets, trade directions and profit factors obtained from the out-of-sample backtest are as follows:

- USD/CAD - Short - Profit Factor = 1.79
- GBP/USD - Long - Profit Factor = 1.20
- GBP/JPY - Long - Profit Factor = 1.31

Next, I estimated the performance of...
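The comparison can be sketched in a few lines of R (the trade P&L here is hypothetical, not the output of the tested system, and the sign-shuffling benchmark is one simple way of generating random strategies with a similar trade distribution): compute the strategy's profit factor, then see where it falls in the distribution of profit factors from randomly re-directed versions of the same trades.

```r
set.seed(3)

# Profit factor: gross profit divided by gross loss
profit_factor <- function(pnl) sum(pnl[pnl > 0]) / abs(sum(pnl[pnl < 0]))

# Hypothetical trade P&L from a backtest
trades <- c(rnorm(80, mean = 0.10, sd = 0.5),
            rnorm(20, mean = -0.05, sd = 0.5))
pf <- profit_factor(trades)

# Random benchmark: same trade magnitudes, random long/short direction
rand_pfs <- replicate(1000, {
  profit_factor(sample(c(-1, 1), length(trades), replace = TRUE) * abs(trades))
})

# Fraction of random strategies at least as good: an empirical p-value
p_val <- mean(rand_pfs >= pf)
p_val
```

A small `p_val` suggests the strategy's edge is unlikely to be an artefact of its trade distribution alone; a large one suggests randomness could plausibly explain the result.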