So You Want to Build Your Own Algo Trading System?


This post comes to you from Dr Tom Starke, a good friend of Robot Wealth. Tom is a physicist, quant developer and experienced algo trader with keen interests in machine learning and quantum computing. I am thrilled that Tom is sharing his knowledge and expertise with the Robot Wealth community. Over to you, Tom.

Unlike most other businesses, algorithmic trading has the advantage of being independent of marketing, sales, customers and all those things that need the ‘pretty people’ to make them run. You also get almost instant feedback on how good you are at it. For anyone who is numerically inclined (and, more often than not, falls into a particular demographic in terms of their social interactions), this is a very attractive proposition. I have seen articles written about this subject, but they never really addressed many of the issues I came across on my journey. In this post I would like to talk about those issues, as an inspiration, or perhaps a deterrent, for anyone reading this who is considering making money that way.

Nothing could be more amazing: a system that runs by itself and consistently spits out cash to finance prolonged stays in Bali, South America or with your mom, if that’s what you’re after. However, as you may have guessed, it ain’t that simple, even for the most hardened trading veteran, maths whiz or programming genius. First of all, algo trading is multi-disciplinary by nature. This makes it a lot of fun and challenging at the same time. You have to be fairly clued up in areas such as computer science, maths, data analysis and some finance, although I have found that people who understand the first few subjects usually learn the finance bit very quickly.

Perhaps you think that a lack of knowledge in some of these areas can be resolved by knowing the right people and teaming up with them. It may, but in my experience the odds of success aren’t great. I have personally met three guys with extensive computer knowledge who teamed up with institutional traders to develop trading ideas into automated systems. None of those partnerships worked out.

Why is this so? In my experience, traders often work with technical analysis and respond to trading signals which, to them, are “obvious” buys and sells, and they often make good money from that. The problem is that their awareness is finely tuned to pick out the signals with potential; they are almost blind to the ones that are also present but obviously make no sense. In their view, the signals they use work 100% of the time. A computer, however, cannot discern between signals that make sense and those that don’t; it just executes regardless. As a result, the trader thinks his computer guy made mistakes in the program, the programmer thinks the trader’s strategies are no good, and eventually they go their separate ways.

Take a typical institution: a quant develops a prototype of his system in, say, MATLAB, and an IT guy converts it into C++ to be executed on the fund’s platform. A well-designed algo contains a lot of subtleties that the developer probably won’t fully understand, and tiny mistakes can completely kill the performance of the system.

These are common scenarios and I have had my fair share of them too. I’m all for teamwork, but the reality is that before you can outsource parts of your project, you have to understand it really well yourself.

Now, say you come from a quantitative background, you have read all the books, done the exercises, run a bunch of backtests and you are ready to go. You start to build your own platform and realise that there are a number of show-stoppers, such as getting good market data, historical data, broker connections and the like. These problems are more significant than you might think. The easiest solution is a Bloomberg connection, but it comes at significant cost, and a typical Bloomberg terminal is not really built to run sophisticated number-crunching algos. How about Interactive Brokers? Yes, it can work, but their API is far from perfect, their historical data is annoyingly full of errors, and their market data contains a lot of false signals too, which can be detrimental to your strategy. All of this can be sorted out, but trust me, it’s not easy. The little details create the complexity, and it takes time just to learn about all the pitfalls that may come up.

But here’s the thing: if this stuff was easy, I would have never got into it because complexity creates opportunity. I like learning and I enjoy solving problems so this is a lot of fun. In the process I learned more than during my entire physics PhD … how cool is that?

How you get started depends entirely on your skills and preferences but allow me to give you some recommendations.

First of all, find the lowest-commission broker you can. Commissions are the single biggest killer of an otherwise good strategy. One basis point (1 percent of 1 percent) does not sound like much, but each extra basis point of commission added to your trades pushes you into an entirely different trading regime: more commission means you have to hold positions longer to cover your costs, which makes your strategies riskier and less consistent.
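
To put a rough number on it, here is a back-of-the-envelope sketch in R. The 5 bps gross edge and the commission rates are hypothetical figures chosen purely for illustration, not taken from any particular strategy.

```r
# Rough, hypothetical illustration: how per-side commission in basis points
# eats into a strategy's average edge per trade.
gross_edge_bps <- 5                      # assumed average gross profit per trade
commission_bps <- c(0, 0.5, 1, 2)        # per-side commission scenarios
net_edge_bps   <- gross_edge_bps - 2 * commission_bps   # entry + exit
data.frame(commission_bps, net_edge_bps)
# a single basis point per side already removes 40% of this assumed edge
```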

Second, don’t try to automate everything right away. Semi-automated strategies are actually really cool. What that means is that your machine supplies you with the signals and you carry out the execution by hand (provided you are not executing 46 trades per second). Implementing good automated execution is actually quite tricky. What do you do when your trade does not get filled? Or it gets partially filled? Or you cannot exit at the price you want? Or you see that it might get filled soon, so is it OK to wait a little while, or do you cancel straight away? If you have done this by hand before, it is much easier to develop something that accounts for all the scenarios you might encounter.

With a semi-automated system, if you stoically follow what your strategy tells you to do, it can work well, since you don’t have to worry about watching the market at the same time. We all know the guys with six screens in front of them trying to make sense of what happens in the market; I have never seen a more anxious, sleep-deprived and adrenaline-ridden group of people. Often they don’t even have a strategy and just react to what they think is rational.

Whilst developing your system you will realise that even the best fully automated system in the world will not let you lie on the beach, sip cool drinks and relax while the trades tick away and make you money. Say your strategy does run well and generates profit, but suddenly your internet cuts out or your server has a power surge. Does your system stall, or does it smoothly reconnect and carry on? Do you have a backup server that runs in sync and can take over (not trivial)? Can you close all your open positions through an alternative internet connection? Or maybe you do it the Warren Buffett way and leave your positions open for the next 24 years. What if your strategy experiences a large drawdown? Is it expected, or does it go beyond your acceptable limit so that you have to cut your losses? Have you determined in advance at what point you do that?

This brings us to how you design your system. My journey started with a simple design that just did what it was meant to do, albeit slowly and inefficiently, and yours probably will too. You then move on to running parts in parallel, introduce complex event processing and end up running the different components of your trader via sockets on external servers that exchange messages asynchronously. I’ve iterated through all of those stages myself and learned a decent amount of computer science along the way. If you are not already an experienced developer, I highly recommend starting simple rather than building a big system right away. That was one of my first humbling mistakes; oh well, we all learn. By now I’ve built a wide variety of systems, from simple to complex, and each one helped me understand a different aspect of the field.

It is also handy to simulate execution in your backtests, since some strategies suffer significantly when slippage, market impact and other frictions are taken into account. Often the tests for trading signals, portfolio construction and execution can be run separately, which simplifies the process and makes it easier to analyse and understand.
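
As a minimal sketch of what that might look like, the R function below penalises a daily position and return series with fixed commission and slippage charged whenever the position changes. The cost figures and the toy data are assumptions for illustration only, not a recipe.

```r
# Minimal sketch: penalise a daily backtest for commission and slippage.
# Costs are charged on every position change; the figures are hypothetical.
apply_costs <- function(positions, returns,
                        commission_bps = 1, slippage_bps = 2) {
  gross    <- positions * returns
  turnover <- abs(diff(c(0, positions)))          # entries, exits, reversals
  costs    <- turnover * (commission_bps + slippage_bps) / 10000
  gross - costs
}

# toy example with fake data
set.seed(1)
rets <- rnorm(250, 0, 0.005)                      # fake daily returns
pos  <- sample(c(-1, 0, 1), 250, replace = TRUE)  # fake daily positions
c(gross = sum(pos * rets), net = sum(apply_costs(pos, rets)))
```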

A big stumbling block, especially for people coming from computer science and finance, is the processing and statistical interpretation of the backtest data they have gathered. It isn’t trivial, and it can take years of experience to get good at it. Data tell stories; learn to find those stories in the numbers.

If you have made it all the way here and you are at the beginning of your journey, you may feel a bit overwhelmed. But think about it this way: there are all those amazing things you can learn along the way that will eventually enlighten you. If it were easy, everyone would do it.

If you ask me what the most important thing you need to make this work is, I’d say it’s vision: a vision of what you actually want to achieve. This, more than anything else, will pull you in the right direction. In my honest opinion, it’s not the smartest people who succeed at this, it’s the visionaries. If you’re not sure what I’m talking about, have a look at the books by the likes of Charles Haanel, Napoleon Hill and Wallace D. Wattles. Whether you are a scientist, financial analyst, software engineer or hairdresser, you will have large gaps in your knowledge, and it takes determination to fill them. So go ahead, build your vision, roll up your sleeves and get started!

Optimal Data Windows for Training a Machine Learning Model for Financial Prediction

It would be great if machine learning were as simple as feeding data to an out-of-the-box implementation of some learning algorithm, then standing back and admiring the predictive utility of the output. As anyone who has dabbled in this area will confirm, it is never that simple. We have features to engineer and transform (no trivial task – see here and here for an exploration with applications to finance), not to mention the vagaries of dealing with data that is not Independent and Identically Distributed (non-IID). In my experience, landing on a model that fits the data acceptably at the outset of a modelling exercise is unlikely; a little (or a lot!) of effort usually has to be spent tuning and debugging the algorithm to achieve acceptable performance.

In the case of non-IID time series data, we also face the dilemma of how much data to use when training a predictive model. Given the non-stationarity of asset prices, if we use too much data, we run the risk of training our model on data that is no longer relevant. If we use too little data, we run the risk of building an under-fit model. This raises the question: is there an ideal amount of data to include in machine learning models for financial prediction? I don’t know, but I doubt the answer is clear cut, since we never know when the underlying process is about to undergo significant change. I hypothesise that it makes sense to use the minimum amount of data that leads to acceptable model performance, and testing this is the subject of this post.

How Much Data?

In classical data science applications, model performance generally improves as the amount of training data increases. However, as mentioned above, due to the non-IID nature of financial data, this happy assumption does not necessarily hold. My theory is that using too much data (that is, a training window that extends too far into the past) is actually detrimental to model performance.

In order to explore this idea, I decided to build a model based on previous asset returns and measures of volatility. The volatility measure that I used is the 5-period Average True Range (ATR) minus the 20-period ATR normalized over the last 50 periods. The data used is the EUR/USD daily exchange rate sampled at 9:00am GMT between 2006 and 2016.
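
For concreteness, here is a sketch of how that volatility feature might be built with the TTR package. I’m assuming OHLC columns named High, Low and Close, and I’ve interpreted “normalized over the last 50 periods” as a rolling z-score; the original calculation may differ in detail.

```r
# Sketch of the volatility feature: ATR(5) minus ATR(20), normalised over the
# trailing 50 periods (here a rolling z-score, which is an assumption).
library(TTR)

make_vol_feature <- function(ohlc) {
  hlc    <- ohlc[, c("High", "Low", "Close")]
  atr5   <- ATR(hlc, n = 5)[, "atr"]
  atr20  <- ATR(hlc, n = 20)[, "atr"]
  spread <- atr5 - atr20
  (spread - runMean(spread, 50)) / runSD(spread, 50)
}
```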

The model used the previous three values of the returns and volatility series as the input features, and the next day’s market direction as the target. I trained a simple two-class logistic regression model using R’s glm function with a time-series cross-validation approach. This involves training the model on a window of data and predicting the outcome of the next period, then shifting the training window forward in time by one period; the model is retrained on the new window and the following period’s outcome predicted, and the process is repeated along the length of the time series. The cross-validated performance of the model is simply the performance of these next-day predictions under some suitable measure; I recorded the profit factor and Sharpe ratio of the model’s predictions. I used class probabilities to determine the positions for the next day as follows (a code sketch of the full procedure appears after the rules below):

if P_{up} >= 0.55, go long at open

if P_{down} >= 0.55, go short at open

if 0.45 < P_{up} < 0.55 (equivalent to 0.45 < P_{down} < 0.55), remain flat

where P_{up} and P_{down} are the calculated probabilities for the next day’s market direction to be positive and negative respectively.

Positions were liquidated at the close.
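
To make the mechanics concrete, here is a minimal sketch of the walk-forward loop in R. This is not my actual code (that is provided at the end of the post); the data frame layout, column names and helper function are assumptions, and the vol column is the normalised ATR spread sketched earlier.

```r
# Minimal sketch of the walk-forward logistic regression. `df` is assumed to
# hold one row per day with columns ret (open-to-close return) and vol (the
# normalised ATR spread); these names are illustrative, not from the original code.
lag_n <- function(x, k) c(rep(NA, k), head(x, -k))

walk_forward <- function(df, window = 50, threshold = 0.55) {
  feats <- data.frame(
    r1 = lag_n(df$ret, 1), r2 = lag_n(df$ret, 2), r3 = lag_n(df$ret, 3),
    v1 = lag_n(df$vol, 1), v2 = lag_n(df$vol, 2), v3 = lag_n(df$vol, 3),
    up  = as.integer(df$ret > 0),   # target: today's direction, predicted from lags
    ret = df$ret                    # realised return earned by today's position
  )
  feats <- na.omit(feats)
  n   <- nrow(feats)
  pos <- numeric(n)
  for (i in seq(window, n - 1)) {
    train <- feats[(i - window + 1):i, setdiff(names(feats), "ret")]
    fit   <- glm(up ~ ., data = train, family = binomial)
    p_up  <- predict(fit, newdata = feats[i + 1, ], type = "response")
    pos[i + 1] <- if (p_up >= threshold) 1 else if (p_up <= 1 - threshold) -1 else 0
  }
  data.frame(position = pos, ret = feats$ret)
}
```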

In order to investigate the effect of the size of the data window, I varied its length between 15 and 1,600 days and recorded the cross-validated performance in each case. I also recorded the average in-sample performance on each of the training windows. Slicing up the data so that the various cross-validation samples were consistent across window lengths took some effort, but this wrangling was made simpler thanks to Max Kuhn (to whom I once again tip my hat) and his caret package.
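
A sketch of that sweep, reusing the hypothetical walk_forward() above, might look like the following. Profit factor is gross profits divided by gross losses, and I’ve annualised the Sharpe ratio with sqrt(252); both the annualisation and the particular window lengths shown are my assumptions rather than details from the original experiment.

```r
# Sweep the training window length and record profit factor and Sharpe ratio
# of the walk-forward predictions (uses the hypothetical walk_forward() above).
profit_factor <- function(r) sum(r[r > 0]) / abs(sum(r[r < 0]))
sharpe_ratio  <- function(r) mean(r) / sd(r) * sqrt(252)   # annualisation assumed

windows <- c(15, 25, 50, 75, 100, 200, 400, 800, 1600)
sweep <- do.call(rbind, lapply(windows, function(w) {
  wf <- walk_forward(df, window = w)
  r  <- wf$position * wf$ret
  data.frame(window = w, pf = profit_factor(r), sharpe = sharpe_ratio(r))
}))
sweep
```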

The results are presented below.

IS-CV Profit Factor (chart: in-sample and cross-validated profit factor as a function of training window length)


We can see that for the smallest window lengths, the in-sample performance greatly exceeds the cross-validated performance. In other words, when we use very little data, the model fits the training data well, but fails to generalize out of sample. It has a variance problem, which is what we would expect.

Then things get interesting. As we add slightly more data in the form of a longer training window, the in-sample performance decreases but the cross-validated performance increases, quickly rising to meet the in-sample performance. In-sample and cross-validated performance are very similar for window lengths between roughly 25 and 75 days. This is an important result: when the cross-validated performance approximates the in-sample performance, we can conclude that the model is capturing the underlying signal and is therefore likely to generalise well. Encouragingly, this performance is reasonably robust across the approximate window range of 25-75 days. If only a single window length showed reasonable cross-validated performance, I wouldn’t trust that it wasn’t due to randomness; the existence of a whole region of reasonable performance gives us some confidence in the results.

As we add yet more data to our training window, we can see that the in-sample performance continues to deteriorate, eventually reaching a lower limit, and that the cross-validated performance likewise continues to decline, with a notable exception around 500 days. This suggests that as we increase the training window length, the model develops a bias problem and underfits the data.

These results are perhaps confounded by the fact that the optimal window length may be a characteristic of this particular market and the particular 10-year period used in this experiment. Actually, I feel this is quite likely. I haven’t run this experiment on other markets or time periods yet, but I strongly suspect that each market will exhibit different optimal window lengths, and that these will probably themselves vary with time. Notwithstanding this, it appears that we can at least conclude that in finance, more data is not necessarily better.

Equity Curves

I know how much algorithmic traders like to see an equity curve, so here is the model’s performance for a selection of window lengths, along with the buy-and-hold equity curve of the underlying. Transaction costs are not included.


In this case, the absolute performance is nothing spectacular*. However, it demonstrates the differences in the quality of the predictions obtained using different window lengths for training the models. We can clearly see that more is not necessarily better, at least for this particular period of time.
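
For reference, here is a sketch of how such curves could be generated from the hypothetical walk_forward() output above, using base R plotting and, consistent with the above, no transaction costs. The window lengths shown are illustrative.

```r
# Cumulative-return curves for a few window lengths plus buy-and-hold,
# built from the hypothetical walk_forward() sketch above. No transaction costs.
equity <- function(r) cumprod(1 + r)

curves <- sapply(c(30, 100, 500), function(w) {
  wf <- walk_forward(df, window = w)
  equity(wf$position * wf$ret)
})
matplot(curves, type = "l", lty = 1,
        xlab = "Day", ylab = "Equity", main = "Walk-forward equity curves")
lines(equity(tail(df$ret, nrow(curves))), col = "grey40", lwd = 2)  # buy and hold
legend("topleft", legend = c("30", "100", "500", "buy & hold"),
       lty = 1, col = c(1:3, "grey40"), lwd = c(1, 1, 1, 2))
```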

Performance as a Function of Class Probability Threshold

It is also interesting to investigate how performance varies across the different window lengths as a function of the class probability threshold used in the trading decisions. Here is a heatmap of the model’s Sharpe ratio for various window lengths and class probability thresholds.


We can see a fairly obvious region of higher Sharpe ratios for lower window lengths and generally increasing class probability threshold. The region of the higher Sharpe ratios for longer window lengths and higher class probabilities (the upper right corner) is actually slightly misleading, since the number of trades taken for these model configurations is vanishingly small. However, we can see that when those trades do occur, they tend to be of a higher quality.
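
A sketch of how that grid might be produced and plotted follows, again reusing the hypothetical walk_forward() from earlier and ggplot2 for the heatmap. The particular window lengths and thresholds are illustrative, and the Sharpe ratio is annualised with sqrt(252) as before.

```r
# Sharpe ratio over a grid of window lengths and class probability thresholds,
# plotted as a heatmap. Slow to run: one walk-forward backtest per grid cell.
library(ggplot2)

grid <- expand.grid(window    = c(15, 30, 50, 100, 250, 500, 1000),
                    threshold = seq(0.50, 0.70, by = 0.05))
grid$sharpe <- mapply(function(w, th) {
  wf <- walk_forward(df, window = w, threshold = th)
  r  <- wf$position * wf$ret
  if (sd(r) == 0) NA else mean(r) / sd(r) * sqrt(252)
}, grid$window, grid$threshold)

ggplot(grid, aes(x = factor(window), y = factor(threshold), fill = sharpe)) +
  geom_tile() +
  labs(x = "Training window (days)", y = "Class probability threshold", fill = "Sharpe")
```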

Finally, here are several equity curves for a window length of 30 days and various class probability thresholds.



This post investigated the effect of varying the length of the training window on the performance of a simple logistic regression model for predicting the next-day direction of the EUR/USD exchange rate. The results indicate that more data does not necessarily lead to a better predictive model. In fact, there may be a case for using a relatively small window of training data in order to force the model to continuously re-learn and adapt to the most recent market conditions. There is a trade-off to contend with: very small windows exhibit vast differences between performance on the training set and performance on out-of-sample data, while very large windows perform poorly both in-sample and out-of-sample.

While absolute performance of the model was nothing to get excited about, the model used here was a very simple logistic regression classifier and minimal effort was spent on feature engineering. This suggests that the outcomes of this research could potentially be used in conjunction with more sophisticated algorithms and features to build a model with acceptable performance. This will be the subject of future posts.

The axiom ‘he who has the most data wins’ applies in many data science settings, but it doesn’t appear to hold when building predictive models for the financial markets. Rather, the research presented here suggests that the development and engineering of the model itself may play a far larger role in out-of-sample performance. That implies model performance is more a function of the skill of the developer than of the ability to obtain as much data as possible. I find that to be a very satisfying conclusion.

Source Code

Here’s some source code if you are interested in reproducing my results. Warning: it is slightly hacky and takes a long time to run if you store all the in-sample performance data! By default I have commented out that part of the code.

*Of course, building a production trading model is not the point of the exercise. Apologies for pointing this out; I know most of you already understand this, but I invariably get emails after every post from people questioning the performance of the ‘trading algorithms’ I post on my blog. Just to be clear, I am not posting trading algorithms!! I am sharing my research. Performance on market data, particularly relative performance, is a quick and easy way to interpret the results of this research. I don’t intend for anyone (myself included) to use the simple logistic regression model presented here in a production environment. However, I do intend to use the concepts presented in this post to improve my existing models or build entirely new ones. There is more than enough information in this post for you to do the same, if you so desired.