In the first three posts of this mini-series on pairs trading with Zorro and R, we:
- Implemented a Kalman filter in R
- Implemented a simple pairs trading algorithm in Zorro
- Connected Zorro and R and exchanged data between the two platforms
In this fourth and final post, we’re going to put it all together and develop a pairs trading script that uses Zorro for all the simulation aspects (data handling, position tracking, performance reporting and the like) and our Kalman implementation for updating the hedge ratio in real time.
You can download the exact script used in this post for free down at the very bottom. Let’s go!
Step 1: Encapsulate our Kalman routine in a function
Encapsulating our Kalman routine in a function makes it easy to call from our Zorro script – it reduces the call to a single line of code.
Save the following R script, which implements the iterative Kalman operations using data sent from Zorro, in your Zorro strategy folder as kalman.R (the file name the Zorro script in Step 2 passes to Rstart):
    ###### KALMAN FILTER #######

    delta <- 0.0001
    Vw <- delta/(1-delta)*diag(2)
    Ve <- 0.01

    R <- matrix(rep(0, 4), nrow=2)
    P <- matrix(rep(0, 4), nrow=2)

    kalman_iterator <- function(y, x, beta) {
      beta <- matrix(c(beta, 0), nrow=1)
      x <- matrix(c(x, 1), nrow=1)

      R <<- P + Vw # state cov prediction
      y_est <- x[1, ] %*% beta[1, ] # measurement prediction
      Q <- x[1, ] %*% R %*% x[1, ] + Ve # measurement variance prediction

      # error between observation of y and prediction
      e <- y - y_est
      K <- R %*% t(x) / drop(Q) # Kalman gain

      # state update
      beta <- beta[1, ] + K * e[1, ]
      P <<- R - K %*% x[1, ] %*% R

      return(list(beta[1], e, Q))
    }
Recall that this implementation of the Kalman filter is almost parameterless. There are, however, two parameters that affect how quickly the Kalman algorithm updates the hedge ratio: delta (line 3) and Ve (line 5).
You can experiment with these parameters, but note that changes here will generally require changes in the Zorro script, such as the spacing between trade levels (more on this below).
Experimentation is a good thing (it’s useful to understand how these parameters impact the algorithm), but a nice, stable pair trade should be relatively robust to changes in these parameters. A pair that depends on just the right values of these parameters is one I’d think twice about trading.
Having said that, a sensible use of these parameters is to adjust the trade frequency of your pairs in line with your transaction costs and risk management approach (not to optimise the strategy’s backtested performance).
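To get a feel for how delta and Ve drive the update speed, here's a rough Python sketch of a single update step. It's a port of the R function above written for illustration only (it's not the code used for the results in this post), the function name kalman_step and the prices are made up, and it uses plain lists rather than a matrix library. Like the R version, it restarts the intercept at zero on every call and hands back only the slope.

```python
def kalman_step(y, x, beta, P, delta=1e-4, Ve=0.01):
    """One update of the regression y ~ beta * x + intercept.

    P is the 2x2 state covariance carried between calls (nested lists).
    As in the R function, the intercept estimate restarts at 0 on each call
    and only the slope is returned.
    """
    vw = delta / (1.0 - delta)                 # process noise variance
    # State covariance prediction: R = P + vw * I
    R = [[P[0][0] + vw, P[0][1]],
         [P[1][0], P[1][1] + vw]]
    obs = (x, 1.0)                             # observation vector [x, 1]
    y_est = obs[0] * beta                      # intercept term is 0
    # R times obs (as a column) and obs (as a row) times R
    R_obs = (R[0][0] * obs[0] + R[0][1] * obs[1],
             R[1][0] * obs[0] + R[1][1] * obs[1])
    obs_R = (obs[0] * R[0][0] + obs[1] * R[1][0],
             obs[0] * R[0][1] + obs[1] * R[1][1])
    Q = obs[0] * R_obs[0] + obs[1] * R_obs[1] + Ve   # measurement variance
    e = y - y_est                              # prediction error
    K = (R_obs[0] / Q, R_obs[1] / Q)           # Kalman gain
    beta_new = beta + K[0] * e                 # slope update
    P_new = [[R[i][j] - K[i] * obs_R[j] for j in range(2)] for i in range(2)]
    return beta_new, e, Q, P_new


# Made-up prices: y = 20, x = 10, starting from beta = 0 and P = 0.
for delta in (1e-5, 1e-4, 1e-3):
    zero_P = [[0.0, 0.0], [0.0, 0.0]]
    beta_1, e_1, Q_1, _ = kalman_step(20.0, 10.0, 0.0, zero_P, delta=delta)
    print(f"delta={delta:g}  first-step beta={beta_1:.4f}  sqrt(Q)={Q_1 ** 0.5:.4f}")
```

On the very first step the slope gain works out to vw*x / (vw*(x^2 + 1) + Ve), with vw = delta/(1-delta), so a larger delta or a smaller Ve moves the hedge ratio further per observation. It also changes sqrt(Q), which is why the trade level spacing in the Zorro script usually needs revisiting when you touch these values.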
Step 2: Set up a Zorro pair trading script to exchange data with R
Here’s our simple pairs trading script modified to call the Kalman iterator function to update the hedge ratio. To experiment with this Zorro script you’ll need:
- an Alpha Vantage API key (we load price history directly from Alpha Vantage)
- to set up trading conditions in a Zorro assets list (although if you don’t want to model costs, you don’t need to do this)
    /* KALMAN PAIRS TRADING

    */

    #include <r.h>

    #define Asset1 "GDX"
    #define Asset2 "GLD"
    #define MaxTrades 1
    #define Spacing 1
    // #define COSTS

    int Portfolio_Units = 1000; //units of the portfolio to buy/sell (more --> better fidelity to hedge ratio)

    var calculate_spread(var hedge_ratio)
    {
        var spread = 0;
        asset(Asset1);
    #ifndef COSTS
        Spread = Commission = Slippage = 0;
    #endif
        spread += priceClose();
        asset(Asset2);
    #ifndef COSTS
        Spread = Commission = Slippage = 0;
    #endif
        spread -= hedge_ratio*priceClose();

        return spread;
    }

    function run()
    {
        set(PLOTNOW);
        setf(PlotMode, PL_FINE);
        StartDate = 20060525;
        EndDate = 2019;
        BarPeriod = 1440;
        LookBack = 1;
        MaxLong = MaxShort = MaxTrades;

        // ---------------------------------------
        // Startup and data loading
        // ---------------------------------------
        if(is(INITRUN))
        {
            // start R and source the kalman iterator function
            if(!Rstart("kalman.R", 2))
            {
                print("Error - can't start R session!");
                quit();
            }

            // load data from Alpha Vantage
            string Name;
            int n = 0;
            while(Name = loop(Asset1, Asset2))
            {
                assetHistory(Name, FROM_AV);
                n++;
            }
        }

        // ---------------------------------------
        // calculate hedge ratio and trade levels
        // ---------------------------------------
        asset(Asset1);
    #ifndef COSTS
        Spread = Commission = Slippage = 0;
    #endif
        vars prices1 = series(priceClose());

        asset(Asset2);
    #ifndef COSTS
        Spread = Commission = Slippage = 0;
    #endif
        vars prices2 = series(priceClose());

        static var beta;
        if(is(INITRUN)) beta = 0;

        // use kalman iterator to calculate parameters
        Rset("y", prices1[0]);
        Rset("x", prices2[0]);
        Rset("beta", beta);
        Rx("kalman <- kalman_iterator(y, x, beta)");
        beta = Rd("kalman[[1]][1]");          // updated hedge ratio
        vars e = series(Rd("kalman[[2]]"));   // prediction error
        var Q = Rd("kalman[[3]]");            // variance of the prediction error

        // set up trade levels
        var Levels[MaxTrades];
        int i;
        for(i=0; i<MaxTrades; i++)
        {
            Levels[i] = (i+1)*Spacing*sqrt(Q);
        }

        // ---------------------------------------
        // trade logic
        // ---------------------------------------

        // enter positions at defined levels
        for(i=0; i<MaxTrades; i++)
        {
            if(crossUnder(e, -Levels[i]))
            {
                asset(Asset1);
                Lots = Portfolio_Units;
                enterLong();
                asset(Asset2);
                Lots = Portfolio_Units * beta;
                enterShort();
            }
            if(crossOver(e, Levels[i]))
            {
                asset(Asset1);
                Lots = Portfolio_Units;
                enterShort();
                asset(Asset2);
                Lots = Portfolio_Units * beta;
                enterLong();
            }
        }

        // exit positions at defined levels
        for(i=1; i<MaxTrades-1; i++)
        {
            if(crossOver(e, -Levels[i]))
            {
                asset(Asset1);
                exitLong(0, 0, Portfolio_Units);
                asset(Asset2);
                exitShort(0, 0, Portfolio_Units * beta);
            }
            if(crossUnder(e, Levels[i]))
            {
                asset(Asset1);
                exitShort(0, 0, Portfolio_Units);
                asset(Asset2);
                exitLong(0, 0, Portfolio_Units * beta);
            }
        }

        // ---------------------------------------
        // plots
        // ---------------------------------------
        plot("beta", beta, NEW, PURPLE);
        if(abs(e[0]) < 20)
        {
            plot("error", e[0], NEW, BLUE);
            for(i=0; i<MaxTrades; i++)
            {
                plot(strf("#level_%d", i), Levels[i], 0, BLACK);
                plot(strf("#neglevel_%d", i), -Levels[i], 0, BLACK);
            }
        }
    }
As in our original vectorised backtest, this strategy is always in the market. It enters a long position when the prediction error of the Kalman filter drops below its minus one standard deviation level and holds it until the prediction error crosses above its plus one standard deviation level, at which point the trade is reversed and a short position is held.
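If it helps to see that always-in-the-market logic in isolation, here's a small Python sketch of it. It isn't taken from the Zorro script: the function name and the numbers are invented, and it assumes a "cross" means the previous bar sat on one side of the current level and the current bar sits on the other.

```python
import math

def reversal_positions(errors, variances, spacing=1.0):
    """Spread position per bar: +1 long, -1 short, 0 before the first signal."""
    positions = []
    pos = 0
    prev_e = None
    for e, q in zip(errors, variances):
        level = spacing * math.sqrt(q)         # one "standard deviation" band
        if prev_e is not None:
            if prev_e >= -level and e < -level:
                pos = 1                        # error crossed under the lower band
            elif prev_e <= level and e > level:
                pos = -1                       # error crossed over the upper band
        positions.append(pos)                  # otherwise keep the old position
        prev_e = e
    return positions

# Invented prediction errors with a constant variance of 1.
errors = [0.0, -1.5, -0.5, 0.5, 1.5, 0.2]
print(reversal_positions(errors, [1.0] * len(errors)))
```

Once the first signal fires, the position only ever flips between long and short. It never goes back to flat, which is what "always in the market" means here.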
This is not the optimal way to trade a spread, so we’ve left the door open to trade at multiple levels (MaxTrades, line 9) with a user-specified spacing between levels (Spacing, line 10).
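As a quick illustration of how those two settings combine, here's a Python sketch of the level ladder using the same formula as the script, (i+1) * Spacing * sqrt(Q). The helper names and the value of Q are made up for the example.

```python
import math

def entry_levels(q, max_trades, spacing):
    """Ladder of entry levels, in multiples of the prediction error's sd."""
    sd = math.sqrt(q)
    return [(i + 1) * spacing * sd for i in range(max_trades)]

def levels_breached(error, levels):
    """How many rungs of the ladder the current error sits beyond."""
    return sum(1 for level in levels if abs(error) > level)

# Invented example: Q = 4 (sd = 2), eight levels spaced 0.25 sd apart.
levels = entry_levels(q=4.0, max_trades=8, spacing=0.25)
print(levels)
print(levels_breached(-1.2, levels))
```

Because Q comes back from the Kalman update on every bar, the ladder widens and narrows with the filter's own estimate of the prediction error variance.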
But before we get to that, there’s an important box we need to tick…
Step 3: Reproduce original results
Before we go further, we’ll aim to reproduce the results we got in the vectorised backtest we wrote in R way back in the first post of this series. That way, we can validate that our Zorro setup is working as expected.
This is an important (and easily overlooked) step because we’ll surely tinker with the strategy implementation (Zorro is really useful for efficiently doing that sort of experimentation), and we need to have confidence in our setup before we make any changes, do further research, and make decisions based on what we find.
If you’ve ever had to rewind a whole bunch of research because of a faulty implementation at the outset, you know what I’m talking about…
We’d expect some differences since Zorro provides an event-driven sequential backtester with very different assumptions to my hacky vectorised backtest. But we should see consistency in the hedge ratio, the positions taken, and the shape of the equity curve.
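One way to make "consistency" a little less hand-wavy is to put numbers on it. The Python sketch below assumes you've exported the hedge ratio and the position per bar from both backtests and aligned them by date. The helper and the sample values are hypothetical; they are not output from either backtest in this series.

```python
def compare_runs(beta_a, beta_b, pos_a, pos_b):
    """Largest hedge ratio gap and share of bars with the same position."""
    if len(beta_a) != len(beta_b) or len(pos_a) != len(pos_b):
        raise ValueError("series must be aligned bar for bar")
    max_beta_gap = max(abs(a - b) for a, b in zip(beta_a, beta_b))
    same = sum(1 for a, b in zip(pos_a, pos_b) if a == b)
    return max_beta_gap, same / len(pos_a)

# Hypothetical exports from the vectorised and the event-driven backtest.
beta_vectorised = [0.500, 0.510, 0.520, 0.515]
beta_event = [0.500, 0.512, 0.519, 0.515]
pos_vectorised = [1, 1, -1, -1]
pos_event = [1, 1, 1, -1]

gap, agreement = compare_runs(beta_vectorised, beta_event, pos_vectorised, pos_event)
print(f"max beta gap: {gap:.4f}, position agreement: {agreement:.0%}")
```

A small number of disagreeing bars around the reversals is the sort of thing different fill assumptions could plausibly produce. A hedge ratio that drifts apart over time would point to something more fundamental.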
Here’s the Zorro output when we trade at one standard deviation of the prediction errors:
The hedge ratio, prediction errors, positions and equity curve shape all look very similar to the original vectorised R version.
We also ran a more aggressive version through our vectorised backtester, which traded at half a standard deviation of the prediction errors. Here’s what that looks like in Zorro (simply change line 10 to #define Spacing 0.5):
Again, virtually identical to the output of our vectorised backtest.
I’m calling that a win. Time to move on to some fun stuff.
Step 4: Iterate on the algorithm design
There are a bunch of things we can try with our pairs trading implementation. A few of them include:
- Exiting positions when the prediction error crosses zero
- Limiting the hold time of individual positions (that is, closing out early if the spread hasn’t converged fast enough)
- Entering at multiple levels
- Using more or less aggressive entry level spacing
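To make the first two ideas concrete, here's a Python sketch of a single-level version that exits when the prediction error crosses zero, or after a maximum number of bars if you set one. It's a sketch of the logic rather than Zorro code: the function name, the max_hold parameter and the sample numbers are all invented for illustration.

```python
def positions_with_exits(errors, levels, max_hold=None):
    """Position per bar (+1, -1 or 0) with zero-cross and optional time exits."""
    out = []
    pos, held = 0, 0
    for t, e in enumerate(errors):
        prev = errors[t - 1] if t > 0 else e
        level = levels[t]
        if pos != 0:
            held += 1
            crossed_zero = (pos == 1 and prev < 0 <= e) or (pos == -1 and prev > 0 >= e)
            timed_out = max_hold is not None and held >= max_hold
            if crossed_zero or timed_out:
                pos, held = 0, 0               # close the spread position
        if pos == 0 and t > 0:
            if prev >= -level and e < -level:
                pos = 1                        # enter long on a cross under the lower level
            elif prev <= level and e > level:
                pos = -1                       # enter short on a cross over the upper level
        out.append(pos)
    return out

# Invented prediction errors with a constant level of 1.
errors = [0.0, -1.2, -0.6, 0.1, 1.3, 0.4, -0.2]
levels = [1.0] * len(errors)
print(positions_with_exits(errors, levels))              # zero-cross exits only
print(positions_with_exits(errors, levels, max_hold=1))  # plus a one-bar hold limit
```

Unlike the always-in-the-market version, this one spends time flat, so it trades the convergence to zero rather than the full swing from one band to the other.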
Here’s an example of trading quite aggressively every 0.25 standard deviations of the prediction error, up to a maximum of eight levels:
Of course, when you trade like this you’re going to pay a ton in fees. But it gives you a taste of the sorts of things you can experiment with using this framework.
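For a rough sense of how quickly those fees add up, here's a back-of-the-envelope Python sketch. The per-unit cost and the position paths are made up, it counts one leg of the pair only, and it ignores the spread and slippage, so treat it as an illustration rather than a cost model.

```python
def turnover_cost(levels_held, units_per_level, cost_per_unit):
    """Total cost of moving between the given numbers of levels held."""
    cost = 0.0
    prev = 0
    for held in levels_held:
        cost += abs(held - prev) * units_per_level * cost_per_unit
        prev = held
    return cost

# Invented paths over a similar swing in the prediction error:
# signed number of levels held on each bar (positive = long the spread).
one_level = [0, 1, 1, 1, -1, -1, 0]
many_levels = [0, 1, 2, 3, 2, 1, 0, -1, 0]

for name, path in (("one level", one_level), ("many levels", many_levels)):
    total = turnover_cost(path, units_per_level=1000, cost_per_unit=0.005)
    print(f"{name}: {total:.2f}")
```

Every extra rung you scale into also has to be scaled out of, so turnover (and with it cost) grows with the number of levels you actually hit.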
Conclusion
This concludes our mini-series on pairs trading with Zorro and R via the Kalman filter. We saw how you might:
- Implement the Kalman filter in R
- Implement a pairs trading algorithm in Zorro
- Make Zorro and R talk to one another
- Put it all together in an integrated pairs trading strategy
We’d love to know what you thought of the series in the comments. In particular, can you suggest any pairs you’d like to see us test? Can you suggest any improvements to the pairs trading algorithm itself? Are there any other approaches you’d like us to implement or test?
Thanks for reading!