# Kalman Filter Pairs Trading with Zorro and R: Putting it all together

Posted on Oct 16, 2019 by

The first three posts of this mini-series on pairs trading with Zorro and R covered implementing the Kalman filter in R, implementing a pairs trading algorithm in Zorro, and getting Zorro and R to talk to one another.

In this fourth and final post, we’re going to put it all together and develop a pairs trading script that uses Zorro for all the simulation aspects (data handling, position tracking, performance reporting and the like) and our Kalman implementation for updating the hedge ratio in real time.

## Step 1: Encapsulate our Kalman routine in a function

Encapsulating our Kalman routine in a function makes it easy to call from our Zorro script – it reduces the call to a single line of code.

Save the following R script, which implements the iterative Kalman operations using data sent from Zorro, in your Zorro strategy folder as kalman.R (the file name the Zorro script passes to Rstart):

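    ###### KALMAN FILTER #######

    delta <- 0.0001                  # sets the size of the state noise
    Vw <- delta/(1-delta)*diag(2)    # state noise covariance
    Ve <- 0.01                       # measurement noise variance
    R <- matrix(rep(0, 4), nrow=2)   # predicted state covariance (kept between calls)
    P <- matrix(rep(0, 4), nrow=2)   # updated state covariance (kept between calls)

    kalman_iterator <- function(y, x, beta) {
      beta <- matrix(c(beta, 0), nrow=1)   # state: hedge ratio from Zorro, intercept restarted at 0
      x <- matrix(c(x, 1), nrow=1)         # observation row: price and a constant
      R <<- P + Vw # state cov prediction

      y_est <- x[1, ] %*% beta[1, ] # measurement prediction
      Q <- x[1, ] %*% R %*% x[1, ] + Ve # measurement variance prediction

      # error between observation of y and prediction
      e <- y - y_est
      K <- R %*% t(x) / drop(Q) # Kalman gain

      # state update
      beta <- beta[1, ] + K * e[1, ]
      P <<- R - K %*% x[1, ] %*% R

      # updated hedge ratio, prediction error, prediction error variance
      return(list(beta[1], e, Q))
    }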

Recall that this implementation of the Kalman filter is almost parameterless. There are, however, two parameters that affect how quickly the Kalman algorithm updates the hedge ratio: delta in line 3 and Ve in line 5.
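For reference, one call to kalman_iterator can be written as a single predict/update step. This is only a transcription of the R script into equations; the notation is assumed here, with $x_t = (x, 1)$ the observation row and $\beta_t$ the state (hedge ratio and intercept):

$$
\begin{aligned}
R_t &= P_{t-1} + V_w, \qquad V_w = \frac{\delta}{1-\delta} I_2 \\
\hat{y}_t &= x_t \beta_{t-1} \\
Q_t &= x_t R_t x_t^{\top} + V_e \\
e_t &= y_t - \hat{y}_t \\
K_t &= R_t x_t^{\top} / Q_t \\
\beta_t &= \beta_{t-1} + K_t e_t \\
P_t &= R_t - K_t x_t R_t
\end{aligned}
$$

Note that in the script the incoming state is (beta, 0): only the hedge ratio is passed back in from Zorro on each call. A smaller delta shrinks $V_w$, so $R_t$ and the gain $K_t$ stay small and the hedge ratio moves slowly. A larger Ve inflates $Q_t$, which also reduces the gain.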

You can experiment with these parameters, but note that changes here will generally require changes in the Zorro script, such as the spacing between trade levels (more on this below).

Experimentation is a good thing (it’s useful to understand how these parameters impact the algorithm), but a nice, stable pair trade should be relatively robust to changes in these parameters. A pair that depends on just the right values of these parameters is one I’d think twice about trading.

Having said that, a sensible use of these parameters is to adjust the trade frequency of your pairs in line with your transaction costs and risk management approach (not to optimise the strategy’s backtested performance).
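To get a feel for how delta changes the speed of adaptation, here is a minimal sketch on synthetic data. It is not the script used with Zorro: kalman_step and run_filter are hypothetical helpers that pass the state explicitly instead of using globals and keep the intercept between steps, and the price series and the jump in the true hedge ratio are made up.

```r
# One predict/update step with explicit state (no globals).
# beta: length-2 state (slope, intercept); P: 2x2 state covariance.
kalman_step <- function(y, x, beta, P, delta = 1e-4, Ve = 0.01) {
  Vw <- delta / (1 - delta) * diag(2)   # state noise covariance
  H  <- matrix(c(x, 1), nrow = 1)       # observation row (price, constant)
  R  <- P + Vw                          # state covariance prediction
  Q  <- drop(H %*% R %*% t(H)) + Ve     # measurement variance prediction
  e  <- y - drop(H %*% beta)            # prediction error
  K  <- R %*% t(H) / Q                  # Kalman gain (2x1)
  list(beta = beta + drop(K) * e, P = R - K %*% H %*% R, e = e, Q = Q)
}

# Run the filter over a whole series, starting from beta = 0 and P = 0.
run_filter <- function(y, x, delta) {
  beta <- c(0, 0)
  P <- matrix(0, 2, 2)
  slope <- numeric(length(y))
  err <- numeric(length(y))
  for (t in seq_along(y)) {
    s <- kalman_step(y[t], x[t], beta, P, delta = delta)
    beta <- s$beta
    P <- s$P
    slope[t] <- beta[1]
    err[t] <- s$e
  }
  list(slope = slope, e = err)
}

# Synthetic pair: the true hedge ratio jumps from 0.5 to 0.8 at bar 101.
x <- seq(95, 105, length.out = 200)
true_beta <- c(rep(0.5, 100), rep(0.8, 100))
y <- true_beta * x

slow <- run_filter(y, x, delta = 1e-8)
fast <- run_filter(y, x, delta = 1e-4)

# Mean absolute prediction error over the 20 bars after the jump
post_jump <- c(slow = mean(abs(slow$e[101:120])),
               fast = mean(abs(fast$e[101:120])))
print(post_jump)
```

The filter with the larger delta should show the smaller post-jump error because it re-learns the hedge ratio in fewer bars. The flip side is that it also chases noise more readily, which is why a change in delta usually calls for a change in the spacing of the trade levels.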

## Step 2: Set up a Zorro pair trading script to exchange data with R

Here’s our simple pairs trading script modified to call the Kalman iterator function to update the hedge ratio. To experiment with this Zorro script you’ll need:

• an Alpha Vantage API key (we load price history directly from Alpha Vantage)
• to set up trading conditions in a Zorro assets list (although if you don’t want to model costs, you don’t need to do this)

    /* KALMAN PAIRS TRADING

    */

    #include <r.h>

    #define Asset1 "GDX"
    #define Asset2 "GLD"
    #define MaxLevels 1
    #define Spacing 1
    // #define COSTS

    int Portfolio_Units = 1000; //units of the portfolio to buy/sell (more --> better fidelity to hedge ratio)
    var Levels[MaxLevels]; // trade levels, in units of the prediction error

    function run()
    {
        set(PLOTNOW);
        setf(PlotMode, PL_FINE);
        StartDate = 20060525;
        EndDate = 2019;
        BarPeriod = 1440;
        LookBack = 1;

        // ---------------------------------------
        // start R and load price history
        // ---------------------------------------
        if(is(INITRUN))
        {
            // start R and source the kalman iterator function
            if(!Rstart("kalman.R", 2))
            {
                print("Error - can't start R session!");
                quit();
            }

            // load data from Alpha Vantage
            string Name;
            int n = 0;
            while(Name = loop(Asset1, Asset2))
            {
                assetHistory(Name, FROM_AV);
                n++;
            }
        }

        // ---------------------------------------
        // calculate hedge ratio and trade levels
        // ---------------------------------------
        asset(Asset1);
    #ifndef COSTS
        Spread = Commission = Slippage = 0;
    #endif
        vars prices1 = series(priceClose());

        asset(Asset2);
    #ifndef COSTS
        Spread = Commission = Slippage = 0;
    #endif
        vars prices2 = series(priceClose());

        static var beta;
        if(is(INITRUN)) beta = 0;

        // use kalman iterator to calculate parameters
        Rset("y", prices1[0]);
        Rset("x", prices2[0]);
        Rset("beta", beta);
        Rx("kalman <- kalman_iterator(y, x, beta)");
        beta = Rd("kalman[[1]][1]");          // updated hedge ratio
        vars e = series(Rd("kalman[[2]]"));   // prediction error
        var Q = Rd("kalman[[3]]");            // prediction error variance

        // levels are multiples of the prediction error's standard deviation
        int i;
        for(i=0; i<MaxLevels; i++)
        {
            Levels[i] = (i+1)*Spacing*sqrt(Q);
        }

        // ---------------------------------------
        // trade logic
        // ---------------------------------------

        // enter positions at defined levels
        for(i=0; i<MaxLevels; i++)
        {
            if(crossUnder(e, -Levels[i]))
            {
                asset(Asset1);
                Lots = Portfolio_Units;
                enterLong();
                asset(Asset2);
                Lots = Portfolio_Units * beta;
                enterShort();
            }
            if(crossOver(e, Levels[i]))
            {
                asset(Asset1);
                Lots = Portfolio_Units;
                enterShort();
                asset(Asset2);
                Lots = Portfolio_Units * beta;
                enterLong();
            }
        }

        // exit positions at defined levels
        for(i=0; i<MaxLevels; i++)
        {
            if(crossOver(e, -Levels[i]))
            {
                asset(Asset1);
                exitLong(0, 0, Portfolio_Units);
                asset(Asset2);
                exitShort(0, 0, Portfolio_Units * beta);
            }
            if(crossUnder(e, Levels[i]))
            {
                asset(Asset1);
                exitShort(0, 0, Portfolio_Units);
                asset(Asset2);
                exitLong(0, 0, Portfolio_Units * beta);
            }
        }

        // ---------------------------------------
        // plots
        // ---------------------------------------
        plot("beta", beta, NEW, PURPLE);
        if(abs(e[0]) < 20)
        {
            plot("error", e, NEW, BLUE);
            for(i=0; i<MaxLevels; i++)
            {
                plot(strf("#level_%d", i), Levels[i], 0, BLACK);
                plot(strf("#neglevel_%d", i), -Levels[i], 0, BLACK);
            }
        }
    }

As in our original vectorised backtest, this strategy is always in the market: it enters a long position when the prediction error of the Kalman filter drops below its minus one standard deviation level and holds it until the prediction error crosses above its plus one standard deviation level, at which point the trade is reversed and a short position held.
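The comment on Portfolio_Units in the script notes that more units give better fidelity to the hedge ratio. A rough illustration of why, assuming the lot count for the second leg is truncated to a whole number when Portfolio_Units * beta is assigned to Lots (hedge_lots is a hypothetical helper and the hedge ratio below is made up):

```r
# Integer lot count for the hedge leg, truncated toward zero
hedge_lots <- function(units, beta) trunc(units * beta)

beta  <- 0.2137                 # illustrative hedge ratio, not a fitted value
units <- c(10, 100, 1000)       # candidate portfolio sizes
lots  <- hedge_lots(units, beta)

# Hedge ratio actually achieved once the lots are whole numbers
sizing <- data.frame(units, lots, effective_ratio = lots / units)
print(sizing)
```

With only a handful of units the achieved ratio can sit a long way from the Kalman estimate. With a thousand units the rounding error is small relative to the estimate itself.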

This is not the optimal way to trade a spread, so we’ve left the door open to trade at multiple levels (line 9) with a user-specified spacing between levels (line 10).
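The level logic is easy to check outside Zorro. The sketch below mirrors the Levels[i] = (i+1)*Spacing*sqrt(Q) calculation in R and uses simplified stand-ins for Zorro's crossOver and crossUnder that compare only the previous and current bar; trade_levels, cross_under and cross_over are hypothetical names, and the values of Q and the prediction errors are invented.

```r
# Level i sits at i * spacing * sqrt(Q), for i = 1..max_levels
trade_levels <- function(Q, spacing = 1, max_levels = 1) {
  seq_len(max_levels) * spacing * sqrt(Q)
}

# TRUE when the series moves from at-or-above the level to below it
cross_under <- function(prev, curr, level) prev >= level & curr < level
# TRUE when the series moves from at-or-below the level to above it
cross_over <- function(prev, curr, level) prev <= level & curr > level

Q <- 4                                   # illustrative prediction error variance
levels <- trade_levels(Q, spacing = 0.5, max_levels = 3)

e_prev <- -0.8                           # prediction error on the previous bar
e_curr <- -2.3                           # prediction error on the current bar

# Which long-entry levels (the negative levels) were breached on this bar
long_entries <- cross_under(e_prev, e_curr, -levels)
print(levels)
print(long_entries)
```

Because the levels scale with sqrt(Q), they widen and narrow with the filter's own estimate of the prediction error variance, so the spacing is always expressed in standard deviations and not in price units.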

But before we get to that, there’s an important box we need to tick…

## Step 3: Reproduce original results

Before we go further, we’ll aim to reproduce the results we got in the vectorised backtest we wrote in R way back in the first post of this series. That way, we can validate that our Zorro setup is working as expected.

This is an important (and easily overlooked) step because we’ll surely tinker with the strategy implementation (Zorro is really useful for efficiently doing that sort of experimentation), and we need to have confidence in our setup before we make any changes, do further research, and make decisions based on what we find.

If you’ve ever had to rewind a whole bunch of research because of a faulty implementation at the outset, you know what I’m talking about.

We’d expect some differences since Zorro provides an event-driven sequential backtester with very different assumptions to my hacky vectorised backtest. But we should see consistency in the hedge ratio, the positions taken, and the shape of the equity curve.
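If you export the hedge ratio from both backtests, the consistency check can be made numeric as well as visual. A minimal sketch, assuming you have the two series as equal-length numeric vectors aligned by date (compare_series is a hypothetical helper and the numbers below are invented purely to show the call):

```r
# Summarise how closely two aligned series agree
compare_series <- function(zorro, vectorised) {
  stopifnot(length(zorro) == length(vectorised))
  c(max_abs_diff = max(abs(zorro - vectorised)),
    correlation  = cor(zorro, vectorised))
}

# Made-up hedge ratios standing in for the two backtests' outputs
beta_r     <- c(0.30, 0.31, 0.33, 0.32, 0.35)
beta_zorro <- beta_r + c(0, 0.001, -0.001, 0, 0.002)

res <- compare_series(beta_zorro, beta_r)
print(res)
```

What counts as "close enough" is a judgment call. Small differences are expected given the different backtest assumptions, but a large maximum difference or a low correlation points to an implementation problem worth chasing down before doing any further research.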

Here’s the Zorro output when we trade at one standard deviation of the prediction errors:

The hedge ratio, prediction errors, positions and equity curve shape all look very similar to the original vectorised R version.

We also ran a more aggressive version through our vectorised backtester, which traded at half a standard deviation of the prediction errors. Here’s what that looks like in Zorro (simply change line 10 to #define Spacing 0.5):

Again, virtually identical to the output of our vectorised backtest.

I’m calling that a win. Time to move on to some fun stuff.

## Step 4: Iterate on the algorithm design

There are a bunch of things we can try with our pairs trading implementation. A few of them include:

• Exiting positions when the prediction error crosses zero
• Limiting the hold time of individual positions (that is, closing out early if the spread hasn’t converged fast enough)
• Entering at multiple levels
• Using more or less aggressive entry level spacing

Here’s an example of trading quite aggressively every 0.25 standard deviations of the prediction error, up to a maximum of eight levels:
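To see what a ladder like this implies for position size, the sketch below counts how many levels the prediction error currently sits beyond. It is a simplified, stateless view and not the crossing logic of the Zorro script; target_units is a hypothetical helper and the error values are invented.

```r
# Signed number of levels the prediction error currently sits beyond.
# Negative error -> long the spread, positive error -> short the spread.
target_units <- function(e, Q, spacing = 0.25, max_levels = 8) {
  levels <- seq_len(max_levels) * spacing * sqrt(Q)
  breached <- sum(abs(e) >= levels)
  -sign(e) * breached
}

Q <- 1                                    # illustrative variance, so sqrt(Q) = 1
errors <- c(-2.5, -0.6, 0.1, 0.3, 1.0)    # illustrative prediction errors
ladder <- sapply(errors, target_units, Q = Q)
print(ladder)
```

Each unit here corresponds to one Portfolio_Units-sized trade in the script, so an error beyond all eight levels implies eight times the single-level exposure. That is worth bearing in mind when thinking about the risk of such an aggressive setup.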

Of course, when you trade like this you’re going to pay a ton in fees. But it gives you a taste of the sorts of things you can experiment with using this framework.

## Conclusion

This concludes our mini-series on pairs trading with Zorro and R via the Kalman filter. We saw how you might:

• Implement the Kalman filter in R
• Implement a pairs trading algorithm in Zorro
• Make Zorro and R talk to one another
• Put it all together in an integrated pairs trading strategy

We’d love to know what you thought of the series in the comments. In particular, can you suggest any pairs you’d like to see us test? Can you suggest any improvements to the pairs trading algorithm itself? Are there any other approaches you’d like us to implement or test?
