The post A Vector Autoregression Trading Model appeared first on Robot Wealth.

However, the VAR framework finds application in modelling correlated time series, and correlation implies some degree of forecasting utility. So perhaps we could model a group of related financial instruments and make predictions that we can translate into trading decisions?

So we’ll give that a try. But first, a brief overview of VAR models.

The univariate autoregression (AR) is a model of a time series as a function of past values of itself:

\(Y_t = \alpha + \beta_1 Y_{t-1}+ \beta_2 Y_{t-2} \)

That’s an AR(2) model because it uses two previous values in the time series \(Y\) to estimate the next value. The name of the game is figuring out how many previous values to use, and estimating the coefficients (the \(\beta\)s) and the intercept (\(\alpha\)).

A vector autoregression (VAR) is an extension of this idea. It models multiple time series that affect one another together, as a system. It specifically allows for bi-directional relationships such as feedback loops, where say an increase in variable \(X\) may predict an increase in variable \(Y\), but equally an increase in variable \(Y\) may predict an increase in variable \(X\).

Here’s a VAR(1) model of two time series, \(Y_1\) and \(Y_2\):

\(Y_{1,t} = \alpha_1 + \beta_{11,1} Y_{1,t-1} + \beta_{12, 1} Y_{2, t-1} + \epsilon_{1, t}\)

\(Y_{2,t} = \alpha_2 + \beta_{21,1} Y_{1,t-1} + \beta_{22, 1} Y_{2, t-1} + \epsilon_{2, t}\)

The model uses a single lag of each time series to predict the next values of both time series. It requires the estimation of four coefficients and two intercepts.
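To make the two equations concrete, here's an illustrative one-step computation of the VAR(1) system above, written in Python with made-up intercepts and coefficients (nothing here is estimated from data):

```python
# One step of the two-variable VAR(1): each series' next value is a linear
# combination of the previous values of BOTH series.
# a holds the intercepts alpha_1, alpha_2; B holds the beta_{jk,1} coefficients.
def var1_step(y1, y2, a, B):
    y1_next = a[0] + B[0][0] * y1 + B[0][1] * y2
    y2_next = a[1] + B[1][0] * y1 + B[1][1] * y2
    return y1_next, y2_next

a = [0.01, 0.02]              # illustrative intercepts
B = [[0.5, 0.2], [0.1, 0.4]]  # illustrative coefficient matrix

# Both forecasts depend on both lagged values -- the feedback structure:
print(var1_step(1.0, 2.0, a, B))
```

Note that the two equations share the same right-hand-side variables; only the coefficients differ, which is why the system is estimated equation by equation.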

Just looking at the single lag case, you get a sense that these models have lots of parameters. Which of course triggers all the usual alarm bells around overfitting. In fact, if we have \(N\) time series and \(p\) lags in a VAR model, we must estimate \(N + pN^2\) parameters!

Standard practice in econometrics is to use an information criterion. It’s questionable how useful that would be in modelling financial asset returns, and in my view it makes sense to stick with a single lag unless you have a compelling reason to do otherwise.

If you must, lean towards the Bayesian information criterion (BIC), which introduces a penalty term for the number of parameters in the model (the Akaike information criterion does too, but the BIC’s penalty is bigger).
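To see why the BIC penalty bites harder, compare the penalty terms only (ignoring the shared likelihood term): AIC charges \(2k\) for \(k\) parameters, while BIC charges \(k \ln n\) for \(n\) observations, so BIC penalises harder whenever \(n > e^2 \approx 7.4\). An illustrative comparison for our setting:

```python
import math

# Penalty terms of the two criteria (the -2*log-likelihood term is shared).
def aic_penalty(k):
    return 2 * k

def bic_penalty(k, n):
    return math.log(n) * k

k, n = 56, 250   # a VAR(1) on seven series, estimated on a 250-day window
print(aic_penalty(k))        # 112
print(round(bic_penalty(k, n)))  # a much stiffer penalty for the same model
```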

Say we have a basket of stocks that we believe to be related in some way. Also, suppose that their relationship implies a degree of predictability among basket constituents. If we could forecast the relative returns between basket constituents, we might have the makings of a trading strategy.

In this post, we’re going to focus on solving the forecasting problem using the VAR framework. In reality, the problem of universe selection (identifying a suitable basket) is a much bigger problem, but we’re going to assume this problem is solved for the sake of the exercise, and that we have a basket of stocks worthy of considering under our VAR framework.

In fact, this is something we’ve solved in our Machine Learning and Big Data Bootcamp. Go here to find out more and sign up to the wait list for the next edition of Bootcamp.

In this example, our group of stocks appeared in the network model of stock relationships that we built using the Graphical Lasso. This is only a single input into the universe selection model that we trade with, but it will do fine for demonstrating this VAR model. We’ll take the stocks in the little purple cluster consisting of residential construction stocks:

```r
# Extract basket prices
tickers <- c('KBH', 'LEN', 'PHM', 'DHI', 'TOL', 'MTH', 'MDC')
```

This group is a fairly arbitrary choice – the basket is small enough that we can explore VAR models efficiently, and beyond the fact that the Graphical Lasso identified relationships among its stocks, there’s nothing particularly special about it.

You can get historical prices and volumes for these tickers via `tidyquant::tq_get`, which wraps `quantmod::getSymbols`:

```r
### VAR MODEL STRATEGY ###
library(tidyquant)
library(tidyverse)

# Load basket prices
tickers <- c('KBH', 'LEN', 'PHM', 'DHI', 'TOL', 'MTH', 'MDC')
basket_prices <- tq_get(tickers, get='stock.prices', from='2000-01-01', to = '2020-01-01') %>%
  rename(ticker = symbol)

basket_prices %>%
  ggplot(aes(x = date, y = adjusted)) +
  geom_line(aes(color = ticker))
```

Plotting the price series:

We’re going to work with returns for our VAR model, so let’s make those now:

```r
# Calculate returns
ret <- basket_prices %>%
  group_by(ticker) %>%
  filter(date >= '2000-01-01', date < '2020-01-01') %>%
  tq_transmute(select=adjusted, mutate_fun=periodReturn, period="daily", type="log")

# make wide dataframe of returns
ret_wide <- ret %>%
  spread(key=ticker, value=daily.returns)
```

Now we’re ready to build a VAR model. The `vars` package will do all the heavy lifting for us. So let’s load that, and then construct a VAR(1) model on the first 250 days in our historical data set:

```r
# VAR model
library(vars)

wdw <- 250
var <- VAR(ret_wide[1:wdw, -1], p=1, type="const")
```

That’s it! Pretty amazing that we can do so much with so little code.

To get a sense of how much work has just been done under the hood, run `summary(var)` and you’ll get a whole lot of output about the model. You’ll see estimation statistics for each coefficient in each stock’s equation, including which variables R flagged as significant, as well as information on the residuals of each equation.

So now that we’ve got our model, how can we use it to forecast prices? How might we use that information to make trading decisions?

We can get the single-step-ahead prediction from the model by calling `predict`, which returns a list with a whole bunch of stuff in addition to the actual forecasts. To extract the forecasts into their own object, we need to do some typically funky R list manipulation:

```r
# step ahead predictions
p <- predict(var, n.ahead = 1)
fcsts <- t(do.call(rbind, lapply(p$fcst, "[", 1)))
```

The `fcsts` object holds a return forecast for each stock in our basket. It looks like this:

```r
> fcsts
              DHI        KBH         LEN         MDC       MTH        PHM          TOL
[1,] -0.007694804 0.01069015 0.008959145 0.008029125 0.0200517 0.01437472 -0.001169301
```

That’s cool. But how does it compare to the actual return values for the next period?

```r
# predictions vs actuals
ret_wide[wdw+1, -1] - fcsts

# OUTPUT:
#          DHI        KBH        LEN         MDC         MTH        PHM         TOL
# 1 0.04469033 0.03376181 0.04917655 0.007311673 -0.02175964 0.01846445 -0.00187468
```

Not amazing by the looks of that!

How did we go predicting the correct sign?

```r
# sign of predictions vs actuals
sign(ret_wide[wdw+1, -1]/fcsts)

# OUTPUT:
#   DHI KBH LEN MDC MTH PHM TOL
# 1  -1   1   1   1  -1   1   1
```

Here, a positive number indicates that the return sign was predicted correctly; a negative number indicates an incorrect prediction.

We’ve done a little better here. The model predicted five of the seven return signs correctly.
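The sign-of-ratio trick is worth spelling out: `actual/forecast` is positive exactly when the two agree in sign. An illustrative Python check, with actuals reconstructed from the forecasts and residuals printed above (rounded):

```python
# actual/forecast > 0 exactly when the forecast got the sign right.
forecasts = [-0.0077, 0.0107, 0.0090, 0.0080, 0.0201, 0.0144, -0.0012]
actuals   = [ 0.0370, 0.0445, 0.0581, 0.0153, -0.0017, 0.0328, -0.0030]

hits = [a / f > 0 for f, a in zip(forecasts, actuals)]
print(sum(hits), "of", len(hits), "signs predicted correctly")  # 5 of 7
```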

Let’s look at whether we were able to correctly predict the rank of the next period’s returns:

```r
# rank of predictions vs actuals
rank(t(ret_wide[wdw+1, -1]))
rank(fcsts)

# OUTPUT:
# > rank(t(ret_wide[wdw+1, -1]))
# 5 6 7 3 2 4 1
# > rank(fcsts)
# 1 5 4 3 7 6 2
```

Not terrible – close enough that if we traded these ranks long-short we’d have done OK on this prediction.

This is all well and good, but so far we’ve only looked at a single prediction. That tells us next to nothing about how useful this approach might be in a trading strategy. Time for some backtesting.

There are plenty of ways we could act on the predictions of our VAR model. The most familiar, given what we looked at above, would be to trade long-short on the basis of the predicted ranks of our forecast returns.

Here we’re going to do something slightly different that I read about in one of Ernie Chan’s books: we’ll dollar weight assets based on the normalised difference of each asset’s forecast return, and the mean return of the basket:

```r
# dollar weight assets based on normalised difference to forecast mean basket return
dollar_wgts <- (fcsts - mean(fcsts))/sum(abs(fcsts - mean(fcsts)))  # h.t. Ernie Chan
dollar_wgts

# OUTPUT:
#         DHI        KBH        LEN         MDC       MTH       PHM        TOL
#  -0.3177601 0.06405476 0.02810558 0.008791111 0.2584734 0.1405751 -0.1822399
```
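The weighting scheme has two useful properties by construction: the weights sum to zero (dollar-neutral) and their absolute values sum to one (unit gross exposure). An illustrative Python check, using the (rounded) forecasts printed earlier:

```python
# Dollar-weight by each asset's normalised distance from the mean forecast.
fcsts = [-0.0077, 0.0107, 0.0090, 0.0080, 0.0201, 0.0144, -0.0012]

m = sum(fcsts) / len(fcsts)
denom = sum(abs(f - m) for f in fcsts)
wgts = [(f - m) / denom for f in fcsts]

# Dollar-neutral, unit gross exposure:
print(sum(wgts))                     # ~0: longs and shorts offset
print(sum(abs(w) for w in wgts))     # ~1: total gross exposure normalised
print(wgts[0])                       # DHI gets the big short, as in the R output
```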

Then, we can calculate a portfolio return by multiplying the actual returns by the dollar weights and summing:

```r
port_ret <- sum(dollar_wgts * ret_wide[wdw+1, -1])
port_ret

# OUTPUT:
# -0.002409894
```

To backtest our VAR forecasts, we’ll set up a rolling window of 500 days on which to estimate the model’s parameters, making a single day-ahead prediction before rolling the window forward by a day and repeating the process, and storing the dollar weights of each asset in a list:

```r
# rolling VAR estimation and prediction
wdw <- 500
dollar_wgts <- vector('list', nrow(ret_wide)-wdw)

# don't need to get a forecast for the final data point for backtesting purposes
for(i in (wdw):(nrow(ret_wide)-1)) {
  var <- ret_wide %>%
    slice((i-wdw):(i-1)) %>%
    dplyr::select(-date) %>%
    VAR(p=1, type="const")

  p <- var %>% predict(n.ahead = 1)
  fcsts <- t(do.call(rbind, lapply(p$fcst, "[", 1)))
  dollar_wgts[[i]] <- (fcsts - mean(fcsts))/sum(abs(fcsts - mean(fcsts)))
}
```
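The window arithmetic is the part that's easiest to get wrong, so here's an illustrative, 0-indexed Python sketch of the indexing scheme (the row count `n` is hypothetical):

```python
# Rolling window: fit on rows [i-wdw, i-1], predict row i, then slide by one.
wdw = 500
n = 5031  # hypothetical number of daily rows in the sample

windows = [(i - wdw, i - 1, i) for i in range(wdw, n)]

print(windows[0])    # fit on the first 500 rows, predict row 500
print(windows[-1])   # the last prediction lands on the final row
```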

Then it’s a matter of extracting the weights from the list, calculating a time series of the cumulative portfolio return and plotting it:

```r
wgts <- bind_rows(lapply(dollar_wgts, as.data.frame))

# first wgt is from idx==wdw, assign return from wdw:(wdw+1)
port_ret <- data.frame(
  'date' = ret_wide[(wdw+1):(nrow(ret_wide)), 'date'],
  'ret' = rowSums(wgts * ret_wide[(wdw+1):(nrow(ret_wide)), -1])
)

port_ret <- port_ret %>%
  mutate('cum_ret' = cumsum(ret))

port_ret %>%
  ggplot(aes(x = date, y = cum_ret)) +
  geom_line() +
  labs(
    title = 'Cumulative Returns to VAR Trading Model',
    x = 'Date',
    y = 'Return'
  )
```

Here’s the output:

Costs aren’t considered on this equity curve, and they’d add up to plenty as we’re rebalancing a basket of seven assets on a daily basis. Also bear in mind that the in-sample period for discovering the basket via the Graphical Lasso was 2010-2019 inclusive.

Finally, I wanted to get an idea of the stability of performance with respect to the length of the estimation window. I simply looped over that backtest for different window lengths and stored the results in a list:

```r
# check stability of window length parameter
wdws <- c(250, 500, 750, 1000)
ports <- vector("list")

for (wdw in wdws) {
  dollar_wgts <- vector('list', nrow(ret_wide)-wdw)

  # don't need to get a forecast for the final data point for backtesting purposes
  for(i in (wdw):(nrow(ret_wide)-1)) {
    var <- ret_wide %>%
      slice((i-wdw):(i-1)) %>%
      dplyr::select(-date) %>%
      VAR(p=1, type="const")

    p <- var %>% predict(n.ahead = 1)
    fcsts <- t(do.call(rbind, lapply(p$fcst, "[", 1)))
    dollar_wgts[[i]] <- (fcsts - mean(fcsts))/sum(abs(fcsts - mean(fcsts)))  # h.t. Ernie Chan
  }

  wgts <- bind_rows(lapply(dollar_wgts, as.data.frame))

  # first wgt is from idx==wdw, assign return from wdw:(wdw+1)
  port_ret <- data.frame(
    'date' = ret_wide[(wdw+1):(nrow(ret_wide)), 'date'],
    'ret' = rowSums(wgts * ret_wide[(wdw+1):(nrow(ret_wide)), -1])
  )

  name <- paste0(wdw)
  port_ret <- port_ret %>% mutate(!!name := cumsum(ret))
  ports <- c(ports, list(port_ret))
}
```

With a bit of `tidy` R manipulation, we can plot the equity curves for the different window lengths:

```r
# plot returns by window length
ports %>%
  map(~ (.x %>% dplyr::select(-ret))) %>%  # drop ret column from each dataframe in list
  reduce(left_join, by="date") %>%
  pivot_longer(-date, names_to = "window", values_to = "return") %>%
  ggplot(aes(x = date, y = return)) +
  geom_line(aes(color = window)) +
  labs(
    title = 'Return to VAR models of different data windows',
    x = "Date",
    y = "Return"
  )
```

Here’s the output:

Thus concludes our whirlwind tour of VAR models and their potential use in trading strategies.

In my experience, the problem of *finding a basket of related assets* that can be exploited in convergence trades like this one is a bigger problem than coming up with the actual trading strategy itself.

And to be fair, there are simpler ways to go about the actual trading than using a VAR model.

However, the VAR model does demonstrate some utility here, and I think it could be used as one of several inputs into a larger ensemble of predictions, rather than a standalone trading strategy.

*What do you think? Is the VAR framework attractive to you as a trading tool? Tell me what you think in the comments.*


The post The Graphical Lasso and its Financial Applications appeared first on Robot Wealth.


Are you suggesting that Friedman and his titans of statistical learning somehow caused the GFC by publishing their Graphical Lasso algorithm?

Not at all. I’m just setting you up to demonstrate the fallacy of mistaking *correlation* with *causation* (thanks for playing along).

Seeing patterns where there are none is part of what it means to be human. Of course, Friedman and his gang of statisticians didn’t cause the GFC. But they did help us deal with our psychological flaws by providing us with a powerful tool for detecting spurious correlations.

Their tool allows one to figure out if variables are correlated with one another *directly*, or whether any measured connection is merely due to a *common connection to something else*.

Consider the two stocks ABB, a multinational industrial automation company, and PUK, a multinational life insurance company. Over the period 2015 to 2018, these companies’ returns had a correlation of around 0.6 – which suggests a significant relationship.

Now consider the classical pairs trade, where we bet on the convergence of related financial instruments following a dislocation in their prices. How would you feel about trading ABB and PUK as a pair? Would you be willing to bet that if they diverged, they’d come back together thanks to their significant correlation over the sample period?

Of course, the answer is *no*. You intuitively know that this isn’t a sensible bet. You know that these stocks aren’t moving together because they’re related to one another, but because they’re huge firms *with similar beta to the broader market*.

Said differently, even though they’re correlated, they don’t *explain* one another’s returns. A third variable, the S&P500 (to which they both have a strong relationship), is the main driver of their correlation.

ABB and PUK are therefore *conditionally independent given the S&P500*.

*Conditionally independent *means that there is no direct relationship between ABB and PUK when you account for the effects of other variables.

This is a common feature of correlation estimates from stock returns data. Such estimates are often misleading due to spurious correlations and the existence of confounding variables related to both returns series.

Given a series of stock returns:

- The Graphical Lasso can be used to estimate an inverse covariance matrix.
- The elements of the inverse covariance matrix are proportional to the partial correlation coefficients between a pair of stocks.
- The partial correlation of two variables is a measure of their relationship given all the other variables in the data set.

**That is, the Graphical Lasso can help us remove effects such as market beta and recover real, direct relationships between stocks.**

To be fair, you don’t really need a fancy machine-learning algorithm to tell you that industrial automation companies aren’t directly related to life insurance providers. Intuition and common sense will do.

However, Graphical Lasso can still be useful:

- It can provide *validation* of relationships determined through other means (including intuition)
- It can highlight *hidden relationships* that don’t necessarily surface through other means
- It can *quantify* the strength of a relationship, since the magnitude of the partial correlation is informative
- It can help us construct visualisation tools, such as the interactive network graph we’ll build shortly, which help us reason about and understand a large universe of stocks

Further, it makes intuitive sense that in a large universe, most stocks would be conditionally independent. Therefore, we’d favour an inverse covariance matrix that highlights strong relationships and zeroes out correlations that depend on a third variable. The Graphical Lasso algorithm allows us to refine this sparsity condition by tuning its only parameter. More on this shortly.

There are some great resources that explore in excruciating detail the math behind the Graphical Lasso and the inverse covariance matrix.

There’s little point repeating that material here, but I do think there’s value in clarifying something that tripped me up. When I first used this tool, I assumed that the terms in the inverse covariance matrix were equivalent to the partial correlation between the two corresponding variables.

*This is wrong.*

The inverse covariance matrix is *proportional* to the partial correlation. There’s another step needed to transform an element in the inverse covariance matrix to the corresponding partial correlation. But perhaps surprisingly, information about how to do this transformation is in somewhat short supply.

After much searching, I found the details buried at about the eight-minute mark of the twenty-second (!) lecture of Ryan Tibshirani’s Statistical Machine Learning course taught at Carnegie Mellon in the Spring of 2017.

*(Fun fact: Ryan is the son of Robert Tibshirani, one of the authors of the original graphical lasso paper. Imagine the dinner conversations in that household.)*

On the off chance that you don’t have time to rip through a post-graduate course in statistical learning, here’s the critical information:

If \(R\) is a matrix of partial correlations and \(\Omega\) is the corresponding inverse covariance matrix, then the \(j^{th}, k^{th}\) element of \(R\) is given by:

\[R_{j,k} = \frac{-\Omega_{j,k}}{\sqrt{\Omega_{j,j}\Omega_{k,k}}}\]

What’s super interesting about this relationship is that the partial correlation is proportional to the *negative* of the corresponding element of the inverse covariance matrix (the numerator in the equation above). Thus, if one were to assume that the elements of the inverse covariance matrix directly corresponded to the partial correlations, one would end up with *anti-correlation* where there was a positive correlation, and vice versa. Quite the error!
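The transform (and the sign flip) can be demonstrated in a few lines. This is an illustrative Python sketch on a toy 2x2 precision matrix with made-up values; the formula applies off-diagonal, and the diagonal is set to 1 by convention here:

```python
import math

# R_{jk} = -Omega_{jk} / sqrt(Omega_{jj} * Omega_{kk}) for j != k.
def partial_corr(omega):
    n = len(omega)
    return [[1.0 if j == k
             else -omega[j][k] / math.sqrt(omega[j][j] * omega[k][k])
             for k in range(n)] for j in range(n)]

# Toy precision matrix: the off-diagonal entry is NEGATIVE...
omega = [[2.0, -1.0], [-1.0, 2.0]]
R = partial_corr(omega)
print(R[0][1])   # ...but the partial correlation comes out POSITIVE: 0.5
```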

Now for the fun part.

We’re going to take a universe of US equities and apply the Graphical Lasso algorithm to estimate an inverse covariance matrix. Then, we’ll apply the transform given by the equation above to construct a sparse matrix of partial correlations.

We can think of this sparse matrix as representing a *network* with *edges* (connections) between *nodes* (stocks) that have some sort of relationship, independent of any of the other variables.

Thinking of our matrix in this way leads us to the concept of a network graph which we can use as a visual tool to aid our understanding of and ability to reason about a large universe of stocks.

Our data consists of daily returns for the top roughly 1,100 US stocks by market cap between 2010 and 2019. Each returns series is standardised to have zero mean and unit variance.

Firstly, we group stocks into clusters based on loadings to statistical factors obtained from Principal Components Analysis (PCA) using the DBSCAN clustering algorithm. In our graph, we will colour stocks according to their cluster. All going well, we should see more connections between stocks within the same cluster.

We’ll gloss over the code for performing the clustering operations here – the subject of another blog post perhaps.

Next, we calculate a covariance matrix of stock returns.

Since I can’t share our stock price database, you’ll find the covariance matrix and the output of the clustering algorithm linked below (in exchange for the princely sum of your email address).

I’ll provide the code for you to reproduce the analysis from this point. We’ll use the `glasso` package, which implements the Graphical Lasso algorithm, the `igraph` package, which contains tools for building network graphs, and the `threejs` and `htmlwidgets` packages for creating interactive plots.

The first thing we need to do is load these and a few other packages and the data:

```r
# install and load required packages
required.packages <- c('glasso', 'colorRamps', 'igraph', 'RColorBrewer', 'threejs', 'htmlwidgets')
new.packages <- required.packages[!(required.packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages, repos='http://cran.us.r-project.org')

library(glasso)
library(colorRamps)
library(igraph)
library(RColorBrewer)
library(threejs)
library(htmlwidgets)

# load data
load("./clusters_covmat.RData")
```

This will load the covariance matrix into the variable `S` and a dataframe of tickers and their corresponding clusters into the variable `cl`.

Then, to apply the Graphical Lasso, we choose a value for `rho`, the regularisation parameter that controls the degree of sparsity in the resulting inverse covariance matrix. Higher values lead to greater sparsity.

In our application, there is no “correct” value of `rho`, but it can be tuned for your use case. For instance, if you wanted to isolate the strongest relationships in your data, you would choose a higher value of `rho`. If you were interested in preserving more tenuous connections, perhaps identifying stocks with connections to multiple groups, you’d choose a lower value of `rho`. Finding a sensible value requires experimentation.

It’s also not a bad idea to check for symmetry in the resulting inverse covariance matrix. Asymmetry can arise from numerical computation and rounding errors, which can cause problems later depending on what you want to do with the matrix.

```r
# estimate precision matrix using glasso
rho <- 0.75
invcov <- glasso(S, rho=rho)
P <- invcov$wi
colnames(P) <- colnames(S)
rownames(P) <- rownames(S)

# check symmetry
if(!isSymmetric(P)) {
  P[lower.tri(P)] = t(P)[lower.tri(P)]
}
```
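The `lower.tri` assignment is just "copy the upper triangle over the lower one". An illustrative Python version of the same repair, for a matrix with a tiny numerical asymmetry:

```python
# Mirror the upper triangle into the lower triangle, making the matrix
# exactly symmetric (cf. R's  P[lower.tri(P)] = t(P)[lower.tri(P)]).
def symmetrize(m):
    n = len(m)
    for j in range(1, n):
        for k in range(j):
            m[j][k] = m[k][j]
    return m

m = [[1.0, 0.30], [0.299999, 1.0]]   # small rounding-induced asymmetry
print(symmetrize(m))                  # lower triangle now matches the upper
```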

Next, we calculate the partial correlation matrix and set the terms on the diagonal to zero – this prevents stocks having connections with *themselves* in the network graph we’ll be shortly constructing:

```r
# calculate partial correlation matrix
parr.corr <- matrix(nrow=nrow(P), ncol=ncol(P))

for(k in 1:nrow(parr.corr)) {
  for(j in 1:ncol(parr.corr)) {
    parr.corr[j, k] <- -P[j,k]/sqrt(P[j,j]*P[k,k])
  }
}

colnames(parr.corr) <- colnames(P)
rownames(parr.corr) <- colnames(P)
diag(parr.corr) <- 0
```

Now if you run `View(parr.corr)` in RStudio, you’ll see a very sparse partial correlation matrix. In fact, only about 6,000 of its 1.35 million elements are non-zero! The non-zero elements represent a connection between two stocks, with the strength of the connection determined by the magnitude of the partial correlation. Here’s a snapshot that gives you an idea of the level of sparsity:

The partial correlation matrix can be used to build a network graph, where stocks are represented as nodes and non-zero elements are represented as edges between two stocks.

The `igraph` package has some fantastic tools for building, manipulating and displaying graphs. We’ll only use a fraction of the package’s features here, but if you’re interested in getting to know it, check out Katya Ognyanova’s tutorial (it’s really excellent and got me up and running with `igraph` in a matter of hours).

This next block of code constructs the network graph, assigns a colour to each node according to its cluster and drops any node with no connections.

```r
# build network graph
stock_graph <- graph_from_adjacency_matrix(parr.corr, mode="undirected", weighted=TRUE)

# color by cluster
V(stock_graph)$cluster <- as.numeric(cl$cluster)
num_clusters <- length(unique(cl$cluster))
cols <- colorRamps::primary.colors(n=num_clusters+1)
cols <- cols[2:length(cols)]  # hack to replace black colour with something else
V(stock_graph)$color <- cols[V(stock_graph)$cluster+1]

# drop vertices with no edges
isolated <- which(degree(stock_graph) == 0)
stock_graph <- delete.vertices(stock_graph, isolated)
```

And finally, we can construct, save and view (in a browser) an interactive network graph:

```r
# make interactive graph
stock_graph_js <- graphjs(
  g=stock_graph,
  layout_with_fr(stock_graph, weights=30*E(stock_graph)$width, dim=3),  # can choose other layout algorithms; `?layout` for a list
  # vertex.shape = names(V(ig_wt)),  # plot nodes as tickers rather than circles
  vertex.size=0.7,
  vertex.frame.color="white",
  vertex.frame.width=0.2,
  vertex.label=names(V(stock_graph)),  # label nodes with tickers
  brush=TRUE,        # enable highlighting clicked nodes and their connections
  showLabels=TRUE,   # show node labels on hover
  edge.alpha=0.6,    # edge opacity - lower helps when there are dense connections
  bg="black",        # background colour
  main="Network graph from Graphical Lasso"
)

# save graph
graph_filename <- paste0("./network_graph_rho_", rho, ".html")
saveWidget(stock_graph_js, file=graph_filename)

# open in browser
browseURL(graph_filename)
```

Here’s the resulting network graph. You can interact with it by:

- Clicking and dragging to rotate the graph
- Scrolling your mouse wheel to zoom in and out
- Hovering on a node to see the name of the stock
- Clicking on a node to highlight its connections

Pretty cool hey?

What insights do you get from exploring the graph? Most obviously we see:

- A large lime green cluster with strong intra-cluster partial correlations corresponding to banks, asset managers and insurance companies.
- Another darker green cluster with strong intra-cluster partial correlations corresponding to REITs.
- An orange cluster with strong intra-cluster partial correlations corresponding to utilities.
- Stocks coloured red were unclassified by our clustering process. But where a connection exists, it seems to make sense. For instance, the Graphical Lasso identified a connection between WYNN and LVS, which both operate resorts in Las Vegas.
- The smaller purple cluster consists of residential construction companies.
- There’s another darker purple cluster, but its intra-cluster connections are weaker, resulting in small, dispersed groups of stocks from this cluster. It consists of oil and gas companies.
- The small aqua coloured cluster corresponds to Canadian banks.
- Finally, the small peach coloured cluster corresponds to global banks listed as ADRs in the US.

This is all very interesting. But isn’t it telling us what we already know?

Yes and no.

The results all make intuitive sense. You don’t need a fancy algorithm to tell you that the casinos of Las Vegas are exposed to similar risk factors.

But what about stuff that’s been filtered from our universe of stocks? There are more residential construction companies than exist in that little purple cluster for instance. Is what’s been filtered valuable?

In our recent Machine Learning and Big Data Bootcamp, we built an equity pairs trading universe selection model that whittles down a list of several million potential pairs to around twenty to trade in the next period. One of the inputs to that model was a sparse partial correlation matrix estimated using Graphical Lasso.

We found that in fourteen of seventeen years, pairs with a non-zero partial correlation outperformed the wider universe of potential pairs in the next period:

This plot shows the unleveraged annualised returns to a simple pairs trading algorithm for pairs whose constituent stocks had a non-zero partial correlation (pinkish-red bars) versus returns to pairs in the wider universe (greenish-blue bars). For each year, the partial correlation matrix was estimated on the prior three years’ returns.

This is just one input to a bigger model, but clearly it’s a useful one. What’s more, we found that this added value beyond the obvious things such as pairing stocks in the same industry group.

We can also make a less sparse graph in order to explore inter-cluster relationships. In this case, we’ll use a smaller value of `rho` (0.65), and we’ll drop any nodes that have zero or only one connection:

This is also interesting. We can see here certain stocks bridging the gaps between clusters. For instance, we have BHP (an Australian natural resources company listed as an ADR in the US) connected to other mining stocks, oil and gas companies and industrial manufacturers, such as CAT, which produces a lot of heavy equipment used in BHP’s vast operations.

We also see numerous connections between financial services companies and firms in other sectors, including basic materials, tech and industrials. Perhaps this is indicative of the central role of financial services in the modern economy.

Modern statistical learning techniques continue to transform the way in which we interact with data and the insights we can tease out. I find it quite astounding that the Graphical Lasso algorithm is able to take us from a noisy covariance matrix subject to all sorts of estimation errors, to a sparse matrix of partial correlations – the relationships between variables that remain when all the correlations with the other confounding variables have been stripped out.

One application is the universe selection problem in the context of equities pairs trading. Can you think of other use cases for the Graphical Lasso? I would love to hear them in the comments.


The post Kalman Filter Pairs Trading with Zorro and R appeared first on Robot Wealth.

In the first three posts of this series, we:

- Implemented a Kalman filter in R
- Implemented a simple pairs trading algorithm in Zorro
- Connected Zorro and R and exchanged data between the two platforms

In this fourth and final post, we’re going to put it all together and develop a pairs trading script that uses Zorro for all the simulation aspects (data handling, position tracking, performance reporting and the like) and our Kalman implementation for updating the hedge ratio in real-time.

You can download the exact script used in this post for free down at the very bottom. Let’s go!

Encapsulating our Kalman routine in a function makes it easy to call from our Zorro script – it reduces the call to a single line of code.

Save the following R script, which implements the iterative Kalman operations using data sent from Zorro, in your Zorro strategy folder:

```r
###### KALMAN FILTER #######

delta <- 0.0001
Vw <- delta/(1-delta)*diag(2)
Ve <- 0.01

R <- matrix(rep(0, 4), nrow=2)
P <- matrix(rep(0, 4), nrow=2)

kalman_iterator <- function(y, x, beta) {
  beta <- matrix(c(beta, 0), nrow=1)
  x <- matrix(c(x, 1), nrow=1)

  R <<- P + Vw  # state cov prediction
  y_est <- x[1, ] %*% beta[1, ]  # measurement prediction
  Q <- x[1, ] %*% R %*% x[1, ] + Ve  # measurement variance prediction

  # error between observation of y and prediction
  e <- y - y_est
  K <- R %*% t(x) / drop(Q)  # Kalman gain

  # state update
  beta <- beta[1, ] + K * e[1, ]
  P <<- R - K %*% x[1, ] %*% R

  return(list(beta[1], e, Q))
}
```

Recall that this implementation of the Kalman filter is *almost* parameterless. There are, however, two parameters that affect the speed at which the Kalman algorithm updates the hedge ratio: `delta` in line 3 and `Ve` in line 5.

You can experiment with these parameters, but note that changes here will generally require changes in the Zorro script, such as the spacing between trade levels (more on this below).

Experimentation is a good thing (it’s useful to understand how these parameters impact the algorithm), but a nice, stable pair trade should be relatively robust to changes in these parameters. A pair that depends on just the right values of these parameters is one I’d think twice about trading.

Having said that, a sensible use of these parameters is to adjust the trade frequency of your pairs in line with transaction costs and risk management approach (not to optimise the strategy’s backtested performance).
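To get a feel for these parameters outside of Zorro and R, here's a minimal Python/NumPy sketch of the same update equations. This is a hypothetical translation of the `kalman_iterator` function above (not the code used in the backtest), run on simulated prices with a known hedge ratio of 0.5:

```python
import numpy as np

# Same tuning parameters as the R script: delta controls how quickly the
# state (hedge ratio and intercept) is allowed to drift; Ve is the
# assumed measurement noise variance.
delta = 0.0001
Vw = delta / (1 - delta) * np.eye(2)
Ve = 0.01

def kalman_iterate(y, x, beta, P):
    """One Kalman step: predict, then update the state [hedge_ratio, intercept]."""
    F = np.array([x, 1.0])       # observation vector: price of X plus intercept term
    R = P + Vw                   # state covariance prediction
    y_est = F @ beta             # measurement prediction
    Q = F @ R @ F + Ve           # measurement variance prediction
    e = y - y_est                # prediction error
    K = R @ F / Q                # Kalman gain
    beta = beta + K * e          # state update
    P = R - np.outer(K, F) @ R   # state covariance update
    return beta, P, e, Q

# Run the filter over a toy series where y ~ 0.5 * x plus noise
beta = np.zeros(2)
P = np.zeros((2, 2))
rng = np.random.default_rng(0)
for t in range(500):
    x = 20 + rng.normal(0, 2)
    y = 0.5 * x + rng.normal(0, 0.1)
    beta, P, e, Q = kalman_iterate(y, x, beta, P)

print(f"estimated hedge ratio: {beta[0]:.3f}")  # settles near the true 0.5
```

Increasing `delta` (larger `Vw`) or decreasing `Ve` makes the estimated hedge ratio more responsive to each new observation, which is exactly the trade-off discussed above.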

Here’s our simple pairs trading script modified to call the Kalman iterator function to update the hedge ratio. To experiment with this Zorro script you’ll need:

- an Alpha Vantage API key (we load price history directly from Alpha Vantage)
- to set up trading conditions in a Zorro assets list (although if you don’t want to model costs, you don’t need to do this)

```c
/*
KALMAN PAIRS TRADING
*/

#include <r.h>

#define Asset1 "GDX"
#define Asset2 "GLD"
#define MaxTrades 1
#define Spacing 1
// #define COSTS

int Portfolio_Units = 1000; // units of the portfolio to buy/sell (more --> better fidelity to hedge ratio)

var calculate_spread(var hedge_ratio)
{
    var spread = 0;

    asset(Asset1);
#ifndef COSTS
    Spread = Commission = Slippage = 0;
#endif
    spread += priceClose();

    asset(Asset2);
#ifndef COSTS
    Spread = Commission = Slippage = 0;
#endif
    spread -= hedge_ratio*priceClose();

    return spread;
}

function run()
{
    set(PLOTNOW);
    setf(PlotMode, PL_FINE);
    StartDate = 20060525;
    EndDate = 2019;
    BarPeriod = 1440;
    LookBack = 1;
    MaxLong = MaxShort = MaxTrades;

    // ---------------------------------------
    // Startup and data loading
    // ---------------------------------------
    if(is(INITRUN))
    {
        // start R and source the kalman iterator function
        if(!Rstart("kalman.R", 2))
        {
            print("Error - can't start R session!");
            quit();
        }

        // load data from Alpha Vantage
        string Name;
        int n = 0;
        while(Name = loop(Asset1, Asset2))
        {
            assetHistory(Name, FROM_AV);
            n++;
        }
    }

    // ---------------------------------------
    // calculate hedge ratio and trade levels
    // ---------------------------------------
    asset(Asset1);
#ifndef COSTS
    Spread = Commission = Slippage = 0;
#endif
    vars prices1 = series(priceClose());

    asset(Asset2);
#ifndef COSTS
    Spread = Commission = Slippage = 0;
#endif
    vars prices2 = series(priceClose());

    static var beta;
    if(is(INITRUN)) beta = 0;

    // use kalman iterator to calculate parameters
    Rset("y", prices1[0]);
    Rset("x", prices2[0]);
    Rset("beta", beta);
    Rx("kalman <- kalman_iterator(y, x, beta)");
    beta = Rd("kalman[[1]][1]");
    vars e = series(Rd("kalman[[2]]"));
    var Q = Rd("kalman[[3]]");

    // set up trade levels
    var Levels[MaxTrades];
    int i;
    for(i=0; i<MaxTrades; i++)
    {
        Levels[i] = (i+1)*Spacing*sqrt(Q);
    }

    // ---------------------------------------
    // trade logic
    // ---------------------------------------

    // enter positions at defined levels
    for(i=0; i<MaxTrades; i++)
    {
        if(crossUnder(e, -Levels[i]))
        {
            asset(Asset1);
            Lots = Portfolio_Units;
            enterLong();
            asset(Asset2);
            Lots = Portfolio_Units * beta;
            enterShort();
        }
        if(crossOver(e, Levels[i]))
        {
            asset(Asset1);
            Lots = Portfolio_Units;
            enterShort();
            asset(Asset2);
            Lots = Portfolio_Units * beta;
            enterLong();
        }
    }

    // exit positions at defined levels
    for(i=1; i<MaxTrades-1; i++)
    {
        if(crossOver(e, -Levels[i]))
        {
            asset(Asset1);
            exitLong(0, 0, Portfolio_Units);
            asset(Asset2);
            exitShort(0, 0, Portfolio_Units * beta);
        }
        if(crossUnder(e, Levels[i]))
        {
            asset(Asset1);
            exitShort(0, 0, Portfolio_Units);
            asset(Asset2);
            exitLong(0, 0, Portfolio_Units * beta);
        }
    }

    // ---------------------------------------
    // plots
    // ---------------------------------------
    plot("beta", beta, NEW, PURPLE);
    if(abs(e[0]) < 20)
    {
        plot("error", e, NEW, BLUE);
        int i;
        for(i=0; i<MaxTrades; i++)
        {
            plot(strf("#level_%d", i), Levels[i], 0, BLACK);
            plot(strf("#neglevel_%d", i), -Levels[i], 0, BLACK);
        }
    }
}
```

Like our original vectorised backtest, this strategy is always in the market: it enters a long position when the Kalman filter’s prediction error drops below its minus one standard deviation level, holds until the error crosses above the plus one standard deviation level, then reverses into a short position.

This is not the optimal way to trade a spread, so we’ve left the door open to trade at multiple levels (line 9) with a user-specified spacing between levels (line 10).

*But before we get to that, there’s an important box we need to tick…*

Before we go further, we’ll aim to reproduce the results we got in the vectorised backtest we wrote in R way back in the first post of this series. That way, we can validate that our Zorro setup is working as expected.

This is an important (and easily overlooked) step because we’ll surely tinker with the strategy implementation (Zorro is *really* useful for efficiently doing that sort of experimentation), and we need to have confidence in our setup before we make any changes, do further research, and make decisions based on what we find.

*If you’ve ever had to rewind a whole bunch of research because of a faulty implementation at the outset, you know what I’m talking about….*

We’d expect some differences since Zorro provides an event-driven sequential backtester with very different assumptions to my hacky vectorised backtest. But we should see consistency in the hedge ratio, the positions taken, and the shape of the equity curve.

Here’s the Zorro output when we trade at one standard deviation of the prediction errors:

The hedge ratio, prediction errors, positions and equity curve shape all look very similar to the original vectorised R version.

We also ran a more aggressive version through our vectorised backtester, which traded at half a standard deviation of the prediction errors. Here’s what that looks like in Zorro (simply change line 10 to `#define Spacing 0.5`):

Again, virtually identical to the output of our vectorised backtest.

*I’m calling that a win. Time to move on to some fun stuff.*

There are a bunch of things we can try with our pairs trading implementation. A few of them include:

- Exiting positions when the prediction error crosses zero
- Limiting the hold time of individual positions (that is, closing out early if the spread hasn’t converged fast enough)
- Entering at multiple levels
- Using more or less aggressive entry level spacing

Here’s an example of trading quite aggressively every 0.25 standard deviations of the prediction error, up to a maximum of eight levels:

Of course, when you trade like this you’re going to pay a ton in fees. But it gives you a taste of the sorts of things you can experiment with using this framework.

This concludes our mini-series on pairs trading with Zorro and R via the Kalman filter. We saw how you might:

- Implement the Kalman filter in R
- Implement a pairs trading algorithm in Zorro
- Make Zorro and R talk to one another
- Put it all together in an integrated pairs trading strategy

We’d love to know what you thought of the series in the comments. In particular, can you suggest any pairs you’d like to see us test? Can you suggest any improvements to the pairs trading algorithm itself? Are there any other approaches you’d like us to implement or test?

Thanks for reading!

The post Kalman Filter Pairs Trading with Zorro and R appeared first on Robot Wealth.


The goal is to get the best of both worlds and use our dynamic hedge ratio within the Zorro script.

Rather than implement the Kalman filter in Lite-C, it’s much easier to make use of Zorro’s R bridge, which facilitates easy communication between the two applications. In this post, we’ll provide a walk-through of configuring Zorro and R to exchange data with one another.

While Zorro and R are useful as standalone tools, they have different strengths and weaknesses.

Zorro was built to simulate trading strategies, and it does this very well. It’s fast and accurate. It lets you focus on your strategies by handling the nuts and bolts of simulation behind the scenes. It implements various tools of interest to traders, such as portfolio optimization and walk-forward analysis, and was designed to prevent common bugs, like lookahead bias.

Zorro does a lot, **but it can’t do everything.**

An overlooked aspect of the software is its ability to integrate R and its thousands of add-on libraries. From machine learning and artificial intelligence to financial modeling, optimization, and graphics, R packages have been developed to cover all these fields and more. And since R is widely used in academia, when a researcher develops a new algorithm or tool it is often implemented as an open source R package long before it appears in commercial or other open-source software.

Zorro’s R bridge unlocks these tools for your trading applications and combines them with Zorro’s fast and accurate market simulation features.

In this post, I’ll show you how to set up and use Zorro’s R bridge. Once that’s out of the way, we’ll be in a position to put all the pieces together and run a simulation of our pairs trade that uses the Kalman filter we wrote for R.

Zorro’s R bridge is designed to enable a Zorro script to control and communicate with an R environment running on the same machine. The assumption is that the user will want to send market data (sometimes lots of it) from Zorro to R for processing, and then return the output of that processing, usually consisting of just one or a small number of results, back to Zorro.

Lite-C is generally much faster than R code, so it’s preferable to perform as much computation on the Zorro side as possible, reserving R for computations that are difficult or inconvenient to implement in Zorro. Certainly, you’ll want to avoid doing any looping in R. Having said that, vector and matrix operations are no problem for R, and might even run quicker than in Lite-C.

Zorro orders time series data differently to most platforms – newest elements *first*. R’s functions generally expect time series with newest elements *last*. Fortunately, Zorro implements the `rev` function for reversing the order of a time series, which we’ll need to use prior to sending data across to R. I’ll show you an example of how this works.

Finally, debugging R bridge functions requires a little care. For example, executing an R statement with a syntax error from Zorro will cause the R session to fall over and subsequent commands to also fail – sometimes silently. For basic debugging, you can return R output to Zorro’s GUI or use a debugging tool, as well as use an R bridge function to check that the R session is still “alive” (more on these below). But it always pays to execute R commands in the R console before setting them loose from a Lite-C script.

Assuming you have Zorro installed, here’s a walk-through of configuring Zorro and R to talk to one another.

Go to http://cran.r-project.org and install R.

Open *Zorro/Zorro.ini* (or *Zorro/ZorroFix.ini* if using the persistent version of the configuration file) and enter the path to *RTerm.exe* for the `RTermPath` variable. This tells Zorro how to start an R session.

Here’s an example of the location of *RTerm.exe*:

And how the `RTermPath` setting in *Zorro.ini* might look:

Of course, the path to *RTerm.exe* will be specific to your machine.

In *Zorro/Strategy*, you’ll find a script named *RTest.c*. Open a Zorro instance, select this script, and press *Test*. If R is installed correctly and your *Zorro.ini* settings are correct, you should get output that looks like this:

If that worked as expected, then you’re ready to incorporate R functionality in your Zorro scripts. If the test script failed, most likely the path specified in *Zorro.ini* is incorrect.

*Next, we’ll run through a brief tutorial with examples on using the R bridge functions. *

**Add the `r.h` header file to a Zorro script**

To use the R bridge in your script, you need to include the `r.h` header file. Simply add this line at the beginning of your Zorro script:

#include <r.h>

In order to use the other R bridge functions, run `Rstart()` in the Zorro script’s `INITRUN`. Here’s the function’s general form:

Rstart(string source, int debuglevel)

Both parameters are optional.

`source` is an R file that is sourced at start up, and loads any predefined functions, variables or data that your R session will use.

We can also specify the optional `debuglevel` argument, which takes an integer value of either 0, 1, or 2 (0 by default), defining the verbosity of any R output, such as errors and warnings:

- **0:** output fatal errors only
- **1:** output fatal errors, warnings and notes
- **2:** output every R message (this is like the output you see in the R console)

You can use Microsoft’s Debug View tool to see the output of the R session. There’s a more convenient way to display the output of the R session directly in the Zorro GUI too – more on this shortly.

`Rstart()` returns zero if the R session could not be started, and non-zero otherwise. We can therefore use `Rstart()` to check that the R session started.

This next script attempts to start a new R session via `Rstart()`, but raises the alarm and quits if unsuccessful:

```c
#include <r.h>

function run()
{
    if(!Rstart("", 2))
    {
        print("Error - could not start R session!");
        quit();
    }
}
```

`Rrun()` checks the status of the R session and returns 0 if the session has been terminated or has failed, 1 if the session is ready for input, and 2 if the session is busy with a computation or operation. Use it regularly!

The R session will terminate upon encountering any fatal error (which can be caused by a syntax error, unexpected data, and other issues that arise in real time). But here’s the thing: *if the R session is terminated, the R bridge simply stops sending messages and silently ignores further commands.*

That means that your script will only throw an error if some Lite-C computation depends on data that wasn’t received back from the R bridge.

It’s a bad idea to assume that this will be picked up, so use `Rrun()` to check the status of your R connection – typically you’ll want to do this at the end of every bar in a backtest, and possibly prior to critical computations, raising an appropriate error when a failure is detected.

The script below builds on the previous example to also include a call to `Rrun()` every bar:

```c
#include <r.h>

function run()
{
    if(!Rstart("", 2))
    {
        print("Error - could not start R session!");
        quit();
    }

    if(!Rrun())
    {
        print("Error - R session has been terminated!");
        quit();
    }
}
```

`Rx(string code, int mode)` is a powerful function – it enables the execution of a line of R code directly from a Lite-C script. We simply provide the R code as a string (the `code` argument, which can be up to 1,000 characters in length). Optionally, we can provide `mode`, which specifies how `Rx()` passes control back to Zorro during execution of `code` in R.

Normally, the Zorro GUI is unresponsive while the R bridge is busy; `mode` can modify this behaviour. It takes the following values:

- **0:** Execute code synchronously (that is, freeze Zorro until the computation is finished). This is the default behaviour.
- **1:** Execute code asynchronously, returning immediately and continuing to execute the Lite-C script. Since the R bridge can only handle one request at a time, you’ll need to use `Rrun()` to determine when the next command can be sent to the R session. This is useful when you want to run R and Zorro computations in parallel.
- **2:** Execute code asynchronously, enabling the user to access the Zorro GUI buttons, and returning 1 when `code` has finished executing and 0 when an error is encountered or the **[Stop]** button on the Zorro GUI is pressed. This is useful when your R computations take a long time, and you think you might want to interrupt them with the **[Stop]** button.
- **3:** Execute code asynchronously, like `mode = 2`, but also printing R output to Zorro’s message window. The verbosity of this output is controlled by the `debuglevel` argument to `Rstart()`; in order to output everything (that is, mimic the output of the R console), set `debuglevel` to 2. This is a convenient alternative to using the Debug Tool mentioned above.

Here’s a script that runs two lines of R code: one line that generates a vector of random normal numbers and calculates its mean; and another that prints the mean, returning the value to the Zorro GUI.

```c
#include <r.h>

function run()
{
    if(!Rstart("", 2)) // enable verbose output
    {
        print("Error - could not start R session!");
        quit();
    }

    Rx("x <- mean(rnorm(100, 0, 1))", 0); // default mode: wait until R code completes
    Rx("print(x)", 3); // execute asynchronously, print output to debug view and Zorro GUI window

    if(!Rrun())
    {
        print("Error - R session has been terminated!");
        quit();
    }
}
```

Here’s the output in the Zorro GUI:

You can see that with every iteration of the `run` function, Zorro tells R to generate a new vector of random numbers – hence the changing mean.

To send data from Zorro to R, use `Rset(string name, data_type, data_length)`. On the R side, the data will be stored in a variable named `name`.

The actual usage of `Rset()` depends on what type of data is being sent from Zorro: a single int, a single float, or an array (or series) of float type variables. The latter can be sent to R as either a vector or a two-dimensional matrix.

When sending a single int or float to R, we simply specify the name of that variable.

For sending arrays, we need to specify a pointer to the array and either the number of elements (for sending the array to R as a vector) or the number of rows and columns (for sending the array to R as a matrix).

Specifying a pointer is not as scary as it sounds; in Lite-C we can simply use the name of the array or series, as these are by definition pointers to the actual variables.

Here are some examples of sending the different data types from Zorro to R:

```c
#include <r.h>

function run()
{
    if(!Rstart("", 2)) // enable verbose output
    {
        print("Error - could not start R session!");
        quit();
    }

    // make some variables
    int today = dow();
    var last_return = 0.003;
    var my_params[5] = {2.5, 3.0, 3.5, 4.0, 4.5};

    // send those variables to R
    Rset("my_day", today);
    Rset("last_ret", last_return);
    Rset("params", my_params, 5); // specify number of elements

    // operate on those variables in the R session
    Rx("if(my_day == 1) x <- last_ret * params[1] else x <- 0", 0); // note params[1] is my_params[0] due to R's 1-based indexing and C's 0-based
    Rx("print(x)", 3);

    if(!Rrun())
    {
        print("Error - R session has been terminated!");
        quit();
    }
}
```

In lines 11-14, we create some arbitrary variables named `today` (an int), `last_return` (a float) and `my_params` (an array of float). In lines 16-19, we send those variables to the R session, assigning them to R objects named `my_day`, `last_ret`, and `params` respectively. When we send the array `my_params` to the R session, we have to specify the number of elements in the array.

In line 22, we perform an operation on the variables in our R session. Note that R’s indexing is one-based, while C’s is zero-based, so if we want to access the value associated with `my_params[0]` in the R session, we need to use `params[1]`.

Here’s an example of the output:

Sending price data (or other time series, such as returns, indicators, and the like) follows a process like the one shown above, but there are one or two issues you need to be aware of.

First, during the lookback period, the values of such time series are undefined. Sending an undefined value via the R bridge will cause a fatal error and the subsequent termination of the R session. To get around this issue, we can wrap our calls to `Rset()` in an `if` condition which evaluates to `True` outside the lookback period: `if(!is(LOOKBACK))`.

The other problem is that Zorro’s time series are constructed with the newest values *first*. R functions expect time series data in chronological order with the newest elements *last*. That means that we need to reverse the order of our Zorro time series before sending them to R.

This is fairly painless, since Zorro implements the `rev()` function for that very purpose. Simply provide `rev()` with the time series to be reversed, and optionally the number of values to be sent to R (if this argument is omitted, `LookBack` values are used instead).

Here’s an example of sending price data to R that deals with these two issues:

```c
#include <r.h>

function run()
{
    if(!Rstart("", 2)) // enable verbose output
    {
        print("Error - could not start R session!");
        quit();
    }

    vars Close = series(priceClose());
    int size = 20;
    vars revClose = rev(Close, size);

    if(!is(LOOKBACK))
    {
        printf("\n#########\nZorro's most recent close:\n%.5f", Close[0]);
        Rset("closes", revClose, size);
        Rx("last_close <- round(closes[length(closes)],5)", 0);
        printf("\nR's most recent close:\n");
        Rx("print(last_close, 5)", 3);
    }

    if(!Rrun())
    {
        print("Error - R session has been terminated!");
        quit();
    }
}
```

Here’s an example of the output:

The three functions `Ri()`, `Rd()` and `Rv()` evaluate a given R expression, much like `Rx()`, but they return the result of the expression back to the Zorro session as an int, a float, or a vector respectively. We can supply any variable, valid R code or function to `Ri()`, `Rd()` and `Rv()`, so long as it evaluates to the correct variable type.

`Ri()` and `Rd()` work in much the same way: we only need to supply an R expression as a string, and the functions return the result of the expression. This means that in the Lite-C script, we can set a variable using the output of `Ri()` or `Rd()`.

For example, to define the variable `my_var` and use it to store the mean of the R vector `my_data`, we would do:

var my_var = Rd("mean(my_data)");

`Rv()` works in a slightly different way. We supply as arguments an R expression that evaluates to a vector, a pointer to the Lite-C var array to be filled with the results of the R expression, and the number of elements in the vector.

Here’s an example where we fill the float array `my_vector` with the output of R’s `rnorm()` function (which produces a vector of normally distributed random variables of a given length, mean and standard deviation):

```c
var my_vector[10];
Rv("rnorm(10, 0, 1)", my_vector, 10);
```

Here’s an example where we put both of these together – we populate a vector in our Lite-C script with some random numbers generated in R. Then we send that vector back to R to calculate the mean before printing the results. Of course this is a very convoluted way to get some random numbers and their mean, but it illustrates the point:

```c
#include <r.h>

function run()
{
    if(!Rstart("", 2)) // enable verbose output
    {
        print("Error - could not start R session!");
        quit();
    }

    var my_vector[10]; // initialise array of float

    if(!is(LOOKBACK))
    {
        Rv("rnorm(10, 0, 1)", my_vector, 10);
        Rset("my_data", my_vector, 10);
        var my_mean = Rd("mean(my_data)");

        int i;
        printf("\n#################");
        for(i=0; i<10; i++)
        {
            printf("\nmy_vector[%i]: %.3f", i, my_vector[i]);
        }
        printf("\nmean: %.3f", my_mean);
    }

    if(!Rrun())
    {
        print("Error - R session has been terminated!");
        quit();
    }
}
```

And here’s the output:

The intent of Zorro’s R bridge is to:

- Facilitate the sending of large amounts of data from Zorro to R,
- Enable analysis of this data in R by executing R code from the Lite-C script, and
- Return single numbers or vectors from R to Zorro.

With that in mind, it makes sense to do as much of the data acquisition, cleaning and processing on the Lite-C side as possible. Save the R session for analysis that requires the use of specialized packages or functions not available in Zorro.

In particular, avoid executing loops in R *(these can be painfully slow)*. But if operations can be vectorized, they may be more efficiently performed in R.

It is wise to test the R commands you supply to `Rx()`, `Ri()`, `Rd()` and `Rv()` in an R console prior to running them in a Lite-C script. Any syntax error or bad data will cause the R session to terminate and all subsequent R commands to fail – potentially without raising a visible error. For that reason, use the `Rrun()` function regularly (at least once per bar) and keep an eye on the Debug View tool’s output, or the Zorro GUI.

A frozen Zorro instance is often indicative of an incomplete R command, such as a missing bracket. Such a mistake will not throw an error, but R will wait for the final bracket, causing Zorro to freeze.

Another common error is to attempt to load an R package that hasn’t been installed. This will cause the R session to terminate, so make sure your required packages are all installed before trying to load them. The source of the resulting error may not be immediately obvious, so keep an eye on the Debug View tool’s output.

Depending on your setup, *the packages available to your R terminal may not be the same as those available in your R Studio environment* (if you’re using that particular IDE).

Here’s a short R script that specifies some arbitrary required packages, checks if they are installed, and attempts to install them from CRAN if they are not already installed:

```r
required.packages <- c('deepnet', 'caret', 'kernlab')
new.packages <- required.packages[!(required.packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages, repos='https://cran.us.r-project.org')
```

You can include this script in the file specified as the `source` parameter to `Rstart()` to ensure that your required packages are always present.

Another common issue with the R bridge arises from passing *backslashes* in file names from Lite-C to R. R uses *forward slashes* instead. You can modify these manually, or use Zorro’s `slash()` function, which automatically converts all backslashes in a string to forward slashes. For example, `slash(ZorroFolder)` returns the file path to the Zorro folder as a string, with forward slashes instead of backslashes.

OK, that was a lengthy tutorial, but it will be worth it!

So far we’ve used fairly simple R functions – stuff that you can easily do in Lite-C, like calcualting the mean of a bunch of numbers. But in the next post, we’ll put together our Zorro pairs tradng script that makes use of the Kalman filter that we wrote in R.

**More importantly, if you can master the R bridge functions we’ve discussed, you’ll be able to use any R tool directly in your trading scripts.**

The post Integrating R with the Zorro Backtesting and Execution Platform appeared first on Robot Wealth.

The post Pairs Trading in Zorro appeared first on Robot Wealth.

You know, light reading…

We saw that while R makes it easy to implement a relatively advanced algorithm like the Kalman filter, there are drawbacks to using it as a backtesting tool.

Setting up anything more advanced than the *simplest* possible vectorised backtesting framework is tough going and error-prone. Plus, it certainly isn’t simple to experiment with strategy design – for instance, incorporating costs, trading at multiple levels, using a timed exit, or incorporating other trade filters.

To be fair, there are good native R backtesting solutions, such as Quantstrat. But in my experience none of them let you experiment as efficiently as the Zorro platform.

And as an independent trader, the ability to move fast – writing proof of concept backtests, invalidating bad ideas, exploring good ones in detail, and ultimately moving to production efficiently – is quite literally a superpower.

*I’ve already invalidated 3 ideas since starting this post*

The downside with Zorro is that it would be pretty nightmarish implementing a Kalman filter in its native Lite-C code. But thanks to Zorro’s R bridge, I can use the R code for the Kalman filter that I’ve already written, with literally only a couple of minor tweaks. *We can have the best of both worlds!*

This post presents a script for a pairs trading algorithm using Zorro. We’ll stick with a static hedge ratio and focus on the pairs trading logic itself. In the next post, I’ll show you how to configure Zorro to talk to R and thus make use of the Kalman filter algorithm.

*Let’s get to it. *

Even the briefest scan of the pairs trading literature reveals many approaches to constructing spreads. For example, using:

- Prices
- Log-prices
- Ratios
- Factors
- Cointegration
- Least squares regression
- Copulas
- State space models

Ultimately, the goal is to find a spread that is both mean-reverting and volatile enough to make money from.

In my view, *how* you do that is much less important than its ability to make money. From personal experience, I know that the tendency is to get hung up on the “correct” way to implement a pairs trade. Such a thing doesn’t exist — I’ve seen money-printing pairs trading books that younger me, being more hung up on “correctness”, would have scoffed at.

Instead, understand that pairs trading is ultimately a numbers game and that universe selection is more important than the specifics of the algorithm. Sure, you can tweak your implementation to squeeze a little more out of it, and even find pockets of conditional or seasonal mean-reversion, but the specifics of the implementation are unlikely to be the ultimate source of alpha.

Anyway, that’s for you to mull over and keep in mind as you read this series. Right now, we’re just going to present one version of a pairs trade in Zorro.

This is a pairs trade that uses a price-based spread for its signals. First, here’s the code that calculates the spread, given two tickers `Y` and `X` (lines 5 – 6) and a hedge ratio, `beta` (line 39). The spread is simply \(Y – \beta X\).

Here’s the code:

```c
/*
Price-based spread in Zorro
*/

#define Y "GDX"
#define X "GLD"

var calculate_spread(var hedge_ratio)
{
    var spread = 0;

    asset(Y);
    spread += priceClose();
    asset(X);
    spread -= hedge_ratio*priceClose();

    return spread;
}

function run()
{
    set(PLOTNOW);
    StartDate = 20100101;
    EndDate = 20191231;
    BarPeriod = 1440;
    LookBack = 100;

    // load data from Alpha Vantage in INITRUN
    if(is(INITRUN))
    {
        string Name;
        while(Name = loop(Y, X))
        {
            assetHistory(Name, FROM_AV);
        }
    }

    // calculate spread
    var beta = 0.4;
    vars spread = series(calculate_spread(beta));

    // plot
    asset(Y);
    var asset1Prices = priceClose();
    asset(X);
    plot(strf("%s-LHS", Y), asset1Prices, MAIN, RED);
    plot(strf("%s-RHS", X), priceClose(), 0|AXIS2, BLUE);
    plot("spread", spread, NEW, BLACK);
}
```

Using GDX and GLD as our Y and X tickers respectively and a hedge ratio of 0.4, Zorro outputs the following plot:

The spread looks like it was reasonably stationary during certain subsets of the simulation, but between 2011 and 2013 it trended – not really a desirable property for a strategy based on mean-reversion.

Even in this period in late 2013, where one could imagine profiting from a mean-reversion strategy, the spread wasn’t very well behaved. The buy and sell levels are far from obvious in advance:

One way to tame the spread is to apply a rolling z-score transformation. That is, take a window of data, say 100 days, and calculate its mean and standard deviation. The z-score of the next point is the raw value less the window’s mean, divided by its standard deviation. Applying this in a rolling fashion is a one-liner in Zorro:

vars ZScore = series(zscore(spread[0], 100));

Our z-scored spread then looks like this:

The z-scored spread has some nice properties. In particular, it tends to oscillate between two extrema, eliminating the need to readjust buy and sell levels (although we’d need to decide on the actual values to use). On the other hand, it does introduce an additional parameter, namely the window length used in its calculation.
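To cross-check the transformation itself, a rolling z-score is easy to sketch in Python. This is an illustrative stand-in for Zorro's `zscore` call (assuming the mean and standard deviation come from the trailing window only, excluding the current point; Zorro's exact implementation may differ in detail):

```python
import numpy as np

def rolling_zscore(series, window=100):
    """Z-score each point against the mean/std of the trailing window."""
    series = np.asarray(series, dtype=float)
    z = np.full(series.shape, np.nan)  # undefined during the lookback period
    for t in range(window, len(series)):
        win = series[t - window:t]     # trailing window, excludes the current point
        z[t] = (series[t] - win.mean()) / win.std()
    return z

# a toy mean-reverting "spread": a slow sine wave plus noise
spread = np.sin(np.linspace(0, 20, 500)) + np.random.default_rng(1).normal(0, 0.1, 500)
z = rolling_zscore(spread, window=100)
print(np.nanmin(z), np.nanmax(z))  # oscillates within a bounded range
```

Note the extra parameter this introduces (the window length), which is exactly the trade-off mentioned above.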

To implement the rest of our pairs trade, we need to decide the z-score levels at which to trade and implement the logic for buying and selling the spread.

Zorro makes that fairly easy for us. Here’s the complete backtesting framework code:

```c
/*
Price-based spread trading in Zorro
*/

#define Y "GDX"
#define X "GLD"
#define MaxTrades 5
#define Spacing 0.5
// #define COSTS

int ZSLookback = 100;
int Portfolio_Units = 100; // units of the portfolio to buy/sell (more --> better fidelity to dictates of hedge ratio)

var calculate_spread(var hedge_ratio)
{
    var spread = 0;

    asset(Y);
#ifndef COSTS
    Spread = Commission = Slippage = 0;
#endif
    spread += priceClose();

    asset(X);
#ifndef COSTS
    Spread = Commission = Slippage = 0;
#endif
    spread -= hedge_ratio*priceClose();

    return spread;
}

function run()
{
    set(PLOTNOW);
    setf(PlotMode, PL_FINE);
    StartDate = 20100101;
    EndDate = 20191231;
    BarPeriod = 1440;
    LookBack = ZSLookback;
    MaxLong = MaxShort = MaxTrades;

    // load data from Alpha Vantage in INITRUN
    if(is(INITRUN))
    {
        string Name;
        while(Name = loop(Y, X))
        {
            assetHistory(Name, FROM_AV);
        }
    }

    // calculate spread
    var beta = 0.4;
    vars spread = series(calculate_spread(beta));
    vars ZScore = series(zscore(spread[0], 100));
    // set up trade levels
    var Levels[MaxTrades];
    int i;
    for(i=0; i<MaxTrades; i++)
    {
        Levels[i] = (i+1)*Spacing;
    }

    // -------------------------------
    // trade logic
    // -------------------------------

    // exit on cross of zero line
    if(crossOver(ZScore, 0) or crossUnder(ZScore, 0))
    {
        asset(X);
        exitLong();
        exitShort();
        asset(Y);
        exitLong();
        exitShort();
    }

    // entering positions at Levels
    for(i=0; i<MaxTrades; i++)
    {
        if(crossUnder(ZScore, -Levels[i])) // buying the spread (long Y, short X)
        {
            asset(Y);
            Lots = Portfolio_Units;
            enterLong();
            asset(X);
            Lots = Portfolio_Units * beta;
            enterShort();
        }
        if(crossOver(ZScore, Levels[i])) // shorting the spread (short Y, long X)
        {
            asset(Y);
            Lots = Portfolio_Units;
            enterShort();
            asset(X);
            Lots = Portfolio_Units * beta;
            enterLong();
        }
    }

    // exiting positions at Levels
    for(i=1; i<=MaxTrades-1; i++)
    {
        if(crossOver(ZScore, -Levels[i])) // covering long spread (exiting long Y, exiting short X)
        {
            asset(Y);
            exitLong(0, 0, Portfolio_Units);
            asset(X);
            exitShort(0, 0, Portfolio_Units * beta);
        }
        if(crossUnder(ZScore, Levels[i])) // covering short spread (exiting short Y, exiting long X)
        {
            asset(Y);
            exitShort(0, 0, Portfolio_Units);
            asset(X);
            exitLong(0, 0, Portfolio_Units * beta);
        }
    }

    // plots
    if(!is(LOOKBACK))
    {
        plot("zscore", ZScore, NEW, BLUE);
        int i;
        for(i=0; i<MaxTrades; i++)
        {
            plot(strf("#level_%d", i), Levels[i], 0, BLACK);
            plot(strf("#neglevel_%d", i), -Levels[i], 0, BLACK);
        }
        plot("spread", spread, NEW, BLUE);
    }
}
```

The trade levels are controlled by the `MaxTrades` and `Spacing` variables near the top of the script. These are implemented as `#define` statements to make the values easy to change, enabling fast iteration.

As implemented here, with `MaxTrades` equal to 5 and `Spacing` equal to 0.5, Zorro will generate trade levels every 0.5 standard deviations above and below the zero line of our z-score. The levels are generated in the `// set up trade levels` block.
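To see how the level grid behaves, here's a small Python sketch (illustrative only – the strategy itself runs in Zorro's Lite-C) of the level generation and the crossing logic, assuming `MaxTrades = 5` and `Spacing = 0.5`:

```python
# Assumed parameter values from the script above.
MAX_TRADES = 5
SPACING = 0.5

# Levels[i] = (i+1)*Spacing, exactly as in the Zorro loop.
levels = [(i + 1) * SPACING for i in range(MAX_TRADES)]  # [0.5, 1.0, 1.5, 2.0, 2.5]

def crossed_under(prev_z, curr_z, level):
    """True if the z-score crossed under `level` between two bars."""
    return prev_z >= level > curr_z

def crossed_over(prev_z, curr_z, level):
    """True if the z-score crossed over `level` between two bars."""
    return prev_z <= level < curr_z

# Example: the z-score falls from -0.4 to -1.2 in one bar, crossing the
# -0.5 and -1.0 levels, so two spread entries would fire at market.
buys = sum(crossed_under(-0.4, -1.2, -lvl) for lvl in levels)
```

With these values the grid sits at 0.5 through 2.5 standard deviations on each side of zero, and a single bar that falls through two levels triggers two spread entries.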

The trade logic is quite simple:

- Buy the spread if the z-score crosses under a negative level
- Short the spread if the z-score crosses over a positive level
- If we’re long the spread, cover a position if the z-score crosses over a negative level
- If we’re short the spread, cover a position if the z-score crosses under a positive level
- Cover whenever z-score crosses the zero line

By default, we’re trading 100 units of the spread at each level. We’re trading in and out of the spread as the z-score moves around and crosses our levels. If the z-score crosses more than one level in a single period, we’d be entering positions for each crossed level at market.

Essentially, it’s a bet on the mean-reversion of the z-scored spread translating into profitable buy and sell signals in the underlyings.

The strategy returns a Sharpe ratio of about 0.6 before costs (you can enable costs by uncommenting `#define COSTS` near the top of the script, but you'll need to set up a Zorro assets list with cost details, or tell Zorro about costs via script) and the following equity curve:

There you have it – a Zorro framework for price-based pairs trading. More than this particular approach to pairs trading itself, I hope that I've demonstrated Zorro's efficiency for implementing such frameworks quickly. Once implemented, you can efficiently run experiments and iterate on both the design and the utility of the trading strategy.

For instance, it’s trivial to:

- Add more price levels, tighten them up, or space them out
- Get feedback on the impact of changing the z-score window length
- Explore what happens when you change the hedge ratio
- Change the simulation period
- Swap out GLD and GDX for other tickers

You can even run Zorro on the command line and pass most of the parameters controlling these variables as command line arguments – which means you can write a batch file to run hundreds of backtests and really get into some serious data mining – if that’s your thing.

All that aside, in the next post I want to show you how to incorporate the dynamic estimate of the hedge ratio into our Zorro pairs trading framework by calling the Kalman filter implemented in R directly from our Zorro script.

**You can grab the code for the Kalman Filter we used in the previous post for free below:**

The post Pairs Trading in Zorro appeared first on Robot Wealth.

The post Kalman Filter Example: Pairs Trading in R appeared first on Robot Wealth.

Anyone who's tried pairs trading will tell you that real financial series don't exhibit truly stable, cointegrating relationships.

If they did, pairs trading would be the easiest game in town. But the reality is that relationships are constantly evolving and changing. At some point, we’re forced to make uncertain decisions about how best to capture those changes.

One way to incorporate both uncertainty and dynamism in our decisions is to use the Kalman filter for parameter estimation.

The Kalman filter is a state space model for estimating an unknown (‘hidden’) variable using observations of related variables and models of those relationships. The Kalman filter is underpinned by Bayesian probability theory and enables an estimate of the hidden variable in the presence of noise.

There are plenty of tutorials online that describe the mathematics of the Kalman filter, so I won't repeat those here (this article is a wonderful read). Instead, this Kalman Filter Example post will show you how to implement the Kalman filter framework to provide a *dynamic estimate of the hedge ratio in a pairs trading strategy*. I'll provide just enough math as is necessary to follow the implementation.

For this Kalman Filter example, we need four variables:

- A vector of our observed variable
- A vector of our hidden variable
- A state transition model (which describes how the hidden variable evolves from one state to the next)
- An observation model (a matrix of coefficients for the other variable – we use a hedge coefficient and an intercept)

For our hedge ratio/pairs trading application, the observed variable is one of our price series \(p_1\) and the hidden variable is our hedge ratio, \(\beta\). The observed and hidden variables are related by the familiar spread equation: \[p_1 = \beta * p_2 + \epsilon\] where \(\epsilon\) is noise (in our pairs trading framework, we are essentially making bets on the mean reversion of \(\epsilon\)). In the Kalman framework, the other price series, \(p_2\) provides our observation model.

We also need to define a state transition model that describes the evolution of \(\beta\) from one time period to the next. If we assume that \(\beta\) follows a random walk, then our state transition model is simply \[\beta_t = \beta_{t-1} + \omega\]
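Putting the two pieces together in state-space form – with the intercept \(\alpha\) included in the state, as in the R implementation below – the model is:

\[\begin{pmatrix} \beta_t \\ \alpha_t \end{pmatrix} = \begin{pmatrix} \beta_{t-1} \\ \alpha_{t-1} \end{pmatrix} + \omega_t\]

\[p_{1,t} = \begin{pmatrix} p_{2,t} & 1 \end{pmatrix} \begin{pmatrix} \beta_t \\ \alpha_t \end{pmatrix} + \epsilon_t\]

where \(\omega_t\) has covariance \(V_w\) and \(\epsilon_t\) has variance \(V_e\) – the same `Vw` and `Ve` that appear in the R code below.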

Here’s the well-known iterative Kalman filter algorithm.

For every time step:

- Predict the next state of the hidden variable given the current state and the state transition model
- Update the state covariance prediction
- Predict the next value of the observed variable given the prediction for the hidden variable and the observation model
- Update the measured covariance prediction
- Calculate the error between the observed and predicted values of the observed variable
- Calculate the Kalman gain
- Update the estimate of the hidden variable
- Update the state covariance estimate
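For concreteness, here's a minimal scalar sketch of these steps in Python (illustrative only: it estimates just the hedge ratio, whereas the R implementation below also tracks an intercept):

```python
import numpy as np

def kalman_hedge(p1, p2, delta=1e-4, Ve=1e-3):
    """Scalar Kalman filter for a random-walk hedge ratio beta in p1 = beta*p2 + noise."""
    Vw = delta / (1 - delta)        # state transition noise variance
    beta, P = 0.0, 0.0              # state estimate and state variance
    betas = np.empty(len(p1))
    for t in range(len(p1)):
        R = P + Vw                  # predict state variance (the state is a random walk)
        y_hat = p2[t] * beta        # predict the observation
        Q = p2[t] * R * p2[t] + Ve  # predicted observation variance
        e = p1[t] - y_hat           # innovation: observed minus predicted
        K = R * p2[t] / Q           # Kalman gain
        beta = beta + K * e         # update the state estimate
        P = R - K * p2[t] * R       # update the state variance
        betas[t] = beta
    return betas
```

The `delta` and `Ve` defaults mirror the values used in the R code below; larger values let \(\beta\) adapt faster.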

To start the iteration, we need initial values for the covariances of the measurement and state equations. Methods exist to estimate these from data, but for our purposes we will start with some values that result in a relatively slowly changing hedge ratio. To make the hedge ratio change faster, increase the values of `delta` and `Ve` in the R code below. The initial estimates of these values are as close to 'parameters' as we get in our Kalman filter framework.

Here’s some R code for implementing the Kalman filter.

The two price series used are daily adjusted closing prices for the “Hello world” of pairs trading: GLD and GDX (you can download the data at the end of this post).

First, read in and take a look at the data:

```r
library(xts)

path <- "C:/Path/To/Your/Data/"
assets <- c("GLD", "GDX")

df1 <- xts(read.zoo(paste0(path, assets[1], ".csv"), tz="EST", format="%Y-%m-%d", sep=",", header=TRUE))
df2 <- xts(read.zoo(paste0(path, assets[2], ".csv"), tz="EST", format="%Y-%m-%d", sep=",", header=TRUE))

xy <- merge(df1$Close, df2$Close, join="inner")
colnames(xy) <- assets
plot(xy, legend.loc=1)
```

Here’s what the data look like:

Looks OK at first glance.

Here’s the code for the iterative Kalman filter estimate of the hedge ratio:

```r
x <- xy[, assets[1]]
y <- xy[, assets[2]]
x$int <- rep(1, nrow(x))

delta <- 0.0001
Vw <- delta/(1-delta)*diag(2)
Ve <- 0.001

R <- matrix(rep(0, 4), nrow=2)
P <- matrix(rep(0, 4), nrow=2)
beta <- matrix(rep(0, nrow(y)*2), ncol=2)
y_est <- rep(0, nrow(y))
e <- rep(0, nrow(y))
Q <- rep(0, nrow(y))

for(i in 1:nrow(y))
{
  if(i > 1)
  {
    beta[i, ] <- beta[i-1, ]  # state transition
    R <- P + Vw               # state cov prediction
  }

  y_est[i] <- x[i, ] %*% beta[i, ]            # measurement prediction
  Q[i] <- x[i, ] %*% R %*% t(x[i, ]) + Ve     # measurement variance prediction

  # error between observation of y and prediction
  e[i] <- y[i] - y_est[i]
  K <- R %*% t(x[i, ]) / Q[i]  # Kalman gain

  # state update
  beta[i, ] <- beta[i, ] + K * e[i]
  P = R - K %*% x[i, ] %*% R
}

beta <- xts(beta, order.by=index(xy))
plot(beta[2:nrow(beta), 1], type='l', main = 'Kalman updated hedge ratio')
plot(beta[2:nrow(beta), 2], type='l', main = 'Kalman updated intercept')
```

And here is the resulting plot of the dynamic hedge ratio:

The value of this particular Kalman filter example is immediately apparent – you can see how drastically the hedge ratio changed over the years.

We could use that hedge ratio to construct our signals for a trading strategy, but we can actually use the other by-products of the Kalman filter framework to generate them directly *(hat tip to Ernie Chan for this one):*

The prediction error (`e` in the code above) is equivalent to the deviation of the spread from its predicted value. Some simple trade logic could be to buy and sell our spread when this deviation is very negative and positive respectively.

We can relate the actual entry levels to the standard deviation of the prediction error. The Kalman routine also computes the standard deviation of the error term for us: it is simply the square root of `Q` in the code above.

Here’s a plot of the trading signals at one standard deviation of the prediction error (we need to drop a few leading values as the Kalman filter takes a few steps to warm up):

```r
# plot trade signals
e <- xts(e, order.by=index(xy))
sqrtQ <- xts(sqrt(Q), order.by=index(xy))
signals <- merge(e, sqrtQ, -sqrtQ)
colnames(signals) <- c("e", "sqrtQ", "negsqrtQ")
plot(signals[3:length(index(signals))], ylab='e',
     main = 'Trade signals at one-standard deviation',
     col=c('blue', 'black', 'black'), lwd=c(1,2,2))
```

Cool! Looks OK, except the number of signals greatly diminishes in the latter half of the simulation period. Later, we might come back and investigate a more aggressive signal, but let’s press on for now.

At this point, we’ve got a time series of trade signals corresponding to the error term being greater than one standard deviation from its (estimated) mean. We could run a vectorised backtest by calculating positions corresponding to these signals, then determine the returns of holding those positions.

In fact, let’s do that next:

```r
# vectorised backtest
sig <- ifelse((signals[1:length(index(signals))]$e > signals[1:length(index(signals))]$sqrtQ)
              & (lag.xts(signals$e, 1) < lag.xts(signals$sqrtQ, 1)), -1,
       ifelse((signals[1:length(index(signals))]$e < signals[1:length(index(signals))]$negsqrtQ)
              & (lag.xts(signals$e, 1) > lag.xts(signals$negsqrtQ, 1)), 1, 0))
colnames(sig) <- "sig"

## trick for getting only the first signals
sig[sig == 0] <- NA
sig <- na.locf(sig)
sig <- diff(sig)/2
plot(sig)

## simulate positions and pnl
sim <- merge(lag.xts(sig,1), beta[, 1], x[, 1], y)
colnames(sim) <- c("sig", "hedge", assets[1], assets[2])
sim$posX <- sim$sig * -1000 * sim$hedge
sim$posY <- sim$sig * 1000
sim$posX[sim$posX == 0] <- NA
sim$posX <- na.locf(sim$posX)
sim$posY[sim$posY == 0] <- NA
sim$posY <- na.locf(sim$posY)
pnlX <- sim$posX * diff(sim[, assets[1]])
pnlY <- sim$posY * diff(sim[, assets[2]])
pnl <- pnlX + pnlY
plot(cumsum(na.omit(pnl)), main="Cumulative PnL, $")
```

*Just a quick explanation of my hacky backtest…*

The ugly nested `ifelse` statement at the top of the script creates a time series of trade signals where sells are represented as -1, buys as 1, and no signal as 0. The buy signal is the prediction error crossing under its -1 standard deviation from above; the sell signal is the prediction error crossing over its +1 standard deviation from below.

The problem with this signal vector is that we can get consecutive sell signals and consecutive buy signals. We don't want to muddy the waters by holding more than one position at a time, so we use a little trick: firstly replace any zeroes with `NA`, then use the `na.locf` function to fill forward the `NA` values with the last real value. We then recover the original (non-consecutive) signals by taking the `diff` and dividing by 2.

If that seems odd, just write down on a piece of paper a few signals of -1, 1 and 0 in a column and perform on them the operations described. You'll quickly see how this works.
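The same trick is easy to verify in isolation. Here's a Python/pandas sketch of the identical operation (replace zeroes with missing values, fill forward, difference and halve):

```python
import numpy as np
import pandas as pd

# Raw signals: -1 = sell, 1 = buy, 0 = nothing. Note the consecutive sells/buys.
raw = pd.Series([-1, 0, 0, -1, 1, 0, 1, 0, -1])

filled = raw.replace(0, np.nan).ffill()  # carry the last real signal forward
first_only = filled.diff() / 2           # non-zero only when the signal flips sign

# A sell-to-buy flip gives (1 - (-1))/2 = 1, a buy-to-sell flip gives -1,
# and repeated signals become 0 -- exactly the de-duplication we want.
```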

Then, we calculate our positions in each asset according to our spread and signals, taking care to lag our signals so that we don’t introduce look-ahead bias. We’re trading 1,000 units of our spread per trade. Our estimated profit and loss is just the sum of the price differences multiplied by the positions in each asset.

Here’s the result:

Looks interesting!

But recall that our trading signals were few and far between in the latter half of the simulation? If we plot the signals, we see that we were actually holding the spread for well over a year at a time:

I doubt we’d want to trade the spread this way, so let’s make our signals more aggressive:

```r
# more aggressive trade signals
signals <- merge(e, .5*sqrtQ, -.5*sqrtQ)
colnames(signals) <- c("e", "sqrtQ", "negsqrtQ")
plot(signals[3:length(index(signals))], ylab='e',
     main = 'Trade signals at half a standard deviation',
     col=c('blue', 'black', 'black'), lwd=c(1,2,2))
```

Better! A smarter way to do this would probably be to adapt the trade level (or levels) to the recent volatility of the spread – I’ll leave that as an exercise for you.

These trade signals lead to this impressive and highly dubious equity curve:

Why is it dubious?

Well, you probably noticed that there are some pretty out-there assumptions in this backtest. To name the most obvious:

- We’re trading at the daily closing price with no market impact or slippage
- We’re trading for free

My gut feeling is that this would need a fair bit of work to cover costs of trading – but that gets tricky to assess without a more accurate simulation tool.

You can see that it’s a bit of a pain to backtest – particularly if you want to incorporate costs. To be fair, there are native R backtesting solutions that are more comprehensive than my quick-n-dirty vectorised version. But in my experience none of them lets you move quite as fast as the Zorro platform, which also allows you to go from backtest to live trading with almost the click of a button.

You can see that R makes it quite easy to incorporate an advanced algorithm *(well, at least I think it's advanced; our clever readers probably disagree)*. But tinkering with the strategy itself – for instance, incorporating costs, trading at multiple standard deviation levels, using a timed exit, or incorporating other trade filters – is a recipe for a headache, not to mention a whole world of unit testing and bug fixing.

On the other hand, Zorro makes tinkering with the trading aspects of the strategy easy. Want to get a good read on costs? That’s literally a line of code. Want to filter some trades based on volatility? Yeah, you might need two lines for that. What about trading the spread at say half a dozen levels and entering and exiting both on the way up and on the way down? OK, you might need four lines for that.

The downside with Zorro is that it would be pretty nightmarish implementing a Kalman filter in its native Lite-C code. But thanks to Zorro’s R bridge, I can use the R code for the Kalman filter example that I’ve already written, with literally only a couple of minor tweaks. We can have the best of both worlds.

*Which leads to my next post…*

In Kalman Filter Example part 2, I’ll show you a basic pairs trading script in Zorro, using a more vanilla method of calculating the hedge ratio. After that, I’ll show you how to configure Zorro to talk to R and thus make use of the Kalman filter algorithm.

*I'd love to know if this series is interesting for you, and what else you'd like to read about on Robot Wealth. Let us know in the comments.*

The post Kalman Filter Example:<br>Pairs Trading in R appeared first on Robot Wealth.

The post Pattern Recognition with the Frechet Distance appeared first on Robot Wealth.

*Ah, I see a blue star pattern on my chart… a good omen.*

The problem is that such an approach is *inherently subjective* since price action almost never matches perfectly with the idealized version of price patterns you see in every beginner’s guide to trading. It is up to you, the individual, to determine whether a particular chart formation matches closely enough with a particular pattern for it to be considered valid.

This is quite tricky! It’s *very difficult* to codify a trading system based on their use. By extension, it is difficult to test the efficacy of these patterns in an objective, evidence-based manner.

That won't stop smart people from trying, though. An attempt to do just this was made by MIT's Andrew Lo, Harry Mamaysky and Jiang Wang back in 2000[1] and, perhaps surprisingly, they found statistically significant evidence that *some* patterns provide useful incremental information in *some* markets.

Lo, Mamaysky and Wang were also generally enthusiastic about using automated detection methods. There are many possible approaches to pattern detection algorithms, of which Zorro implements one: a function that calculates the Frechet distance.

*So, let's dive in and explore it!*

**The Frechet distance between two curves is a measure of their similarity —** it’s often described like so:

Suppose a man is walking his dog and that he is forced to walk on a particular path and his dog on another path. Both the man and the dog are allowed to control their speed independently but are not allowed to go backwards. Then, the Fréchet distance of the two paths is the minimal length of a leash that is necessary to keep man and dog joined throughout their walk.

*there are a lot more variables in the real world….*

If the Frechet distance between two curves is small, it follows that the curves are similar. Conversely, a large Frechet distance implies that the curves are not similar.

So, we can leverage the Frechet distance as a pattern detection algorithm by comparing sections of the price curve to a curve corresponding to a pattern of interest, for example, a triangle. A small Frechet distance implies that the section of the price curve that was analyzed is similar to the pre-defined pattern.
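Zorro's implementation isn't published, but the classical *discrete* Fréchet distance is straightforward to sketch. The following Python function is the standard Eiter–Mannila dynamic program – a distance, not Zorro's 0–80 similarity score – included only to make the man-and-dog idea concrete:

```python
import math

def discrete_frechet(P, Q):
    """Discrete Fréchet distance between two polylines given as lists of (x, y) points."""
    n, m = len(P), len(Q)
    dist = lambda i, j: math.dist(P[i], Q[j])
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                ca[i][j] = dist(0, 0)
            elif i == 0:
                ca[i][j] = max(ca[0][j-1], dist(0, j))   # dog advances, man waits
            elif j == 0:
                ca[i][j] = max(ca[i-1][0], dist(i, 0))   # man advances, dog waits
            else:
                # shortest leash over the three possible previous steps
                ca[i][j] = max(min(ca[i-1][j], ca[i-1][j-1], ca[i][j-1]), dist(i, j))
    return ca[-1][-1]
```

For two parallel horizontal segments one unit apart, the leash never needs to be longer than 1, so the distance is 1.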

There have been a whole bunch of algorithms proposed over the years for calculating the Frechet distance (it was first described in 1906), and Zorro implements a simple variant that enables the comparison of part of a price curve with a known and previously described pattern. Zorro's `frechet()` function returns a number between approximately 0 and 80 that measures the similarity of the part of the price curve being analyzed and the pattern.

**Note that this is proportional to the inverse of the Frechet distance, in that a larger similarity measure implies a smaller Frechet distance. **

`frechet()` takes the following arguments:

- A series (usually asset prices) to be compared with a predefined pattern.
- An integer `TimeFrame`, which sets the number of price bars to use in the comparison, that is, the horizontal length of the pattern in the price curve (setting this to zero tells Zorro to simply use the same length as the predefined pattern).
- A var, `Scale`, specifying the vertical size of the pattern in the price chart. Setting this to a negative number inverts the pattern.
- An array of positive numbers specifying the shape of the pattern to be detected. The final value of the array must be zero, which the algorithm uses to signal the termination of the pattern.

There are several complications and considerations to be aware of in setting these arguments, so let’s go through each of them in more detail, starting with the array specifying the shape of the pattern.

If we want to detect chart patterns, the first thing we need to define is the shape of that pattern. Note that in describing our pattern in an array, we only need to be concerned with its *shape*. We can deal with its *size* (both horizontally and vertically) using the other `frechet()` arguments. Therefore don't focus too much on the absolute values of the numbers that describe the pattern – their relative values are much more important here.

To define a pattern in an array, think of an \(x, y\) coordinate plane. The array indexes are the \(x\) values; the numbers stored in each index are the corresponding \(y\) values. We then map our pattern as a series of \(x,y\) pairs.
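In other words, the array is just the \(y\) values sampled at consecutive \(x\) positions. A trivial Python illustration of this mapping:

```python
# A pattern array is a list of y-values; the x-values are the implicit indexes.
# The trailing zero is the terminator, not part of the shape.
triangle = [1, 8, 2, 7, 3, 6, 4, 5, 0]

shape = triangle[:-1]            # drop the terminator
points = list(enumerate(shape))  # (x, y) pairs tracing the pattern
```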

Here’s an example for a triangle pattern:

Remembering that zero terminates the pattern, the corresponding array would consist of the numbers 1, 8, 2, 7, 3, 6, 4, 5, 0. We would define such an array as

```c
var Triangle[9] = {1, 8, 2, 7, 3, 6, 4, 5, 0};
```

The obvious question that arises from this approach is how well does the algorithm detect patterns that we would consider a triangle, but which deviate from the idealized triangle shown above?

For example, what about asymmetry in the legs of the triangle? That is, what if the early legs take longer to complete than later legs? In the example above, all the legs take the same amount of time. What about triangles that don’t terminate at the apex?

By way of example, the following would probably fit the definition of a triangle:

But now our array would be given by 1, 3, 5, 8, 6, 4, 2, 5, 7, 6, 4, 3, 6, 0. That is,

```c
var Triangle[14] = {1,3,5,8,6,4,2,5,7,6,4,3,6,0};
```

Would these two patterns return different Frechet similarities when applied to the same price curve?

The answer is yes. But, bear in mind that a pattern corresponding to the first triangle will still be *somewhat* similar to the second triangle. In practice, this means that in order to use this approach effectively, we would need to cover our bases and check for multiple variants of the intended pattern, perhaps using some sort of confirmation between different variations of the pattern. We’ll see in the example below how much variation we see in our similarity measure for different variants of the same intended pattern.

The `Scale` parameter controls the vertical height of the pattern being searched for in the price curve. This is the same as stretching the predefined pattern in the direction of price on a price chart. In most cases, it will make sense to set the height of the pattern to the range of the price action over the period of interest. We do this automatically by setting the `Scale` parameter based on the maximum and minimum values of the price series over the time period of interest via Zorro's `MaxVal()` and `MinVal()` functions:

```c
Scale = MaxVal(Data, TimeFrame) - MinVal(Data, TimeFrame);
```

The `TimeFrame` parameter controls the pattern's horizontal length and corresponds to the number of bars over which to apply the pattern. This is the same as stretching the predefined pattern in the direction of time on a price chart. This parameter requires a little more thought because there are no hard and fast rules regarding how long these patterns should take to form. Again, we must deal with the inherent subjectivity of the method.

Rather than constraining our pattern detection algorithm to a single time period, why not simply look over multiple time periods?

We could do this by calling `frechet()` within a `for()` loop that increments the `TimeFrame` parameter on every iteration, like so:

```c
for(i=5; i<100; i+=5)
{
	frechet(Price, i, 100*PIP, Triangle);
}
```

This will search for a pattern called "Triangle" that is 100 pips high over multiple time ranges, from 5 bars to 100 bars. In practice, we don't really need to cover every incremental number of bars (1, 2, 3, 4, etc.) because patterns evolving over similar time horizons will tend to return similar `frechet()` values. For example, a pattern evolving over 10 bars will be similar to the same pattern evolving over 11 bars. In the example above, we increment our search length by 5 bars.

We are now in a position to start detecting patterns and analyzing their usefulness as trading signals.

The code below plots the Frechet similarity metric for our symmetric and asymmetric triangles, and their inverses, over a number of time horizons. We use the `strf()` function to pass a variable (in this case, the integer `i`) into a string (in this case, the name of the plot) so that we can plot the different Frechet similarities from within the `for()` loop:

```c
/* PLOT FRECHET SIMILARITY */

function run()
{
	set(PLOTNOW);
	StartDate = 20150712;
	EndDate = 20150826;
	BarPeriod = 1440;
	LookBack = 100;

	asset("SPX");
	vars Price = series(priceClose());

	static var Tri_Sym[9] = {1,8,2,7,3,6,5,4,0};
	static var Tri_Asym[14] = {1,3,5,8,6,4,2,5,7,6,4,3,6,0};

	int i;
	for(i=10; i<=30; i+=10)
	{
		plot(strf("Tri_Sym_%d", i), frechet(Price, i, MaxVal(Price,i) - MinVal(Price,i), Tri_Sym), NEW, RED);
		plot(strf("Tri_Asym_%d", i), frechet(Price, i, MaxVal(Price,i) - MinVal(Price,i), Tri_Asym), 0, BLUE);
		plot(strf("Tri_Sym_Inv_%d", i), frechet(Price, i, -(MaxVal(Price,i) - MinVal(Price,i)), Tri_Sym), NEW, BLACK);
		plot(strf("Tri_Asym_Inv_%d", i), frechet(Price, i, -(MaxVal(Price,i) - MinVal(Price,i)), Tri_Asym), 0, GREEN);
	}

	PlotWidth = 800;
	PlotScale = 15;
	PlotHeight1 = 500;
	PlotHeight2 = 125;
}
```

Here we zoom into an area of the S&P500 index that saw a fairly obvious triangle develop from early July through to mid-August 2015:

You can see that most variants of our Frechet algorithm detected the triangle at some point during its evolution. In particular, the inverted asymmetric triangle measured over 20 days did a particularly good job of recognizing the triangle, reaching a similarity score of approximately 50 as the triangle approached its apex.

Looking more closely at other regions will reveal that the algorithm is far from perfect, sometimes scoring patterns that we would rather exclude relatively highly. This makes it difficult to differentiate the “true” patterns on the basis of some threshold similarity score. To overcome that, we could perhaps continue to refine our pattern definitions or implement a series of confirmations from different variations, but that would get tedious fairly quickly.

Here's an example of a simple trading strategy that looks for our asymmetric triangle pattern across several time horizons. Again using the `strf()` function, we switch to a new `Algo` for each time horizon. I read somewhere that gold is particularly prone to triangle formations, so we'll use the GLD Exchange Traded Fund. I also read that triangles are allegedly indicators of a strong breakout in any direction.

*I don’t know whether this has any basis in fact, but let’s go with it for the purpose of the exercise.*

On the basis of the alleged triangle behavior, when one is detected we bracket the market at a quarter of the 20-day ATR. We leave our pending orders for a maximum of 10 days, but no longer than the time horizon used to detect the pattern. Likewise, we close our trades after a maximum of 20 days, but no longer than the time horizon.

There's much more you could do here, for example cancelling the remaining pending order when the opposite one is executed.[2] The code uses Alpha Vantage's API for getting the required GLD historical data, so you'll need to set this up in your Zorro.ini file if you don't already have this data.

```c
/* FRECHET TRADING */

var threshold = 30;

function run()
{
	set(PLOTNOW);
	StartDate = 2007;
	EndDate = 2017;
	BarPeriod = 1440;
	AssetList = "AssetsIB";

	if(is(INITRUN))
		assetHistory("GLD", FROM_AV);
	asset("GLD");
	vars Price = series(priceClose());

	var Tri_Asym[14] = {1,3,5,8,6,4,2,5,7,6,4,3,6,0};

	int i;
	for(i=10; i<=50; i+=10)
	{
		algo(strf("_%d_Asym", i));
		if(frechet(Price, i, MaxVal(Price,i)-MinVal(Price,i), Tri_Asym) > threshold)
		{
			Entry = 0.25*ATR(20);
			EntryTime = min(i, 10);
			LifeTime = min(i, 20);
			if(NumOpenLong == 0) enterLong();
			if(NumOpenShort == 0) enterShort();
		}
	}

	PlotWidth = 800;
	PlotHeight1 = 500;
	PlotHeight2 = 125;
}
```

Here’s the equity curve:

You’ll find that such a trading strategy is difficult to apply directly to a universe of potential assets.

The effect of randomness, combined with the difficulty in refining pattern definitions sufficiently, invites overfitting to the parameters of the `frechet()` function, not to mention selection bias. You'll also be faced with the decision to trade a pattern detected at multiple time horizons.

Maybe a system of confirmations from numerous pattern variations would help, but perhaps a more practical application for the trader interested in patterns is to use `frechet()` to scan a universe of assets and issue email or SMS alerts, or display a message listing assets where a pattern was detected for further manual analysis. Maybe we'll cover this in a future blog post – if you're interested, let me know in the comments below.

Here are some rather idealized patterns to get you started. I leave it up to you to experiment with various departures from their idealized forms.

```c
var rectangle[5] = {1,2,1,2,0};
var cup[10] = {6,3,2,1,1,1,2,3,6,0};
var zigzag[5] = {1,7,2,8,0};
var headShldrs[17] = {1,2,3,3,3,4,5,6,6,5,4,3,3,3,2,1,0};
var triangle_symmetric[9] = {1,8,2,7,3,6,5,4,0};
var triangle_assymetric[14] = {1,3,5,8,6,4,2,5,7,6,4,3,6,0};
```

**Want a more robust and profitable approach to trading? Gain a broader understanding of how we use algorithms to trade systematically and make our capital grow by downloading the free Algo Basics PDF below.**

After that, check out our other blog post where we outline how we approach the markets in a way that allows us to trade for a living.

The post Pattern Recognition with the Frechet Distance appeared first on Robot Wealth.

The post Can you apply factors to trade performance? appeared first on Robot Wealth.

For instance, maybe you wonder if your strategy tends to do better when volatility is high?

In this case, you can get very binary feedback by, say, running backtests with and without a volatility filter.

But this can mask interesting insights that might surface if the relationship could be explored in more detail.

Zorro has some neat tools that allow us to associate data of interest with particular trading decisions, and then export that data for further analysis. Here’s how it works:

Zorro implements a `TRADE` struct for holding information related to a particular position. This struct is a data container which holds information about each trade throughout the life of our simulation. We can also add our own data to this struct via the `TradeVar` array, which we can populate with values associated with a particular trade.

Zorro stores this array, along with all the other information about each and every position, as members of the `TRADE` struct. We can access the `TRADE` struct members in two ways: inside a trade management function (TMF) and inside a trade enumeration loop.

Here's an example of exporting the last estimated volatility at the time a position was entered, along with the return associated with that position *(this is a simple, long-only moving average crossover strategy; data is loaded from Alpha Vantage):*

```c
/* Example of exporting data from a Zorro simulation. */

#define VOL TradeVar[0]

int recordVol(var volatility)
{
	VOL = volatility;
	return 16;
}

function run()
{
	set(PLOTNOW);
	StartDate = 2007;
	EndDate = 2019;
	BarPeriod = 1440;
	LookBack = 200;
	MaxLong = MaxShort = 1;

	string Name;
	while(Name = loop("AAPL", "MSFT", "GOOGL", "IBM", "MMM", "AMZN", "CAT", "CL"))
	{
		assetHistory(Name, FROM_AV);
		asset(Name);
		Spread = Commission = Slippage = 0;

		vars close = series(priceClose());
		vars smaFast = series(SMA(close, 10));
		vars smaSlow = series(SMA(close, 50));
		var vol = Moment(series(ROCP(close, 1)), 50, 2); // rolling 50-period standard deviation of returns

		if(crossOver(smaFast, smaSlow))
		{
			enterLong(recordVol, vol);
		}
		else if(crossUnder(smaFast, smaSlow))
		{
			exitLong();
		}

		plot("volatility", vol, NEW, BLUE);
	}

	if(is(EXITRUN))
	{
		int count = 0;
		char line[100];
		string filename = "Log\\vol.csv";

		if(file_length(filename))
		{
			printf("\nFound existing file. Deleting.");
			file_delete(filename);
		}

		printf("\n writing vol file...");
		sprintf(line, "Asset, EntryDate, TradeReturn, EntryVol");
		file_append(filename, line);

		for(closed_trades)
		{
			sprintf(line, "\n%s, %i, %.6f, %.5f", Asset, ymd(TradeDate),
				(-2*TradeIsShort+1)*(TradePriceClose-TradePriceOpen)/TradePriceOpen, VOL);
			file_append(filename, line);
			count++;
		}
		printf("\nTrades: %i", count);
	}
}
```

The general pattern for accomplishing this is:

- Define a meaningful name for the element of the `TradeVar` array that we’ll use to hold our volatility data (line 5).
- Define a Trade Management Function to expose the `TRADE` struct and use it to assign our variable to our `TradeVar` (lines 7-12). A return value of 16 tells Zorro to run the TMF only when the position is entered and exited.
- Calculate the variable of interest in the Zorro script. Here we calculate the rolling 50-day standard deviation of returns (line 34).
- Pass the TMF and the variable of interest to Zorro’s `enterLong` function (line 38).
- In the `EXITRUN` (the last thing Zorro does after finishing a simulation), loop through all the positions using a trade enumeration loop and write the details, along with the volatility calculated just prior to entry, to a csv file.

Running this script results in a small csv file being written to Zorro’s Log folder. A sample of the data looks like this:

Once we’ve got that data, we can easily read it into our favourite data analysis tool for a closer look. Here, I’ll read it into R and use the `tidyverse` libraries to dig deeper. *(This will be very cursory. You could and should go a lot deeper if this were a serious strategy.)*

First, read the data in, and process it by adding a couple of columns that might be interesting:

```r
library(ggplot2)
library(tidyverse)

# analysis of entry volatility and trade profit
path <- "C:/Zorro/Log/"
file <- "vol.csv"
df <- read.csv(paste0(path, file), header = TRUE, stringsAsFactors = FALSE, strip.white = TRUE)

# make some additional columns
df$AbsTradeReturn <- abs(df$TradeReturn)
df$Result <- factor(ifelse(df$TradeReturn > 0, "win", "loss"))
```

If we `head` the resulting data frame, we find that it looks like this:

```r
head(df)
#   Asset EntryDate TradeReturn EntryVol AbsTradeReturn Result
# 1  MSFT  20190906   -0.011359  0.00019       0.011359   loss
# 2  MSFT  20190904    0.017583  0.00020       0.017583    win
# 3  MSFT  20190829   -0.015059  0.00020       0.015059   loss
# 4    CL  20190828   -0.010946  0.00017       0.010946   loss
# 5  MSFT  20190819   -0.036269  0.00017       0.036269   loss
# 6 GOOGL  20190712    0.052325  0.00023       0.052325    win
```

*Sweet! Looks like we’re in business!*

Now we can start to answer some interesting questions. First, is volatility at the time of entry related to the magnitude of the trade return? Intuitively we’d expect this to be the case, as higher volatility implies larger price swings and therefore larger absolute trade returns:

```r
# is volatility related to the magnitude of the trade return?
ggplot(data = df[, c("AbsTradeReturn", "EntryVol")], aes(x = EntryVol, y = AbsTradeReturn)) +
  geom_point(alpha = 0.4) +
  geom_smooth(method = "lm", se = TRUE)
```

Nice! Just what we’d expect to see.

Does this relationship hold for each individual asset that we traded?

```r
# what about by asset?
ggplot(data = df[, c("AbsTradeReturn", "EntryVol", "Asset")], aes(x = EntryVol, y = AbsTradeReturn)) +
  geom_point(alpha = 0.4) +
  geom_smooth(method = "lm", se = TRUE) +
  facet_wrap(~Asset)
```

Looks like the relationship generally holds at the asset level, but note that we have a small sample size so take the results with a grain of salt:

```r
# note that we have a small sample size:
df %>%
  group_by(Asset) %>%
  count()
#   Asset     n
#   <chr> <int>
# 1 AAPL     36
# 2 AMZN     30
# 3 CAT      38
# 4 CL       47
# 5 GOOGL    36
# 6 IBM      37
# 7 MMM      32
# 8 MSFT     37
```

Is volatility related to the actual trade return?

```r
# is volatility related to the actual trade return?
ggplot(data = df[, c("TradeReturn", "EntryVol")], aes(x = EntryVol, y = TradeReturn)) +
  geom_point(alpha = 0.4) +
  geom_smooth(method = "lm", se = TRUE)
```

Looks like it might be. But this was a long-only strategy that made money in a period where everything went up, so I wouldn’t read too much into this without controlling for that effect.

Is there a significant difference in the entry volatility for winning and losing trades?

```r
# what's the spread of volatility for winning and losing trades?
ggplot(data = df[, c("EntryVol", "Result")], aes(x = Result, y = EntryVol)) +
  geom_boxplot()
```

Finally, we can treat our volatility variable as a “factor” to which our trade returns are exposed. Is this factor useful in predicting trade returns?

First, we’ll need some functions for bucketing our trade results by factor quantile:

```r
# factor functions
get_factor_quantiles <- function(factor_df, n_quantiles = 5, q_type = 7) {
  n_assets <- factor_df %>% ungroup %>% select(Asset) %>% n_distinct()
  factor_df %>%
    mutate(rank = rank(factor, ties.method = 'first'),
           quantile = get_quantiles(factor, n_quantiles, q_type))
}

get_quantiles <- function(factors, n_quantiles, q_type) {
  cut(factors, quantile(factors, seq(0, 1, 1/n_quantiles), type = q_type), FALSE, TRUE)
}
```

If we bucket our results by factor quantile, do any buckets account for significantly more profit and loss? Are there any other interesting relationships?

```r
# if we bucket the vol, do any buckets account for more profit/loss?
factor_df <- df[, c("Asset", "EntryVol", "TradeReturn")]
names(factor_df)[names(factor_df) == "EntryVol"] <- "factor"
quantiles <- get_factor_quantiles(factor_df)

r <- quantiles %>%
  group_by(quantile) %>%
  summarise(MeanTradeReturn = mean(TradeReturn))

ggplot(data = r, aes(quantile, MeanTradeReturn)) +
  geom_col() +
  ggtitle("Returns by volatility quantile")
```

Looks like there might be something to that fifth quantile (but of course beware the small sample size).

We can retrieve the cutoff value for the fifth quantile by sorting our factor and finding the value four-fifths of the way along the resulting vector:

```r
# fifth bin cutoff
sorted <- sort(factor_df$factor)
sorted[as.integer(4/5 * length(sorted))]
# [1] 0.00043
```

There you have it. This was a simple example of exporting potentially relevant data from a Zorro simulation and reading it into a data analysis package for further research.

How might you apply this approach to more serious strategies? What data do you think is potentially relevant? Tell us your thoughts in the comments.

*Want a broader understanding of algorithmic trading? See why it’s the only sustainable approach to profiting from the markets, and how you can use it to succeed, inside the free Algo Basics PDF below.*

The post Can you apply factors to trade performance? appeared first on Robot Wealth.

The post Time is NOT the Enemy: Grow Your Capital by Showing Up appeared first on Robot Wealth.

This assumption can lead us down long and unnecessary rabbit holes and away from the more mundane fundamentals that account for 80% of our day-to-day trading decisions.

When you run a trading business you quickly get to the meat of *practical* market theory – the 80-90% that matters. From our experience, there are **two fundamental concepts** you need to know that are absolutely vital to profiting from the markets.

These concepts are:

- The Time Value of Money
- The Principle of No-Arbitrage

This post is the first in a series of *Quant Basics* where we’ll explore these fundamentals, as well as others. We’ll focus solely on the Time Value of Money today, as it’s the cornerstone of any profitable trading approach, including what we do here at Robot Wealth!

So down tools on those deep neural networks for a second. Let’s look at why this first concept is so important for your success as a retail trader.

This is easily the most fundamental concept in finance — let’s break it down.

Simply put, $100 today is worth more to you than $100 received in a year’s time. *Call me captain obvious.*

If you have $100 today you can do potentially valuable things with it now. You could start a business, or you could invest it in a financial asset like a share of a company. By doing this you expect to get positive returns on your $100 for taking on this risk, so it seems like a good idea that is likely to pay off over the *long-run*.

But what if you don’t have a *long run*?

What if you are going to need that $100 in a year’s time and you want to make sure you preserve that capital, but you also want to put it to work to make more money in the meantime?

In that case, you can lend it to someone who has a slightly longer investment horizon than you. In exchange for them having the use of your money, they will pay you a small amount of (theoretically) guaranteed interest when they give you your money back.

This is what your bank does. When you deposit money in the bank, you’re really *lending* it to the bank. They take various well-diversified risks with that money and they pay interest to their depositors for the use of that money. If they manage things appropriately they will receive a return greater than the interest they pay to depositors – producing a profit.
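To put a number on it, here's a toy sketch of the idea in Python. The 3% rate is purely hypothetical; it just stands in for whatever guaranteed yield your bank offers:

```python
# Hypothetical example: $100 lent at a guaranteed 3% per year,
# compounded annually, vs. left under the mattress.
principal = 100.0
rate = 0.03  # assumed guaranteed yield

for years in (1, 5):
    future_value = principal * (1 + rate) ** years
    print(f"After {years} year(s): ${future_value:.2f}")
# The mattress leaves you with $100.00 either way.
```

After one year your $100 becomes $103.00; after five years, about $115.93. Small numbers, but they set the baseline for everything that follows.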

This simple idea that money has a time value is fundamental to *all of finance*.

What’s in all this for you?

Well, if you have money now you can use it to make more money in the future if you know what to do with it.

*(Stuffing your money under the mattress is a poor choice when someone with ideas and plans is willing to pay you for the use of that money…)*

The Time Value of Money principle gives us a baseline for our risk-taking. If we can get a certain guaranteed yield in the bank then we should only be taking on extra *unguaranteed* risk if we are confident that our expected rewards for taking that risk are greater than the baseline amount we get from the bank.

**In essence, trading is risk-taking.**

As investors, we are rewarded for taking on certain long-term risks, which include buying assets that are sensitive to disappointment in economic growth, inflation and interest rates. Here are some risks associated with various financial products:

So to put our capital to work we get long assets which are exposed to these risks. Getting long lots of them dramatically reduces portfolio volatility through diversification – *which is a good idea!*

**We primarily do this by harvesting risk premia.**

In extremely simple and general terms:

- Buying stocks is a good idea
- Buying bonds is a good idea
- Buying real estate is a good idea
- Selling a little bit of volatility is a good idea.

**By getting those in your portfolio you buy yourself the maximum chance of making money under most conditions. This also adds a portfolio “tailwind” to your active strategies.**

Crucially, this also means that you’ll always be trading one way or another, even when your more active strategies inevitably stop working, which *will* happen.

We incorporate this in our trading at Robot Wealth. Our risk premia strategy has returned around 17% at a CAGR of 26% since going live 9 months ago:

We provide this strategy to our Bootcamp participants, too.

You can read more about why and how we harvest risk premia here.

The good news is that there’s no need to make this approach more complicated than it has to be: **80% of success here is just showing up.**

The precise way you execute the above risk-taking matters a lot less than the fact that *you do it at all*. If you find technical details like volatility and covariance management daunting, know that they make up just 20%.

The 80% is just buying the right assets and keeping hold of them – which is simple to understand and implement, and will put you in a strong position when you get involved in more active, riskier trades.

Harvesting risk premia is a simple way of taking risk to earn above-baseline interest on your capital, especially as a small-time retail trader. It’s the most sensible way to use the time value of money to your advantage, before getting involved in more active strategies. You just need to show up and have a long time horizon.

*That’s Quant Basic number one.*

**In the next Quant Basics post we’ll investigate the pricing efficiency of the markets, why it’s hard to trade given this efficiency, and how you can do it anyway!**

**In the meantime, you can learn more about the fundamentals of algo trading by downloading the free PDF below:**


The post A Quant’s Approach to Drawdown: The Cold Blood Index appeared first on Robot Wealth.

Specifically, they’d:

- do the best job possible of designing and building their trading strategy to be robust to a range of future market conditions
- chill out and let the strategy do its thing, understanding that drawdowns are business-as-usual
- go and look for other opportunities to trade.

*Of course, at some point, you have to retire strategies. Alpha doesn’t persist forever.*

In our own trading, we don’t systematise this decision process. We weigh up the evidence and make discretionary judgements. All things being equal we tend to allow things a lot of space to work out.

However, in this post, we review a systematic approach which can aid this decision making…

In particular, we concentrate on the following question: *“How likely is it that my current live drawdown could have been produced by the backtested strategy?”*

Let’s dive in and explore *The Cold Blood Index!*

Johan Lotter, the brains behind the Zorro development platform, proposed an empirical approach to the problem of reconciling backtest returns with live returns.

Put simply, his approach compares a drawdown experienced in live trading to the backtested equity curve, and he called this approach the **Cold Blood Index (CBI)**.

Apart from sounding like something you’d use to rank your favourite reptiles, we’re going to break down the CBI and find out what it can tell you about your drawdowns in live trading — especially when panic alarms are busy going off in your lizard brain.

You can see Johan’s blog post from 2015 for the original article.

*Let’s break it down….*

Paying homage to its creator, we’ll utilise Zorro to illustrate the CBI in action.

*Don’t worry if any of these details are hard to grasp. Follow the bigger picture and revisit the finer nuances later.*

Say you have been trading a strategy live for \(t\) days and are in a drawdown of length \(l\) and depth \(D\).

You want to know how this compares with the backtest. Most people will want to use this to decide whether their strategy is ‘broken’, but remember that *backtests are far from a 100% accurate representation of future performance. *So be careful how you use this thing.

The CBI is an estimate of the probability of experiencing the current drawdown if the strategy *hasn’t* deviated from its backtest.

- A high CBI indicates that the current drawdown is *not* unexpected, meaning the strategy probably hasn’t deviated from its backtest.
- A low CBI indicates that the system that produced the backtest is very unlikely to have produced the current drawdown, meaning the live strategy *has* deviated from the backtest.

Our *null hypothesis* (default stance) is that the live strategy’s current drawdown could have been produced by the backtested strategy.

The CBI is the *p*-value used to evaluate the statistical test of our null hypothesis.

**More simply, the CBI is the probability that the current drawdown would be equal to (or more extreme than) its observed value if the strategy hadn’t deviated.**

Typically, we don’t have access to the entire population of data points, but only a sample or subset of the total population. So, statistical hypothesis testing is a process that tests claims about the population on the basis of evidence gleaned from the sample we DO have.

Since we don’t have the full population we can never be *totally sure* of any conclusion we draw (just like virtually all things in trading), but we CAN accept or reject the claims on the basis of the strength of the evidence.

The strength of the evidence is encapsulated in the **p-value**. But we also need a statistical test for calculating it.

**In this case, our statistical test is empirical in nature – it is derived directly from the backtest equity curve. **

Deriving an empirical statistical test involves calculating the distribution of the phenomenon we are testing – in this case, the depth of drawdowns of length \(l\) within a live trading period \(t\) – from the sample data (our backtest).

Then, we compare the phenomenon’s observed value (\(D\), the depth of our current drawdown of length \(l\)) with the distribution obtained from the sample data, deriving the *p*-value directly from the observed value’s position on the sample distribution.

Naturally, let’s start with the *worst-case scenario.*

Take the simple case where our trading time, \(t\), is the same as the length of our current drawdown, \(l\).

To calculate the empirical distribution, we simply take a window of length \(l\) and place it at the first period in the backtest balance curve.

That is, the window initially covers the backtest balance curve from the first bar to bar \(l\).

Then, we simply calculate the difference in balance across the window, \(G\), and record it. Then, we slide the window by one period at a time until we reach the end of the balance curve, calculating the change in balance across each window as we go.

At the completion of this process, we have a total of \(M\) values for balance changes across windows of length \(l\) (\(M\) is equal to the length of the backtest minus \(l\) plus one). Of these \(M\) values, \(N\) will show a greater drawdown than our current drawdown, \(D\). Then, the CBI, here denoted \(P\), is simply \[P = \frac{N}{M}\]
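Before looking at the Zorro implementation, the sliding-window calculation above can be sketched in a few lines of Python. The balance numbers and the function name `cold_blood_index` are made up for illustration:

```python
# Toy sketch of the special-case CBI: the strategy has been underwater
# since going live, so trading time t equals drawdown length l.
def cold_blood_index(balances, trade_days, drawdown):
    """P = N/M: the fraction of length-l windows in the backtest balance
    curve whose balance change is at least as bad as the live drawdown."""
    M = len(balances) - trade_days + 1
    N = sum(
        balances[i + trade_days - 1] - balances[i] <= -drawdown
        for i in range(M)
    )
    return N / M

# hypothetical backtest balance curve with one embedded slump
balances = [0, 5, 3, 8, 6, 2, 1, 4, 9, 12, 10, 15]
print(cold_blood_index(balances, trade_days=3, drawdown=4))  # prints 0.2
```

Here 2 of the 10 three-day windows lost at least 4 units of balance, so the CBI is 0.2: a drawdown of that depth was fairly routine in the backtest.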

Here’s the code for calculating the CBI and plotting the empirical distribution from the backtest, for this special case where the strategy is underwater from the first day of live trading:

```c
/*
  Cold Blood Index
  Special case of drawdown length equal to trade time
  That is, strategy underwater since inception
*/

int TradeDays = 60; // Days since live start and in drawdown
var DrawDown = 20;  // Current drawdown depth in account currency

// backtest balance curve saved by Zorro
string BalanceFile = "Log\\simple_portfolio.dbl";

void Histogram(string Name, var Value, var Step, int Color)
/* plots a histogram given a value and bin width */
{
	var Bucket = floor(Value/Step);
	plotBar(Name, Bucket, Step*Bucket, 1, SUM+BARS+LBL2, Color);
}

void main()
{
	var HistStep = 10; // bin width of histogram
	plotBar("Live Drawdown", DrawDown/HistStep, DrawDown, 80, BARS|LBL2, BLACK); // mark current drawdown in histogram

	// import balance curve
	int CurveLength = file_length(BalanceFile)/sizeof(var);
	var *Balances = file_content(BalanceFile);

	// get number of samples
	int M = CurveLength - TradeDays + 1;

	// sliding window calculations
	var GMin = 0, N = 0; // define N as a var to prevent integer truncation in calculation of P
	int i;
	for(i = 0; i < M; i++)
	{
		var G = Balances[i+TradeDays-1] - Balances[i];
		if(G <= -DrawDown) N += 1.;
		if(G < GMin) GMin = G;
		Histogram("G", G, HistStep, RED);
	}
	var P = N/M;

	printf("\nTest period: %i days", CurveLength);
	printf("\nWorst test drawdown: %.f", -GMin);
	printf("\nSamples: %i\nSamples worse than observed: %i", M, (int)N);
	printf("\nCold Blood Index: %.1f%%", 100*P);
}
```

To use this script, you first need to save the profit and loss time series data from a backtest (also, the backtest will need to use the same money management approach as used in live trading).

Do that by setting Zorro’s `LOGFILE` and `BALANCE` flags, which automatically save the backtest’s balance curve in Zorro’s Log folder.

Having saved the balance curve, make sure the string `BalanceFile` in the CBI script above is set correctly (line 11).

Here’s an example. Say we had been trading our strategy live and we were concerned about the performance of the *EUR/USD: rsi* component. It’s been trading for 60 days now, and that component is showing a drawdown of $20.

Plugging those values into the script above and loading that component’s backtested balance curve gives the following histogram:

The script also outputs some pertinent information to the Zorro GUI window. Firstly, that of 1,727 sample windows, 73 were worse than our observed drawdown. The CBI is then calculated as \(73/1,727 \approx 0.04\), which falls below some individuals’ threshold confidence level of 0.05 (remember, a smaller CBI provides stronger evidence of a “broken” strategy). But this threshold is somewhat arbitrary.

We can also run the CBI script with various values of `TradeDays` and `DrawDown` to get an idea of what sort of drawdown would induce changes in the p-value.

The implementation of CBI above is for the special case where the strategy has been experiencing a drawdown since the first day of live trading.

Of course, this (hopefully) won’t always be the case!

**For drawdowns that come after some new equity high, calculation of the empirical distribution is a little trickier.**

Why?

Because we now have to consider the total trading time \(t\) as well as the drawdown time \(l\).

Reproducing this distribution of drawdowns faithfully would require traversing the balance curve using nested windows: an outer window of length \(t\) and an inner window of length \(l\) traversing the outer window, period by period, at every step of the outer window’s journey across the curve.

That is, for a backtest of length \(y\), we now have \((y-t+1) \times (t-l+1)\) windows to evaluate.

Rather than perform that cumbersome operation, we can apply the same single rolling window process that was used in the simple case, combined with some combinatorial probability, to arrive at a formula for the CBI, which we denote \(P\), in terms of the previously defined parameters \(M\) and \(N\), plus \(T = t - l + 1\), the number of drawdown windows within the live trading period:

\[P = 1 - \frac{(M-N)!\,(M-T)!}{M!\,(M-N-T)!}\]

The obvious problem with this equation is that it potentially requires evaluation of factorials on the order of \((10^3)!\), which is approximately \(10^{2500}\), far exceeding the maximum range of `var` type variables.

To get around that inconvenience, we can take advantage of the relationships \[\ln(1000 \cdot 999 \cdot 998 \cdots 1) = \ln(1000) + \ln(999) + \ln(998) + \dots + \ln(1)\] and \[e^{\ln(x)} = x\] to rewrite our combinatorial equation thus:

\[P = 1 - e^{x}\] where \[x = \ln\left(\frac{(M-N)!\,(M-T)!}{M!\,(M-N-T)!}\right) = \ln((M-N)!) + \ln((M-T)!) - \ln(M!) - \ln((M-N-T)!)\]

Now we can deal with those factorials using a function that recursively sums the logarithms of their constituent integers, which is much more tractable.
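As an aside, most languages ship this log-factorial trick off the shelf. In Python, for instance, `math.lgamma(n + 1)` returns \(\ln(n!)\), so the general-case formula can be sketched directly (the function name `cbi_general` is mine, not Zorro's):

```python
# General-case CBI computed in log space via the log-gamma function,
# so huge factorials never need to be evaluated directly.
from math import exp, lgamma

def cbi_general(M, N, T):
    """P = 1 - (M-N)! (M-T)! / (M! (M-N-T)!), using lgamma(n+1) = ln(n!)."""
    if T <= 1:  # drawdown spans the whole live period: special case P = N/M
        return N / M
    x = (lgamma(M - N + 1) + lgamma(M - T + 1)
         - lgamma(M + 1) - lgamma(M - N - T + 1))
    return 1.0 - exp(x)

print(round(cbi_general(5, 1, 2), 6))  # prints 0.4
```

For \(M=5, N=1, T=2\), the formula gives \(1 - (4! \cdot 3!)/(5! \cdot 2!) = 0.4\), which the log-space version reproduces.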

Here’s a script that verifies the equivalence of the two approaches for small integers, as well as its output:

```c
var logsum(int n)
{
	if(n <= 1) return 0;
	else return log(n) + logsum(n-1);
}

int factorial(int n)
{
	if(n <= 1) return 1;
	if(n >= 10)
	{
		printf("%d is too big", n);
		return 0;
	}
	return n*factorial(n-1);
}

void main()
{
	int M = 5;
	int N = 3;
	printf("\nEvaluate x = (%d-%d)!/%d! using\nfactorial and log transforms", M, N, M);
	printf("\nBy evaluating factorials directly,\nx = %f", (var)factorial(M-N)/factorial(M));
	printf("\nBy evaluating sum of logs,\nx = %f", exp(logsum(M-N) - logsum(M)));
}

/* OUTPUT:
Evaluate x = (5-3)!/5! using
factorial and log transforms
By evaluating factorials directly,
x = 0.016667
By evaluating sum of logs,
x = 0.016667
*/
```

Here’s the script for the general case of the CBI (courtesy Johan Lotter):

```c
/*
  Cold Blood Index
  General case of the CBI where DrawDownDays != TradeDays
*/

int TradeDays = 100;   // Days since live start
int DrawDownDays = 60; // Length of drawdown
var DrawDown = 20;     // Current drawdown depth in account currency
string BalanceFile = "Log\\simple_portfolio.dbl";

var logsum(int n)
{
	if(n <= 1) return 0;
	else return log(n) + logsum(n-1);
}

void main()
{
	// import balance curve
	int CurveLength = file_length(BalanceFile)/sizeof(var);
	var *Balances = file_content(BalanceFile);

	// calculate parameters and check sufficient length
	int M = CurveLength - DrawDownDays + 1;
	int T = TradeDays - DrawDownDays + 1;
	if(T < 1 || M <= T)
	{
		printf("Not enough samples!");
		return;
	}

	// sliding window calculations
	var GMin = 0, N = 0; // define N as a var to prevent integer truncation in calculation of P
	int i = 0;
	for(; i < M; i++)
	{
		var G = Balances[i+DrawDownDays-1] - Balances[i];
		if(G <= -DrawDown) N += 1.;
		if(G < GMin) GMin = G;
	}

	var P;
	if(TradeDays > DrawDownDays)
		P = 1. - exp(logsum(M-N)+logsum(M-T)-logsum(M)-logsum(M-N-T));
	else
		P = N/M;

	printf("\nTest period: %i days", CurveLength);
	printf("\nWorst test drawdown: %.f", -GMin);
	printf("\nM: %i N: %i T: %i", M, (int)N, T);
	printf("\nCold Blood Index: %.1f%%", 100*P);
}
```

Using the same drawdown length and depth as in the simple case, but now having traded for a total of 100 days, our new CBI value is 83%, which provides *no evidence* to suggest that the component has deteriorated.

For comparing drawdowns in live trading to those in a backtest, the CBI is useful. **But we can do better by incorporating statistical resampling techniques.**

Drawdown is a function of the *sequence* of winning and losing trades. However, a backtest represents just one realization of the numerous possible winning and losing sequences that could arise from a trading system with certain returns characteristics.

As such, the CBI presented above considers just one of many possible returns sequences that could arise from a given trading system.

So, we can make the CBI more robust by incorporating the algorithm into a Monte Carlo routine, such that many unique balance curves are created by randomly sampling the backtested trade results, and running the CBI algorithm separately on each curve.
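The idea can be sketched outside Zorro too. Here's a minimal Python version, assuming independent per-period balance changes and using a toy curve in place of a real backtest (both the curve and the function names are hypothetical):

```python
# Bootstrap sketch of the resampled CBI: resample per-period balance
# changes, rebuild synthetic curves, and compute the CBI on each.
import random
import statistics

def cbi(balances, trade_days, drawdown):
    # special-case CBI: fraction of windows at least as bad as -drawdown
    M = len(balances) - trade_days + 1
    return sum(
        balances[i + trade_days - 1] - balances[i] <= -drawdown
        for i in range(M)
    ) / M

def resampled_cbi(balances, trade_days, drawdown, n_curves=1000, seed=1):
    rng = random.Random(seed)
    changes = [b - a for a, b in zip(balances, balances[1:])]
    cbis = []
    for _ in range(n_curves):
        curve = [0.0]
        for _ in changes:  # synthetic curve of the same length
            curve.append(curve[-1] + rng.choice(changes))
        cbis.append(cbi(curve, trade_days, drawdown))
    return statistics.median(cbis)

# toy balance curve standing in for a backtest
toy = [0, 2, 1, 4, 3, 6, 8, 7, 10, 12]
med = resampled_cbi(toy, trade_days=3, drawdown=2)
print(0.0 <= med <= 1.0)  # prints True
```

In practice you'd also report percentiles of the resampled CBI values (as the Zorro script below does) rather than the median alone.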

The code for this Resampled Cold Blood Index is shown below, including calculation of the 5th, 50th and 95th percentiles of the resampled CBI values.

```c
/*
  Resampled Cold Blood Index
  General case of the Resampled CBI where DrawDownDays != TradeDays
*/

int TradeDays = 100;   // Days since live start
int DrawDownDays = 60; // Length of drawdown
var DrawDown = 20;     // Current drawdown depth in account currency
string BalanceFile = "Log\\simple_portfolio.dbl";

var logsum(int n)
{
	if(n <= 1) return 0;
	else return log(n) + logsum(n-1);
}

void main()
{
	// import balance curve
	int CurveLength = file_length(BalanceFile)/sizeof(var);
	printf("\nCurve Length: %d", CurveLength);
	var *Balances = file_content(BalanceFile);

	var P_array[5000];
	int k;
	for(k = 0; k < 5000; k++)
	{
		// create a resampled balance curve by bootstrapping the original
		var randomBalances[2000];
		randomize(BOOTSTRAP, randomBalances, Balances, CurveLength);

		int M = CurveLength - DrawDownDays + 1;
		int T = TradeDays - DrawDownDays + 1;
		if(T < 1 || M <= T)
		{
			printf("Not enough samples!");
			return;
		}

		// sliding window calculations on the resampled curve
		var GMin = 0., N = 0.;
		int i = 0;
		for(; i < M; i++)
		{
			var G = randomBalances[i+DrawDownDays-1] - randomBalances[i];
			if(G <= -DrawDown) N += 1.;
			if(G < GMin) GMin = G;
		}

		var P;
		if(TradeDays > DrawDownDays)
			P = 1. - exp(logsum(M-N)+logsum(M-T)-logsum(M)-logsum(M-N-T));
		else
			P = N/M;
		P_array[k] = P;
	}

	var fifth_perc = Percentile(P_array, k, 5);
	var med = Percentile(P_array, k, 50);
	var ninetyfifth_perc = Percentile(P_array, k, 95);
	printf("\n5th percentile CBI: %.1f%%", 100*fifth_perc);
	printf("\nMedian CBI: %.1f%%", 100*med);
	printf("\n95th percentile CBI: %.1f%%", 100*ninetyfifth_perc);
}
```

Using the same drawdown length, drawdown depth and trade time as we evaluated in the single-balance curve example, we now find that our median resampled CBI is around 98%.

It turns out that the value obtained by only evaluating the backtest balance curve was closer to the 5th percentile (that is, the lower limit) of resampled values.

While this is not significant in this example (regardless of the method used, it is clear that the component has not deteriorated) this could represent valuable information if things were more extreme.

The usefulness of the resampled CBI declines for increasing backtest length, but it is easily implemented, comes at little additional compute time (thanks in part to Lite-C’s blistering speed), and provides additional insight into strategy deterioration by considering the random nature of individual trade results.

Note however that this method would break down in the case of a strategy that exhibited significant serially correlated returns, since resampling the backtest balance curve would destroy those relationships.

So, should you use the CBI to decide when to switch off a strategy? As interesting as this approach is, probably not…

In nearly all cases **this is giving the right answer to the wrong question:**

“Should I pull out of this strategy because its live performance looks different to the backtest?”

To labour the points we made in the first post in this series, experienced traders know that it’s a bad idea to set performance expectations based on a backtest. Manageable deviations from that backtest performance usually won’t trigger any alarm bells that inspire interference with the strategy.

Instead, we make tea and look for other trades.

To succeed in trading you have to realise and accept the randomness and efficiency of the markets, part of which means sitting through rather uncomfortable drawdowns which won’t show up in R&D. Being disappointed by live performance versus your exciting backtest is very much the norm. You just have to take it on the chin and trust the soundness of your development process. The markets are too efficient and chaotic to care about meeting our expectations.

*We call this “Embracing the Mayhem”.*

*Embracing the Mayhem *is just one of the 7 Trading Fundamentals we teach inside our Bootcamps. These Fundamentals show the approach we use to trade successfully with Robot Wealth, which is normally only learned after years of expensive trial, error and frustration.

**But you can skip all that — you can get access to a bunch of these Fundamental videos for free by entering your email below:**

