Navigating Tradeoffs with Convex Optimisation
This is the final article in our recent stat arb series. The previous articles are linked below:
 A short take on stat arb trading in the real world
 A general approach for exploiting stat arb alphas
 Ideas for crypto stat arb features
 Quantifying and combining crypto alphas
 A simple and effective way to manage turnover and not get killed by costs
 How to model features as expected returns
 Building intuition for trading with convex optimisation with CVXR
So far in this series, we have:
 Introduced a general approach for doing stat arb
 Brainstormed some ideas for crypto stat arb alphas
 Explored how we might quantify and combine those alphas
 Introduced the notrade buffer: a heuristic approach to navigating the tradeoff between uncertain alpha and certain costs
 Described how to model features as expected returns – a prerequisite for using an optimisationbased approach
 Provided a ton of examples of different optimisation problems and how they work in CVXR
In the previous article, we built some intuition for what an optimiser does under the hood. In this article, we’ll use the optimiser to navigate the tradeoffs between costs, alpha and risk. Essentially we’re swapping out the simpler heuristic approach from earlier in the series for an optimisationbased approach.
There are advantages and disadvantages to doing this:
 The heuristic approach is simple and intuitive and leads to a predictable, mechanical set of rules for managing your positions. The optimiser is a bit more black box.
 The optimisation approach accommodates realworld constraints directly.
 It also enables incorporation of risk models such as covariance estimates, ValueatRisk, etc. In the heuristic approach, you’d have to incorporate this upstream of your rebalancing somehow.
 The optimisationbased approach is very flexible and scalable: you can add new signals or risk estimates without refitting anything.
You certainly don’t need to add the complexity of an optimisationbased approach. It’s not a prerequisite to making money. But if you have the resources to implement it (time, skills, etc) it can boost performance and allow you to scale fairly quickly by adding new signals or risk models as you develop them.
In this article, I’ll show you how to simulate trading with convex optimisation.
First, load our libraries and the dataset we’ve been using throughout the series:
# session options
options(repr.plot.width = 14, repr.plot.height=7, warn = 1)
library(tidyverse)
library(tidyfit)
library(glue)
library(CVXR)
library(tibbletime)
library(roll)
library(patchwork)
pacman::p_load_current_gh("RobotWealth/rsims", dependencies = TRUE)
# chart options
theme_set(theme_bw())
theme_update(text = element_text(size = 20))
# data
perps < read_csv("https://github.com/RobotWealth/rquantrecipes/raw/master/quantifyingcombiningalphas/binance_perp_daily.csv")
head(perps)
Rows: 187251 Columns: 11 ── Column specification ──────────────────────────────────────────────────────── Delimiter: "," chr (1): ticker dbl (9): open, high, low, close, dollar_volume, num_trades, taker_buy_volum... date (1): date ℹ Use `spec()` to retrieve the full column specification for this data. ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
ticker  date  open  high  low  close  dollar_volume  num_trades  taker_buy_volume  taker_buy_quote_volumne  funding_rate 

<chr>  <date>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl> 
BTCUSDT  20190911  10172.13  10293.11  9884.31  9991.84  85955369  10928  5169.153  52110075  3e04 
BTCUSDT  20190912  9992.18  10365.15  9934.11  10326.58  157223498  19384  11822.980  119810012  3e04 
BTCUSDT  20190913  10327.25  10450.13  10239.42  10296.57  189055129  25370  9198.551  94983470  3e04 
BTCUSDT  20190914  10294.81  10396.40  10153.51  10358.00  206031349  31494  9761.462  100482121  3e04 
BTCUSDT  20190915  10355.61  10419.97  10024.81  10306.37  211326874  27512  7418.716  76577710  3e04 
BTCUSDT  20190916  10306.79  10353.81  10115.00  10120.07  208211376  29030  7564.376  77673986  3e04 
For the purposes of this example, we’ll create the same crypto universe that we used last time – the top 30 Binance perpetual futures contracts by trailing 30day dollarvolume, with stables and wrapped tokens removed.
We’ll also calculate returns at this step for later use.
# get same universe as before  top 30 by rolling 30day dollar volume, no stables
# remove stablecoins
# list of stablecoins from defi llama
url < "https://stablecoins.llama.fi/stablecoins?includePrices=true"
response < httr::GET(url)
stables < response %>%
httr::content(as = "text", encoding = "UTF8") %>%
jsonlite::fromJSON(flatten = TRUE) %>%
pluck("peggedAssets") %>%
pull(symbol)
# sort(stables)
perps < perps %>%
filter(!ticker %in% glue::glue("{stables}USDT"))
# get the top 30 by trailing 30day volume
# exclude from universe until we have enough data to do a covariance estimate (see later)
trading_universe_size < 30
ew_cov_init_wdw < 30
universe < perps %>%
group_by(ticker) %>%
# also calculate returns for later
mutate(
total_return_simple = funding_rate + (close  lag(close, 1))/lag(close, 1),
total_return_log = log(1 + total_return_simple),
total_fwd_return_simple = dplyr::lead(funding_rate, 1) + (dplyr::lead(close, 1)  close)/close,
total_fwd_return_log = log(1 + total_fwd_return_simple),
trail_volume = roll_mean(dollar_volume, 30)
) %>%
mutate(days_since_listing = row_number()) %>% # the number of days we have data for each ticker
na.omit() %>%
ungroup() %>%
# exclude from volume ranking  won't matter if we use a rolling volume calculated with more than the covariance initialisation window as will drop these rows using na.omit
mutate(to_exclude = days_since_listing < ew_cov_init_wdw) %>%
group_by(date) %>%
mutate(
volume_rank = if_else(to_exclude, NA_integer_, row_number(trail_volume)),
is_universe = volume_rank <= trading_universe_size,
)
universe %>%
group_by(date, is_universe) %>%
summarize(count = n(), .groups = "drop") %>%
ggplot(aes(x=date, y=count, color = is_universe)) +
geom_line() +
labs(
title = 'Universe size'
)
Calculate features
Next, calculate our features as before. We have:
 Shortterm (10day) crosssectional momentum (bucketed into deciles by date)
 Shortterm (1day) crosssectional carry (also bucketed into deciles by date)
 A breakout feature defined as the number of days since the 20day high which we use as a timeseries return predictor.
# calculate features as before
rolling_days_since_high_20 < purrr::possibly(
tibbletime::rollify(
function(x) {
idx_of_high < which.max(x)
days_since_high < length(x)  idx_of_high
days_since_high
},
window = 20, na_value = NA),
otherwise = NA
)
features < universe %>%
group_by(ticker) %>%
arrange(date) %>%
mutate(
breakout = 9.5  rolling_days_since_high_20(close), # puts this feature on a scale 9.5 to +9.5
momo = close  lag(close, 10)/close,
carry = funding_rate
) %>%
ungroup() %>%
na.omit()
# create a model df on our universe with momo and carry features scaled into deciles
model_df < features %>%
filter(is_universe) %>%
group_by(date) %>%
mutate(
carry_decile = ntile(carry, 10),
momo_decile = ntile(momo, 10),
# also calculate demeaned return for everything in our universe each day for later
demeaned_return = total_return_simple  mean(total_return_simple, na.rm = TRUE),
demeaned_fwd_return = total_fwd_return_simple  mean(total_fwd_return_simple, na.rm = TRUE)
) %>%
ungroup()
# start simulation from date we first have n tickers in the universe
start_date < features %>%
group_by(date, is_universe) %>%
summarize(count = n(), .groups = "drop") %>%
filter(count >= trading_universe_size) %>%
head(1) %>%
pull(date)
Model expected returns
We’ll model our features as expected returns. See here for a detailed exploration of this topic.
# use a 90day window and refit every 10 days
is_days < 90
step_size < trading_universe_size*10
# rolling model for crosssectional features
roll_xs_coeffs_df < model_df %>%
filter(date >= start_date) %>%
regress(
demeaned_fwd_return ~ carry_decile + momo_decile,
m("lm", vcov. = "HAC"),
.cv = "sliding_index",
.cv_args = list(lookback = days(is_days), step = step_size, index = "date"),
.force_cv = TRUE,
.return_slices = TRUE
)
# rolling model for time series features
breakout_cutoff < 5.5 # below this level, we set our expected return to zero
roll_ts_coeffs_df < model_df %>%
filter(date >= start_date) %>%
# setting regression weights to zero when breakout < breakout_cutoff will give these data points zero weight in estimating coefficients
mutate(regression_weights = case_when(breakout < breakout_cutoff ~ 0, TRUE ~ 1)) %>%
regress(
total_fwd_return_simple ~ breakout,
m("lm", vcov. = "HAC"),
.weights = "regression_weights",
.cv = "sliding_index",
.cv_args = list(lookback = days(is_days), step = step_size, index = "date"),
.force_cv = TRUE,
.return_slices = TRUE
)
This results in a nested dataframe that contains the model objects and various metadata:
roll_xs_coeffs_df %>% head
roll_ts_coeffs_df %>% select(settings) %>% head
model  estimator_fct  size (MB)  grid_id  model_object  settings  slice_id 

<chr>  <chr>  <dbl>  <chr>  <list>  <list>  <chr> 
lm  stats::lm  1.360960  #0010000  <environment: 0x000002c9993fbe90>  HAC  20210211 
lm  stats::lm  1.367016  #0010000  <environment: 0x000002c9994915b8>  HAC  20210221 
lm  stats::lm  1.380032  #0010000  <environment: 0x000002c9997f4918>  HAC  20210303 
lm  stats::lm  1.384864  #0010000  <environment: 0x000002c999c1dc98>  HAC  20210313 
lm  stats::lm  1.384808  #0010000  <environment: 0x000002c999f2ada0>  HAC  20210323 
lm  stats::lm  1.384944  #0010000  <environment: 0x000002c98f421ae8>  HAC  20210402 
model  estimator_fct  size (MB)  grid_id  model_object  slice_id 

<chr>  <chr>  <dbl>  <chr>  <list>  <chr> 
lm  stats::lm  1.245552  #0010000  <environment: 0x000002c99633fe90>  20210211 
lm  stats::lm  1.257864  #0010000  <environment: 0x000002c995893588>  20210221 
lm  stats::lm  1.254896  #0010000  <environment: 0x000002c989f4bd30>  20210303 
lm  stats::lm  1.255688  #0010000  <environment: 0x000002c9891edc98>  20210313 
lm  stats::lm  1.260624  #0010000  <environment: 0x000002c988f475b8>  20210323 
lm  stats::lm  1.257784  #0010000  <environment: 0x000002c987e70f28>  20210402 
slice_id
is the date the model goes out of sample – so we’ll need to make sure that we align our model coefficients to avoid using them on the data they were fitted on.
This requires a little data wrangling:
# for this to work, need to install.packages("sandwich", "lmtest")
xs_coefs < roll_xs_coeffs_df %>%
coef()
xs_coefs_df < xs_coefs %>%
ungroup() %>%
select(term, estimate, slice_id) %>%
pivot_wider(id_cols = slice_id, names_from = term, values_from = estimate) %>%
mutate(slice_id = as_date(slice_id)) %>%
# need to lag slice id to make it oos
# slice_id_oos is the date we start using the parameters
mutate(slice_id_oos = lead(slice_id)) %>%
rename("xs_intercept" = `(Intercept)`)
ts_coefs < roll_ts_coeffs_df %>%
coef()
ts_coefs_df < ts_coefs %>%
ungroup() %>%
select(term, estimate, slice_id) %>%
pivot_wider(id_cols = slice_id, names_from = term, values_from = estimate) %>%
mutate(slice_id = as_date(slice_id)) %>%
# need to lag slice id to make it oos
# slice_id_oos is the date we start using the parameters
mutate(slice_id_oos = lead(slice_id)) %>%
rename("ts_intercept" = `(Intercept)`)
xs_coefs_df %>% head
# ts_coefs_df %>% head
slice_id  xs_intercept  carry_decile  momo_decile  slice_id_oos 

<date>  <dbl>  <dbl>  <dbl>  <date> 
20210211  0.004872208  0.001732322  0.0008239738  20210221 
20210221  0.006129608  0.001902553  0.0007640190  20210303 
20210303  0.009538180  0.002148154  0.0003897220  20210313 
20210313  0.008374302  0.001953655  0.0004141515  20210323 
20210323  0.008275046  0.002211137  0.0006898811  20210402 
20210402  0.008613423  0.002413307  0.0008298451  20210412 
Here’s a plot of our crosssectional features’ regression coefficients through time:
# plot crosssectional estimates
xs_coefs_df %>%
select(slice_id) %>%
pivot_longer(cols = slice_id_oos, names_to = "coefficient", values_to = "estimate") %>%
ggplot(aes(x = slice_id_oos, y = estimate)) +
geom_line() +
facet_wrap(~coefficient, ncol = 1, scales = "free_y") +
labs(
title = "Crosssectional features model parameters",
x = "Date",
y = "Estimate"
)
# plot timeseries estimates
# ts_coefs_df %>%
# select(slice_id) %>%
# pivot_longer(cols = slice_id_oos, names_to = "coefficient", values_to = "estimate") %>%
# ggplot(aes(x = slice_id_oos, y = estimate)) +
# geom_line() +
# facet_wrap(~coefficient, ncol = 1, scales = "free_y") +
# labs(
# title = "Timeseries features model parameters",
# x = "Date",
# y = "Estimate"
# )
The estimates for the coefficients for our carry and momentum features change over time to reflect the changing relationship with forward returns.
In particular, notice how the momentum coefficient flipped sign a few times, but especially from mid2022, which is in line with our understanding of how the feature evolved.
Plot frictionless returns to our features
Now we can plot a time series of returns to a frictionless trading strategy based on these expected return estimates. This isn’t a backtest – it makes no attempt to address realworld issues such as costs and turnover. It simply plots the returns to our predictions of expected returns over time.
I won’t actually use the linear model of the breakout feature – instead I’ll just set its expected return to 0.002 when it’s greater than 5 and 0 otherwise.
I’ll calculate target positions proportional to their crosssectional return estimates. I’ll then let the breakout feature tilt the portfolio net long, but I’ll constrain the maximum delta that this feature can add to a position.
# join and fill using slice_id to designate when the model goes oos
exp_return_df < model_df %>%
left_join(
xs_coefs_df %>% left_join(ts_coefs_df, by = c("slice_id", "slice_id_oos")),
by = join_by(closest(date > slice_id_oos)), suffix = c("_factor", "_coef")
) %>%
na.omit() %>%
# forecast crosssectional expected return as
mutate(expected_xs_return = carry_decile_factor*carry_decile_coef + momo_decile_factor*momo_decile_coef + xs_intercept ) %>%
# mean expected xs return each day is zero
# let total expected return be xs return + ts return  allows time series expected return to tilt weights
mutate(expected_ts_return = case_when(breakout_factor >= 5.5 ~ 0.002, TRUE ~ 0)) %>%
ungroup()
# longshort the xs expected return
# layer ts expected return on top
# position by expected return
# 1 in the numerator lets it get max 100% long due to breakout
max_ts_pos < 0.5/trading_universe_size
strategy_df < exp_return_df %>%
filter(date >= start_date) %>%
group_by(date) %>%
mutate(xs_position = expected_xs_return  mean(expected_xs_return, na.rm = TRUE)) %>%
# scale positions so that leverage is 1
group_by(date) %>%
mutate(xs_position = if_else(xs_position == 0, 0, xs_position/sum(abs(xs_position)))) %>%
# layer ts expected return prediction
ungroup() %>%
mutate(ts_position = sign(expected_ts_return)) %>%
# constrain maximum delta added by time series prediction
mutate(ts_position = if_else(ts_position >= 0, pmin(ts_position, max_ts_pos), pmax(ts_position, max_ts_pos))) %>%
mutate(position = xs_position + ts_position) %>%
# strategy return
mutate(strat_return = position*total_fwd_return_simple) %>%
# scale back to leverage 1
group_by(date) %>%
mutate(position = if_else(position == 0, 0, position/sum(abs(position))))
returns_plot < strategy_df %>%
group_by(date) %>%
summarise(total_ret = sum(strat_return)) %>%
ggplot(aes(x = date, y = cumsum(log(1+total_ret)))) +
geom_line() +
labs(
title = "Cumulative strategy return",
y = "Cumulative return"
)
weights_plot < strategy_df %>%
summarise(net_pos = sum(position)) %>%
ggplot(aes(x = date, y = net_pos)) +
geom_line() +
labs(
x = "Date",
y = "Net Weight"
)
returns_plot / weights_plot + plot_layout(heights = c(2,1))
Setting up the optimisation framework
There are a few things we need for our optimisation framework:
 A dataframe of prices and expected returns for all the tickers that were ever in the universe
 A risk model – we’ll use a shrunk exponentially weighted covariance matrix
 We’ll need a covariance matrix for each day of our simulation, so we’ll compute these prior to running the simulation and store them in a list
 We’ll also need a function for doing the optimisation at each time step and returning the weights for the next period
 Finally, we’ll need to figure out appropriate values for lambda and tau (the parameters that control our risk and propensity to trade). We’ll do this by testing out a few values on a sample of our data. We’ll do this using parallel processing to speed things up a bit.
Then, we can do our actual simulation.
Prices and expected returns dataframes
First, we make a dataframe of prices and expected returns for all the tickers that were ever in the tradeable universe:
# tickers that were ever in the universe
universe_tickers < features %>%
filter(is_universe) %>%
pull(ticker) %>%
unique()
# universe_tickers
strategy_df < exp_return_df %>%
filter(date >= start_date) %>%
group_by(date) %>%
mutate(expected_xs_return = expected_xs_return  mean(expected_xs_return, na.rm = TRUE)) %>%
ungroup() %>%
mutate(expected_return = expected_xs_return + expected_ts_return) %>%
select(date, ticker, expected_return) %>%
# join back onto df of prices for all tickers that were ever in the universe
# so that we have prices before and after a ticker comes into or out of the universe
# for backtesting purposes
right_join(
features %>%
filter(ticker %in% universe_tickers) %>%
select(date, ticker, close, total_fwd_return_simple, funding_rate),
by = c("date", "ticker")
) %>%
arrange(date, ticker) %>%
filter(date >= start_date)
# get same length exp returns vector for each date
exp_returns_wide < strategy_df %>%
select(ticker, date, expected_return) %>%
# will give NA where a ticker didn't exist
pivot_wider(id_cols = date, names_from = ticker, values_from = expected_return)
Risk model
Next, we need a risk model for each day in our simulation.
We’ll use an exponentially weighted covariance estimate – this puts more weight on recent data and less on the past and tends to remove the big jumps you get using a fixed window approach. See here for more details on this approach.
We’ll also shrink our covariance matrix:
 Offdiagonal elements (covariances) will be shrunk towards zero to reflect our uncertainty.
 Ondiagonal elements (variances) will be shrunk towards the average variance.
First, make our exponentially weighted covariance estimates:
# EWMA covariance estimate
# note definition of lambda in line with Risk Metrics
# ie higher values of lambda put less weight on the most recent returns and more weight on historical returns.
ewma_cov < function(x, y, lambda, initialisation_wdw = 100) {
# check that x and y are the same length and greater than initialisation_wdw
stopifnot("x and y must be of equal length" = length(x) == length(y))
if(length(x) <= initialisation_wdw) {
ewma_cov < rep(NA, length(x))
return(ewma_cov)
}
# create initialisation window and estimation window
init_x = x[1:initialisation_wdw]
init_y = y[1:initialisation_wdw]
num_obs < length(x)
# initial covariance and mean return estimates
old_cov < cov(init_x, init_y)
old_x < mean(init_x)
old_y < mean(init_y)
# preallocate output vector
ewma_cov < vector(mode = "numeric", length = num_obs)
# pad with NA for initialisation window
ewma_cov[1:initialisation_wdw] < NA
# covariance estimate
for(i in c((initialisation_wdw+1):num_obs)) {
ewma_cov[i] < lambda*old_cov + (1  lambda)*(old_x * old_y)
old_cov < ewma_cov[i]
old_x < x[i]
old_y < y[i]
}
ewma_cov
}
cov_lambda < 0.995
wdw < ew_cov_init_wdw
returns < features %>%
filter(ticker %in% universe_tickers) %>%
select(ticker, date, total_return_simple)
# long dataframe of pairwise EW covariances
ewma_covs < returns %>%
full_join(returns, by = "date") %>%
na.omit() %>%
ungroup() %>%
# get all combinations (tickers) and remove duplicate combos (eg BTCETH, ETHBTC)
mutate(tickers = ifelse(ticker.x < ticker.y, glue("{ticker.x}, {ticker.y}"), glue("{ticker.y}, {ticker.x}"))) %>%
distinct(date, tickers, .keep_all = TRUE) %>%
# calculate rolling pairwise ewma correlations
group_by(tickers) %>%
arrange(date, .by_group = TRUE) %>%
mutate(ewma_cov = ewma_cov(total_return_simple.x, total_return_simple.y, lambda = cov_lambda, initialisation_wdw = ew_cov_init_wdw)) %>%
select(date, tickers, ewma_cov, ticker.x, ticker.y) %>%
na.omit() %>%
ungroup()
tail(ewma_covs)
date  tickers  ewma_cov  ticker.x  ticker.y 

<date>  <chr>  <dbl>  <chr>  <chr> 
20240207  ZRXUSDT, ZRXUSDT  0.003337849  ZRXUSDT  ZRXUSDT 
20240208  ZRXUSDT, ZRXUSDT  0.003321164  ZRXUSDT  ZRXUSDT 
20240209  ZRXUSDT, ZRXUSDT  0.003306101  ZRXUSDT  ZRXUSDT 
20240210  ZRXUSDT, ZRXUSDT  0.003290074  ZRXUSDT  ZRXUSDT 
20240211  ZRXUSDT, ZRXUSDT  0.003274115  ZRXUSDT  ZRXUSDT 
20240212  ZRXUSDT, ZRXUSDT  0.003259633  ZRXUSDT  ZRXUSDT 
Here’s a plot of some covariance estimates for BTC and various coins:
# plot pairwise rolling covariances
ewma_covs %>%
filter(str_starts(tickers, "BTCUSDT, S")) %>%
ggplot(aes(x = date, y = ewma_cov, colour = tickers)) +
geom_line(size=0.8) +
labs(
title = "Rolling Window Covariance Estimates",
colour = "Tickers"
)
We’ll also need some functions for extracting today’s covariance matrix from our long dataframe of pairwise covariances, making sure it’s symmetrical and positive semi definite, and shrinking its values.
I’ve also included a function extend_covmat
that extends today’s covariance matrix to include all the tickers that were ever in the universe. That’s just to make the optimisation loop a little simpler – we can use consistent expected returns vectors and covariance matrixes at each iteration. For tickers not in the tradeable universe on a particular day, we set their:
 expected return to NA
 pairwise covariances to zero
 variances to the average variance of tradeable assets
We also explicitly constrain their weights to be zero in the optimisation.
# functions for wrangling covariance matrix
library(Matrix)
library(matrixcalc)
get_today_covmat < function(ewma_covs_long, today_date) {
today_cov_df < ewma_covs_long %>%
ungroup() %>%
filter(date == today_date) %>%
select(ticker.x, ticker.y, ewma_cov) %>%
pivot_wider(names_from = ticker.y, values_from = ewma_cov) %>% # , names_prefix = "cov_"
column_to_rownames(var = "ticker.x")
# check rownames and colnames match
stopifnot("error making today's cov matrix" = all.equal(rownames(today_cov_df), colnames(today_cov_df)))
today_covmat < as.matrix(today_cov_df)
today_covmat[lower.tri(today_covmat)] < t(today_covmat)[lower.tri(today_covmat)]
# return nearest positive semidefinite matrix
today_covmat < nearPD(today_covmat)
return(today_covmat$mat)
}
#' shrinks covs towards zero, reflecting uncertainty in cov estimates. may increases variances.
#' @param shrinkage_intensity: a value between 0 and 1. 0 means no shrinkage (use the original covmat), 1 means full shrinkage (use only the target_matrix)
shrink_covmat < function(covmat, shrinkage_intensity = 0.9) {
# average variance from the diagonal
average_variance < mean(diag(covmat))
# shrinkage target matrix (identity matrix scaled by average variance)
target_matrix < diag(average_variance, nrow = ncol(covmat))
# shrunk covariance matrix
covmat_shrunk < (1  shrinkage_intensity) * covmat + shrinkage_intensity * target_matrix
return(covmat_shrunk)
}
#' extend covmat to cover all tickers in the tradeable universe
#' put average variance on the diagonal, zeroes elsewhere
#' intent is that optimisation will explicitly constrain weights for assets not yet in the universe to 0
extend_covmat < function(covmat, all_tickers) {
current_tickers < colnames(covmat)
full_covmat < matrix(0, nrow = length(all_tickers), ncol = length(all_tickers),
dimnames = list(all_tickers, all_tickers))
current_tickers_ordered < intersect(all_tickers, rownames(covmat))
# ensure covmats are ordered correctly
covmat_reordered < covmat[current_tickers_ordered, current_tickers_ordered] %>%
# ensure is Matrix type for compatibility
as.matrix()
full_covmat < full_covmat[all_tickers, all_tickers]
# replace in full_covmat
full_covmat[current_tickers_ordered, current_tickers_ordered] < covmat_reordered
# replace zeros on diagonal with average variance
zero_diag_indices < which(diag(full_covmat) == 0)
average_variance < mean(diag(covmat))
diag(full_covmat)[zero_diag_indices] < average_variance
return(full_covmat)
}
wrangle_covmat < function(ewma_covs_long, date, tickers, shrinkage_intensity = 0.9) {
today_covmat < get_today_covmat(ewma_covs_long, date) %>%
shrink_covmat(shrinkage_intensity = shrinkage_intensity) %>%
extend_covmat(all_tickers = tickers)
# check
if(is.positive.semi.definite(today_covmat, tol=1e8)){
if(isSymmetric(today_covmat)) {
glue("covmat crated for {date} not symmetric")
}
} else {
glue("covmat crated for {date} not PSD")
}
return(today_covmat)
}
Attaching package: 'Matrix' The following objects are masked from 'package:tidyr': expand, pack, unpack Attaching package: 'matrixcalc' The following object is masked from 'package:CVXR': vec
Next we precompute our covariance estimates for each day in our simulation and store them in a list. This just saves a bit of time in the optimisation loop.
# precompute covariances matrixes
simulation_start_date < start_date + days(100) # allow for first insample period  otherwise we run lots of simulation loops for zero weights
exp_returns_sim_df < exp_returns_wide %>% filter(date >= simulation_start_date)
dates < exp_returns_sim_df$date
covmat_list < map(dates, ~wrangle_covmat(ewma_covs, .x, tickers = universe_tickers, shrinkage_intensity = 0.9))
Define optimisation function
Next we define a function for finding our optimal weights at each time step. I’ll use the meanvariance optimisation with costs. We explore this in more detail in this article.
Notice that we constrain the portfolio to be unleveraged (max total weight less than or equal to 1), but we don’t put a constraint on how net long or short the portfolio can be. That means that in the extreme, we could end up being fully long or fully short. Below, I’ll show you an example of constraining the net position as well.
We also pass na_idxs
to tell the optimiser which assets in our expected returns vector are not in the tradeable universe and should therefore get a weight of zero.
# define our mvo function
mvo_with_costs < function(expected_returns, current_weights, na_idxs = c(), costs, covmat, lambda = 1, tau=1) {
# define our weights vector as a CVXR::Variable
weights < Variable(length(expected_returns))
# define an alpha term as the weighted sum of expected returns
# again, express using linear algebra
alpha_term < (t(weights) %*% expected_returns)
# define a costs term. depends on:
# cost of trading  needs to be expressed such that it scales with expected returns
# calculate as elementwise cost * absolute value of weights  current_weights
# use CVXR::multiply and CVXR::abs
# absolute distance of current_weights to our weights variable
# the more our target weights differ from current weights, the more it costs to trade
# this is a decent representation of fixed percentage costs, but doesn't capture minimum commissions
# sum_entries is a CVXR function for summing the elements of a vector
costs_term < sum_entries(multiply(costs, abs(weights  current_weights))) # elementwise abs, multiply
# define a risk term as w*Sigma*w
# quad_form is a CVXR function for doing w*Sigma*w
risk_term < quad_form(weights, covmat)
# define our objective
# maximise our alpha less our risk term multiplied by some factor, lambda, less our costs term multiplied by tau
objective < Maximize(alpha_term  lambda*risk_term  tau*costs_term)
# apply our no leverage constraint
constraints < list(cvxr_norm(weights, 1) <= 1, weights[na_idxs] == 0)
# specify the problem
problem < Problem(objective, constraints)
# solve
result < solve(problem)
# return the values of the variable we solved for
result$getValue(weights)
}
Explore lambdatau parameter space
Next we need to figure out some reasonable values for lambda (our risk aversion) and tau (our propensity to trade).
If you need a refresher on lambda and tau, see the previous article on building intuition for optimisation. The TLDR is:
 Higher values of lambda will put more weight on the risk term and lead to higher diversification and eventually to reduced position sizes.
 Higher values of tau will reduce our trading by requiring a greater expected return hurdle.
We probably don’t need to run the simulation over our entire history in order to find reasonable values of lambda and tau. So we’ll just use 500 days to figure out our parameter values, then do a full simulation on that single parameter set.
We’ll loop over four values of lambda and five values of tau, for a total of 20 simulations over 500 days each. And we’ll do these in parallel to speed things up.
Here’s how to set up a parallel backend and prepare each worker environment:
library(doParallel)
library(foreach)
# register parallel backend
num_cores < detectCores()  1 # leave one core free
cl < makeCluster(num_cores)
registerDoParallel(cl)
# use clusterEvalQ to define functions and objects available to each worker
clusterExport(
cl,
varlist = c("get_today_covmat", "shrink_covmat", "extend_covmat", "wrangle_covmat", "mvo_with_costs", "universe_tickers", "exp_returns_sim_df", "covmat_list")
)
# load libraries in each worker
clusterEvalQ(cl, {
library(CVXR); library(Matrix); library(matrixcalc); library(dplyr); library(tidyr); library(glue)
})
Loading required package: foreach Attaching package: 'foreach' The following objects are masked from 'package:purrr': accumulate, when Loading required package: iterators Loading required package: parallel

 ‘glue’
 ‘tidyr’
 ‘dplyr’
 ‘matrixcalc’
 ‘Matrix’
 ‘CVXR’
 ‘stats’
 ‘graphics’
 ‘grDevices’
 ‘utils’
 ‘datasets’
 ‘methods’
 ‘base’

 ‘glue’
 ‘tidyr’
 ‘dplyr’
 ‘matrixcalc’
 ‘Matrix’
 ‘CVXR’
 ‘stats’
 ‘graphics’
 ‘grDevices’
 ‘utils’
 ‘datasets’
 ‘methods’
 ‘base’

 ‘glue’
 ‘tidyr’
 ‘dplyr’
 ‘matrixcalc’
 ‘Matrix’
 ‘CVXR’
 ‘stats’
 ‘graphics’
 ‘grDevices’
 ‘utils’
 ‘datasets’
 ‘methods’
 ‘base’

 ‘glue’
 ‘tidyr’
 ‘dplyr’
 ‘matrixcalc’
 ‘Matrix’
 ‘CVXR’
 ‘stats’
 ‘graphics’
 ‘grDevices’
 ‘utils’
 ‘datasets’
 ‘methods’
 ‘base’

 ‘glue’
 ‘tidyr’
 ‘dplyr’
 ‘matrixcalc’
 ‘Matrix’
 ‘CVXR’
 ‘stats’
 ‘graphics’
 ‘grDevices’
 ‘utils’
 ‘datasets’
 ‘methods’
 ‘base’

 ‘glue’
 ‘tidyr’
 ‘dplyr’
 ‘matrixcalc’
 ‘Matrix’
 ‘CVXR’
 ‘stats’
 ‘graphics’
 ‘grDevices’
 ‘utils’
 ‘datasets’
 ‘methods’
 ‘base’

 ‘glue’
 ‘tidyr’
 ‘dplyr’
 ‘matrixcalc’
 ‘Matrix’
 ‘CVXR’
 ‘stats’
 ‘graphics’
 ‘grDevices’
 ‘utils’
 ‘datasets’
 ‘methods’
 ‘base’
Next, define our parameters for the simulations:
num_days < nrow(exp_returns_wide)
lambdas < c(0.1, 0.3, 1, 3)
taus < c(0.003, 0.1, 0.3, 1, 3)
date_idxs_to_do < c(1:500)
costs < 0.15/100
Finally, we can do our 500day simulations for each combination of lambda and tau in parallel. This still takes a while to complete, so we’ll save the results to disk:
# custom combine function for flattening results of foreach to one big list
# ensures that each result from the inner loop is appended to the main list, rather than nesting it further
# results in a flat structure where each element is a list corresponding to an iteration over lambda and tau without deep nesting
# requres .init arg of foreach to be list()  initialises an empty list for storing the results
flatten_lists < function(x, y) {
c(x, y)
}
results < foreach(lambda = lambdas, .combine = 'flatten_lists', .init = list()) %:% #
foreach(tau = taus, .combine = 'flatten_lists', .init = list()) %dopar% {
weights < list()
errors < list()
w0 < rep(0, length(universe_tickers))
for( i in date_idxs_to_do ) {
# today's date
d = exp_returns_sim_df$date[i]
# cov estimate
today_covmat < covmat_list[[i]]
# check cov matrix and exp returns vector are tickeraligned
# all.equal(exp_returns_sim_df %>% filter(date == d) %>% pivot_longer(date, names_to = "ticker", values_to = "expected_return")%>% arrange(match(ticker, universe_tickers)) %>% pull(ticker), colnames(today_covmat))
# get row of expected returns as a vector in the same order as the columns of the covariance matrix
# contains NA at this point
exp_rets < exp_returns_sim_df %>% filter(date == d) %>% pivot_longer(date, names_to = "ticker", values_to = "expected_return")%>% arrange(match(ticker, universe_tickers)) %>% pull(expected_return)
# get na indexes
# we'll explicitly constrain these to get zero weight
na_idxs < which(is.na(exp_rets))
# convert NA to zero
exp_rets[is.na(exp_rets)] < 0
# initialise w in case we get a solver error (can retain w0 in these cases)
w < w0
# solve for next period's weights
tryCatch({
w < mvo_with_costs(expected_returns = exp_rets, current_weights = w0, na_idxs = na_idxs, costs = costs, covmat = today_covmat, lambda = lambda, tau = tau)
}, error = function(e) {
# in case of solver error, log and retain existing weights
errors[[i]] < e
}
)
# store weights
# w is a onecolumn matrix
weights[[i]] < w
w0 < w
}
# return a list with weights, parameters, and any errors
list(list(weights = weights, lambda = lambda, tau = tau, errors = errors))
}
stopCluster(cl)
# save to disk
saveRDS(results, file = "lambda_tau_search_results_1_500.rds")
# read results from disk
# results < readRDS("lambda_tau_search_results_1_500.rds")
Wrangle simulation results
Our simulation results are a list of lists – one sublist for each lambdatau combination. Each sublist contains 500 weights vectors (one for each day in the simulation), the values of lambda and tau, and a list of solver errors, if any occurred.
We can make one big dataframe of weights for each lambdatau combination using the awesome purrr
tools:
# function for converting a day's weights vector to a dataframe
weights_mat_to_df < function(x, d, lambda, tau, tickers) {
x %>%
as_tibble(.name_repair = ~paste0("lambda_",lambda,"_tau_",tau)) %>%
mutate(date = d) %>%
mutate(ticker = tickers)
}
# make one dataframe of weights for each lambdatau combination
weights_dfs < list()
for(i in seq_along(results)) {
this_entry < results[[i]]
these_weights < this_entry$weights
this_lambda < this_entry$lambda
this_tau < this_entry$tau
weights_df < purrr::map2(
these_weights,
exp_returns_wide$date[date_idxs_to_do],
~weights_mat_to_df(.x, .y, lambda = this_lambda, tau = this_tau, tickers = universe_tickers)
) %>%
bind_rows()
weights_dfs[[i]] < weights_df
}
# combine dataframes for each lmbdatau combination into one big dataframe
all_weights < weights_dfs %>% purrr::reduce(left_join, by = c("date", "ticker")) %>%
relocate(c(date, ticker))
head(all_weights)
date  ticker  lambda_0.1_tau_0.003  lambda_0.1_tau_0.1  lambda_0.1_tau_0.3  lambda_0.1_tau_1  lambda_0.1_tau_3  lambda_0.3_tau_0.003  lambda_0.3_tau_0.1  lambda_0.3_tau_0.3  ⋯  lambda_1_tau_0.003  lambda_1_tau_0.1  lambda_1_tau_0.3  lambda_1_tau_1  lambda_1_tau_3  lambda_3_tau_0.003  lambda_3_tau_0.1  lambda_3_tau_0.3  lambda_3_tau_1  lambda_3_tau_3 

<date>  <chr>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  ⋯  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl> 
20201110  BTCUSDT  0  0  0  0  0  0  0  0  ⋯  0  0  0  0  0  0  0  0  0  0 
20201110  ETHUSDT  0  0  0  0  0  0  0  0  ⋯  0  0  0  0  0  0  0  0  0  0 
20201110  BCHUSDT  0  0  0  0  0  0  0  0  ⋯  0  0  0  0  0  0  0  0  0  0 
20201110  XRPUSDT  0  0  0  0  0  0  0  0  ⋯  0  0  0  0  0  0  0  0  0  0 
20201110  EOSUSDT  0  0  0  0  0  0  0  0  ⋯  0  0  0  0  0  0  0  0  0  0 
20201110  LTCUSDT  0  0  0  0  0  0  0  0  ⋯  0  0  0  0  0  0  0  0  0  0 
Let’s look at how our different parameterisations impact the weights for BTC:
all_weights %>%
pivot_longer(cols = c(date, ticker), names_to = "parameterisation", values_to = "weight") %>%
filter(ticker == "BTCUSDT") %>%
ggplot(aes(x = date, y = weight, colour = parameterisation)) +
geom_line() +
facet_wrap(~parameterisation) +
theme(legend.position = "none") +
labs(
title = "BTC weights for various paramterisations"
)
You can see that as lambda increases, the weight we put on BTC decreases – as we increase lambda, the optimiser would start by assigning some of BTC’s weight to other tickers in order to increase diversification, and as we increase lambda further, the total portfolio exposure would decrease, further reducing the weight on BTC.
And you can see that as tau increases, there are fewer changes in those weights because we do less trading. In the extreme, we hold our existing position for far longer than we would like to.
Let’s look at how our total and net portfolio weight changes:
all_weights %>%
pivot_longer(cols = c(date, ticker), names_to = "parameterisation", values_to = "weight") %>%
group_by(parameterisation, date) %>%
summarise(total_weight = sum(abs(weight)), net_weight = (sum(weight)), .groups = "drop") %>%
pivot_longer(cols = c(total_weight, net_weight), names_to = "total_net", values_to = "weight") %>%
ggplot(aes(x = date, y = weight, colour = total_net)) +
geom_line() +
facet_wrap(~parameterisation) +
labs(
title = "Total and net portfolio weight for various paramterisations"
)
You can see that for lower values of lambda, the portfolio is mostly fully invested (blue lines). As we increase lambda, we have periods where we reduce our total portfolio exposure. This becomes more extreme as we increase lambda.
Also notice that for lower values of lambda, we can have more extreme net long or short positions (red lines). As we increase lambda, our net position tends not to fluctuate as much. We also tend not be net short as often, reflecting the longonly nature of the breakout feature, and the fact that one or two large negative expected returns will have less impact as lambda increases.
Simulate our parameterisations
Next, we’ll use rsims
to run an accurate simulation for each of our parameterisations. See this article for more information and examples of using rsims
.
We’ll assume costs of 0.15% of traded value.
backtest_df < all_weights %>%
pivot_longer(cols = c(date, ticker), names_to = "parameterisation", values_to = "weight") %>%
left_join(strategy_df, by = c("ticker", "date"))
head(backtest_df)
date  ticker  parameterisation  weight  expected_return  close  total_fwd_return_simple  funding_rate 

<date>  <chr>  <chr>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl> 
20201110  BTCUSDT  lambda_0.1_tau_0.003  0  NA  15167.88  0.03288394  3e04 
20201110  BTCUSDT  lambda_0.1_tau_0.1  0  NA  15167.88  0.03288394  3e04 
20201110  BTCUSDT  lambda_0.1_tau_0.3  0  NA  15167.88  0.03288394  3e04 
20201110  BTCUSDT  lambda_0.1_tau_1  0  NA  15167.88  0.03288394  3e04 
20201110  BTCUSDT  lambda_0.1_tau_3  0  NA  15167.88  0.03288394  3e04 
20201110  BTCUSDT  lambda_0.3_tau_0.003  0  NA  15167.88  0.03288394  3e04 
params < unique(backtest_df$parameterisation)
params
 ‘lambda_0.1_tau_0.003’
 ‘lambda_0.1_tau_0.1’
 ‘lambda_0.1_tau_0.3’
 ‘lambda_0.1_tau_1’
 ‘lambda_0.1_tau_3’
 ‘lambda_0.3_tau_0.003’
 ‘lambda_0.3_tau_0.1’
 ‘lambda_0.3_tau_0.3’
 ‘lambda_0.3_tau_1’
 ‘lambda_0.3_tau_3’
 ‘lambda_1_tau_0.003’
 ‘lambda_1_tau_0.1’
 ‘lambda_1_tau_0.3’
 ‘lambda_1_tau_1’
 ‘lambda_1_tau_3’
 ‘lambda_3_tau_0.003’
 ‘lambda_3_tau_0.1’
 ‘lambda_3_tau_0.3’
 ‘lambda_3_tau_1’
 ‘lambda_3_tau_3’
# simulation for each parameterisation
# no trade buffer (trades controlled by tau parameter and are embedded in weights)
equity_curves < vector("list", length = length(params))
results_dfs < vector("list", length = length(params))
i < 1
for( p in params ) {
# get weights as a wide matrix
# note that date column will get converted to unix timestamp
backtest_weights < backtest_df %>%
dplyr::filter(parameterisation == p) %>%
pivot_wider(id_cols = date, names_from = ticker, values_from = c(close, weight)) %>% # pivot wider guarantees prices and theo_weight are date aligned
select(date, starts_with("weight")) %>%
mutate(date = as.numeric(date)) %>%
data.matrix()
# NA weights should be zero
backtest_weights[is.na(backtest_weights)] < 0
# get prices as a wide matrix
# note that date column will get converted to unix timestamp
backtest_prices < backtest_df %>%
dplyr::filter(parameterisation == p) %>%
pivot_wider(id_cols = date, names_from = ticker, values_from = c(close, weight)) %>% # pivot wider guarantees prices and theo_weight are date aligned
select(date, starts_with("close_")) %>%
mutate(date = as.numeric(date)) %>%
data.matrix()
# get funding as a wide matrix
# note that date column will get converted to unix timestamp
backtest_funding < backtest_df %>%
dplyr::filter(parameterisation == p) %>%
pivot_wider(id_cols = date, names_from = ticker, values_from = c(close, funding_rate)) %>% # pivot wider guarantees prices and funding_returns_simple are date aligned
select(date, starts_with("funding_rate_")) %>%
mutate(date = as.numeric(date)) %>%
data.matrix()
# simulation
results_df < fixed_commission_backtest_with_funding(
prices = backtest_prices,
target_weights = backtest_weights,
funding_rates = backtest_funding,
trade_buffer = 0.,
initial_cash = 10000,
margin = 0.05,
commission_pct = 0.0015,
capitalise_profits = FALSE
) %>%
mutate(ticker = str_remove(ticker, "close_")) %>%
# remove coins we don't trade from results
drop_na(Value)
margin < results_df %>%
group_by(Date) %>%
summarise(Margin = sum(Margin, na.rm = TRUE))
cash_balance < results_df %>%
filter(ticker == "Cash") %>%
select(Date, Value) %>%
rename("Cash" = Value)
equity < cash_balance %>%
left_join(margin, by = "Date") %>%
mutate(Equity = Cash + Margin) %>%
select(Date, Equity) %>%
rename(!!p := Equity)
equity_curves[[i]] < equity
results_dfs[[i]] < results_df
i < i + 1
}
Plot equity curves for each 500day simulation:
equity_curves %>%
purrr::discard(is.null) %>%
reduce(left_join, by = "Date") %>%
pivot_longer(Date, names_to = "param", values_to = "equity") %>%
ggplot(aes(x = Date, y = equity, colour = param)) +
geom_line() +
facet_wrap(~param) +
theme(legend.position = "none") +
labs(
title = "Aftercosts equity curves for 500day simulations"
)
Eyeballing equity curves is illustrative, but let’s also quantify these results. For each 500day simulation, we’ll calculate:
 Sharpe ratio
 Total return
 Average daily turnover (as percentage of allocated cash)
initial_cash < 10000
i < 1
dfs < list()
for (r in results_dfs) {
# get lambda and tau
these_params < str_extract_all(params[i],"[09]+\\.*[09]*", simplify = TRUE)
this_lambda < these_params[1, 1]
this_tau < these_params[1, 2]
# average daily turnover
average_turnover < r %>%
filter(ticker != "Cash") %>%
group_by(Date) %>%
summarise(turnover = 100*sum(abs(TradeValue))/initial_cash, .groups = "drop") %>%
summarise(ave_turnover = mean(turnover)) %>%
pull(ave_turnover)
# total return
equity < equity_curves[[i]][, 2, drop = TRUE]
fin_eq < equity %>%
tail(1)
init_eq < equity%>%
head(1)
total_return < (fin_eq/init_eq  1) * 100
days < length(equity)
ann_return < total_return * 365/days
# sharpe
returns < na.omit(equity/lag(equity)  1)
sharpe < sqrt(365)*mean(returns)/sd(returns)
# store results
df < data.frame(
lambda = this_lambda,
tau = this_tau,
sharpe = sharpe,
total_return = total_return,
average_turnover = average_turnover
)
dfs[[i]] < df
i < i + 1
}
metrics < dfs %>% bind_rows()
metrics %>% arrange(desc(sharpe))
lambda  tau  sharpe  total_return  average_turnover 

<chr>  <chr>  <dbl>  <dbl>  <dbl> 
0.3  0.003  1.5979996  187.91581  86.563401 
1  1  1.5843898  147.20999  26.719370 
0.1  0.3  1.5632515  243.76855  62.853329 
0.3  0.1  1.5582835  184.85528  74.244957 
0.1  0.1  1.5049803  215.82454  82.633311 
0.3  0.3  1.4733322  177.88543  57.872224 
3  1  1.4360708  68.91465  23.088435 
0.1  0.003  1.4106634  187.10321  98.365393 
0.3  1  1.3676426  157.92362  28.740003 
0.1  3  1.3467732  146.92856  13.373507 
0.3  3  1.3300004  127.71079  11.963360 
1  0.3  1.2548320  98.57962  48.687742 
1  0.003  1.1851221  87.42170  71.599827 
1  0.1  1.1803581  87.72210  62.037604 
0.1  1  1.1530450  128.67032  30.273378 
1  3  1.1099375  81.99863  11.013410 
3  3  1.0846910  43.49599  8.641305 
3  0.3  1.0200999  45.43914  38.044012 
3  0.1  0.8968249  38.13561  46.235756 
3  0.003  0.8670867  36.31248  51.907027 
These metrics give you an idea of the tradeoffs:
 The parameters that resulted in our highest Sharpe turn over more than 85% of the portfolio daily. This would be hard to manage if you were trading by hand.
 You could achieve a slightly lower Sharpe but turn over much less, for example with lambda = 1, tau = 1. This would be operationally simpler.
Bear in mind though that these results only represent the first 500days of the simulation, and there’s no guarantee that they would be reflective of the results over the entire history. But they’re a good starting point for thinking about the tradeoffs.
For now, let’s pick lambda = 1 and tau = 1 and simulate over the full sample:
lambda < 1
tau < 1
costs < 0.15/100
weights < list()
errors < list()
w0 < rep(0, length(universe_tickers))
for( i in c(1:length(exp_returns_sim_df$date)) ) {
# today's date
d = exp_returns_sim_df$date[i]
# cov estimate
today_covmat < covmat_list[[i]]
# check cov matrix and exp returns vector are tickeraligned
# all.equal(exp_returns_sim_df %>% filter(date == d) %>% pivot_longer(date, names_to = "ticker", values_to = "expected_return")%>% arrange(match(ticker, universe_tickers)) %>% pull(ticker), colnames(today_covmat))
# get row of expected returns as a vector in the same order as the columns of the covariance matrix
# contains NA at this point
exp_rets < exp_returns_sim_df %>% filter(date == d) %>% pivot_longer(date, names_to = "ticker", values_to = "expected_return")%>% arrange(match(ticker, universe_tickers)) %>% pull(expected_return)
# get na indexes
# we'll explicitly constrain these to get zero weight
na_idxs < which(is.na(exp_rets))
# convert NA to zero
exp_rets[is.na(exp_rets)] < 0
# initialise w in case we get a solver error (can retain w0 in these cases)
w < w0
tryCatch({
w < mvo_with_costs(expected_returns = exp_rets, current_weights = w0, na_idxs = na_idxs, costs = costs, covmat = today_covmat, lambda = lambda, tau = tau)
}, error = function(e) {
# in case of solver error, log and retain existing weights
errors[[i]] < e
}
)
# w is a onecolumn matrix
weights[[i]] < w
w0 < w
}
results < list(weights = weights, lambda = lambda, tau = tau, errors = errors)
saveRDS(results, file = "lambda_1_tau_1.rds")
# wrangle weights into a dataframe
weights_df < purrr::map2(
weights,
exp_returns_sim_df$date,
~weights_mat_to_df(.x, .y, lambda = lambda, tau = tau, tickers = universe_tickers)
) %>%
bind_rows() %>%
rename("weight" = lambda_1_tau_1)
head(weights_df)
weight  date  ticker 

<dbl>  <date>  <chr> 
0  20210218  BTCUSDT 
0  20210218  ETHUSDT 
0  20210218  BCHUSDT 
0  20210218  XRPUSDT 
0  20210218  EOSUSDT 
0  20210218  LTCUSDT 
# join weights onto prices
backtest_df < weights_df %>%
left_join(strategy_df, by = c("ticker", "date"))
head(backtest_df)
weight  date  ticker  expected_return  close  total_fwd_return_simple  funding_rate 

<dbl>  <date>  <chr>  <dbl>  <dbl>  <dbl>  <dbl> 
0  20210218  BTCUSDT  NA  51663.6400  0.041106415  0.00412394 
0  20210218  ETHUSDT  NA  1910.9900  0.007813203  0.00516012 
0  20210218  BCHUSDT  NA  702.5500  0.015704896  0.00711178 
0  20210218  XRPUSDT  NA  0.5302  0.030592698  0.00688510 
0  20210218  EOSUSDT  NA  4.8000  0.077784750  0.00550166 
0  20210218  LTCUSDT  NA  225.8000  0.020896880  0.00616800 
# simulate
# get weights as a wide matrix
backtest_weights < backtest_df %>%
pivot_wider(id_cols = date, names_from = ticker, values_from = c(close, weight)) %>% # pivot wider guarantees prices and theo_weight are date aligned
select(date, starts_with("weight")) %>%
mutate(date = as.numeric(date)) %>%
data.matrix()
# NA weights should be zero
backtest_weights[is.na(backtest_weights)] < 0
# get prices as a wide matrix
backtest_prices < backtest_df %>%
pivot_wider(id_cols = date, names_from = ticker, values_from = c(close, weight)) %>% # pivot wider guarantees prices and theo_weight are date aligned
select(date, starts_with("close_")) %>%
mutate(date = as.numeric(date)) %>%
data.matrix()
# get funding as a wide matrix
backtest_funding < backtest_df %>%
pivot_wider(id_cols = date, names_from = ticker, values_from = c(close, funding_rate)) %>% # pivot wider guarantees prices and funding_returns_simple are date aligned
select(date, starts_with("funding_rate_")) %>%
mutate(date = as.numeric(date)) %>%
data.matrix()
# simulation
results_df < fixed_commission_backtest_with_funding(
prices = backtest_prices,
target_weights = backtest_weights,
funding_rates = backtest_funding,
trade_buffer = 0.,
initial_cash = 10000,
margin = 0.05,
commission_pct = 0.0015,
capitalise_profits = FALSE
) %>%
mutate(ticker = str_remove(ticker, "close_")) %>%
# remove coins we don't trade from results
drop_na(Value)
margin < results_df %>%
group_by(Date) %>%
summarise(Margin = sum(Margin, na.rm = TRUE))
cash_balance < results_df %>%
filter(ticker == "Cash") %>%
select(Date, Value) %>%
rename("Cash" = Value)
equity < cash_balance %>%
left_join(margin, by = "Date") %>%
mutate(Equity = Cash + Margin) %>%
select(Date, Equity)
Here are some plots showing the aftercost performance:
# plot results
equity_plot < equity %>%
ggplot(aes(x = Date, y = Equity)) +
geom_line() +
labs(
title = "Simulation of carry, momentum & breakout factors with lambda 1, tau 1",
y = "Equity, $"
)
weights_plot < backtest_df %>%
group_by(date) %>%
summarise(
total_weight = sum(abs(weight)),
net_weight = sum(weight)
) %>%
pivot_longer(date, names_to = "type", values_to = "weight") %>%
ggplot(aes(x = date, y = weight, colour = type)) +
geom_line() +
labs(
title = "Total and Net Weight",
y = "Weight",
colour = ""
) +
theme(legend.position = "bottom")
turnover_plot < results_df %>%
filter(ticker != "Cash") %>%
group_by(Date) %>%
summarise(Turnover = 100*sum(abs(TradeValue))/initial_cash) %>%
ggplot(aes(x = Date, y = Turnover)) +
geom_line() +
geom_point() +
labs(
title = "Turnover as % of trading capital",
y = "Turnover, %"
)
options(repr.plot.width = 14, repr.plot.height=12)
equity_plot / weights_plot / turnover_plot + plot_layout(heights = c(2, 1, 1))
options(repr.plot.width = 14, repr.plot.height=7)
Notice that we get quite long from time to time. And sometimes we’re not quite fully invested. Maybe that’s what you want, maybe not.
Let’s repeat the process, but this time constrain the portfolio to a maximum net weight of +/ 0.5:
# define our mvo function
mvo_with_costs_constrained < function(expected_returns, current_weights, na_idxs = c(), costs, covmat, lambda = 1, tau = 1) {
# define our weights vector as a CVXR::Variable
weights < Variable(length(expected_returns))
# define an alpha term as the weighted sum of expected returns
# again, express using linear algebra
alpha_term < (t(weights) %*% expected_returns)
# define a costs term. depends on:
# cost of trading  needs to be expressed such that it scales with expected returns
# calculate as elementwise cost * absolute value of weights  current_weights
# use CVXR::multiply and CVXR::abs
# absolute distance of current_weights to our weights variable
# the more our target weights differ from current weights, the more it costs to trade
# this is a decent representation of fixed percentage costs, but doesn't capture minimum commissions
# sum_entries is a CVXR function for summing the elements of a vector
costs_term < sum_entries(multiply(costs, abs(weights  current_weights))) # elementwise abs, multiply
# define a risk term as w*Sigma*w
# quad_form is a CVXR function for doing w*Sigma*w
risk_term < quad_form(weights, covmat)
# define our objective
# maximise our alpha less our risk term multiplied by some factor, lambda, less our costs term multiplied by tau
objective < Maximize(alpha_term  lambda*risk_term  tau*costs_term)
# constrain to a maximum net weight +/ 0.5, maximum leverage 1
constraints < list(cvxr_norm(weights, 1) <= 1, abs(sum_entries(weights)) <= 0.5, weights[na_idxs] == 0)
# specify the problem
problem < Problem(objective, constraints)
# solve
result < solve(problem)
# return the values of the variable we solved for
result$getValue(weights)
}
lambda < 1
tau < 1
weights < list()
errors < list()
w0 < rep(0, length(universe_tickers))
for( i in c(1:length(exp_returns_sim_df$date)) ) {
# today's date
d = exp_returns_sim_df$date[i]
# cov estimate
today_covmat < covmat_list[[i]]
# check cov matrix and exp returns vector are tickeraligned
# all.equal(exp_returns_sim_df %>% filter(date == d) %>% pivot_longer(date, names_to = "ticker", values_to = "expected_return")%>% arrange(match(ticker, universe_tickers)) %>% pull(ticker), colnames(today_covmat))
# get row of expected returns as a vector in the same order as the columns of the covariance matrix
# contains NA at this point
exp_rets < exp_returns_sim_df %>% filter(date == d) %>% pivot_longer(date, names_to = "ticker", values_to = "expected_return")%>% arrange(match(ticker, universe_tickers)) %>% pull(expected_return)
# get na indexes
# we'll explicitly constrain these to get zero weight
na_idxs < which(is.na(exp_rets))
# convert NA to zero
exp_rets[is.na(exp_rets)] < 0
# initialise w in case we get a solver error (can retain w0 in these cases)
w < w0
tryCatch({
w < mvo_with_costs_constrained(expected_returns = exp_rets, current_weights = w0, na_idxs = na_idxs, costs = costs, covmat = today_covmat, lambda = lambda, tau = tau)
}, error = function(e) {
# in case of solver error, log and retain existing weights
errors[[i]] < e
}
)
# w is a onecolumn matrix
weights[[i]] < w
w0 < w
}
saveRDS(results, file = "lambda_1_tau_1_net_weight_constrained.rds")
# wrangle weights into a dataframe
weights_df < purrr::map2(
weights,
exp_returns_sim_df$date,
~weights_mat_to_df(.x, .y, lambda = lambda, tau = tau, tickers = universe_tickers)
) %>%
bind_rows() %>%
rename("weight" = lambda_1_tau_1)
# join weights onto prices
backtest_df < weights_df %>%
left_join(strategy_df, by = c("ticker", "date"))
head(backtest_df)
weight  date  ticker  expected_return  close  total_fwd_return_simple  funding_rate 

<dbl>  <date>  <chr>  <dbl>  <dbl>  <dbl>  <dbl> 
0  20210218  BTCUSDT  NA  51663.6400  0.041106415  0.00412394 
0  20210218  ETHUSDT  NA  1910.9900  0.007813203  0.00516012 
0  20210218  BCHUSDT  NA  702.5500  0.015704896  0.00711178 
0  20210218  XRPUSDT  NA  0.5302  0.030592698  0.00688510 
0  20210218  EOSUSDT  NA  4.8000  0.077784750  0.00550166 
0  20210218  LTCUSDT  NA  225.8000  0.020896880  0.00616800 
# simulate
# get weights as a wide matrix
backtest_weights < backtest_df %>%
pivot_wider(id_cols = date, names_from = ticker, values_from = c(close, weight)) %>% # pivot wider guarantees prices and theo_weight are date aligned
select(date, starts_with("weight")) %>%
mutate(date = as.numeric(date)) %>%
data.matrix()
# NA weights should be zero
backtest_weights[is.na(backtest_weights)] < 0
# get prices as a wide matrix
backtest_prices < backtest_df %>%
pivot_wider(id_cols = date, names_from = ticker, values_from = c(close, weight)) %>% # pivot wider guarantees prices and theo_weight are date aligned
select(date, starts_with("close_")) %>%
mutate(date = as.numeric(date)) %>%
data.matrix()
# get funding as a wide matrix
backtest_funding < backtest_df %>%
pivot_wider(id_cols = date, names_from = ticker, values_from = c(close, funding_rate)) %>% # pivot wider guarantees prices and funding_returns_simple are date aligned
select(date, starts_with("funding_rate_")) %>%
mutate(date = as.numeric(date)) %>%
data.matrix()
# simulation
results_df < fixed_commission_backtest_with_funding(
prices = backtest_prices,
target_weights = backtest_weights,
funding_rates = backtest_funding,
trade_buffer = 0.,
initial_cash = 10000,
margin = 0.05,
commission_pct = 0.0015,
capitalise_profits = FALSE
) %>%
mutate(ticker = str_remove(ticker, "close_")) %>%
# remove coins we don't trade from results
drop_na(Value)
margin < results_df %>%
group_by(Date) %>%
summarise(Margin = sum(Margin, na.rm = TRUE))
cash_balance < results_df %>%
filter(ticker == "Cash") %>%
select(Date, Value) %>%
rename("Cash" = Value)
equity < cash_balance %>%
left_join(margin, by = "Date") %>%
mutate(Equity = Cash + Margin) %>%
select(Date, Equity)
# plot results
equity_plot < equity %>%
ggplot(aes(x = Date, y = Equity)) +
geom_line() +
labs(
title = "Simulation of carry, momentum & breakout factors with lambda 1, tau 1",
subtitle = "Portfolio net position constrained to +/ 0.5",
y = "Equity, $"
)
weights_plot < backtest_df %>%
group_by(date) %>%
summarise(
total_weight = sum(abs(weight)),
net_weight = sum(weight)
) %>%
pivot_longer(date, names_to = "type", values_to = "weight") %>%
ggplot(aes(x = date, y = weight, colour = type)) +
geom_line() +
labs(
title = "Total and Net Weight",
y = "Weight",
colour = ""
) +
theme(legend.position = "bottom")
turnover_plot < results_df %>%
filter(ticker != "Cash") %>%
group_by(Date) %>%
summarise(Turnover = 100*sum(abs(TradeValue))/initial_cash) %>%
ggplot(aes(x = Date, y = Turnover)) +
geom_line() +
geom_point() +
labs(
title = "Turnover as % of trading capital",
y = "Turnover, %"
)
options(repr.plot.width = 14, repr.plot.height=12)
equity_plot / weights_plot / turnover_plot + plot_layout(heights = c(2, 1, 1))
options(repr.plot.width = 14, repr.plot.height=7)
You can see that now the portfolio never gets more than 50% net long. You can adjust this constraint further, and add any of your own, if you like.
Conclusions
In this article, we demonstrated how to frame the trading problem – that is, how to navigate the tradeoffs between edge, costs, and constraints – as a mathematical optimisation problem, and how to simulate trading the output of the optimisation.
We showed how you can control the optimiser’s decisions with the lambda (risk aversion) and tau (propensity to trade) parameters. We saw how different parameter values affect the optimiser’s trading decisions.
It’s worth reiterating that optimisation doesn’t represent alpha. Your alpha is embedded in your forecasts of expected returns. Optimisation is a tool for navigating the tradeoffs involved with harnessing your expected returns, but it doesn’t provide any alpha itself. This implies that the real work is in finding, quantifying and understanding your signals.
If using a risk model in your optimisation, like we did here with our covariance matrixes, it’s important to note that the optimiser can be very sensitive to your covariance estimates. It’s reasonable to shrink these estimates as a reflection of the inherent uncertainty.
How does the optimisation approach stack up against the heuristic notrade buffer that we explored earlier? The heuristic is simpler to reason about and implement. But it requires a bit more fudging upstream, particularly if you wanted to incorporate a risk model.
The optimisation approach is fiddly to set up at the outset, but extremely scalable once set up. It requires some skill in modelling expected returns and covariances.
1 thought on “Navigating Tradeoffs with Convex Optimisation”