This post summarises the key lessons from the academic literature on **pairs trading**.

The key themes are highlighted at the end of the page.

## Pair Trading Literature Review

### Gatev, Goetzmann, Rouwenhorst – “Pairs Trading: Performance of a Relative Value Arbitrage Strategy”

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=141615

- This is the first meaningful academic paper on pair trading
- They use daily closing prices from 1962 to 2002 (in the original paper)
- They select pairs over a 12-month formation period, then trade them for the following 6 months
- The selection criterion is the minimum squared distance between the two stocks’ normalised prices (normalised to 1 at the start of the formation period)
- The trading algorithm itself is similar to the one we use
- They find significant excess returns in the early part of the sample, but significant decay in more recent years.
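
The distance rule described above can be sketched in a few lines. This is only an illustration under simple assumptions, not the paper's code: prices are plain lists keyed by ticker, and the `top_n` cut-off is a hypothetical parameter.

```python
from itertools import combinations


def normalise(prices):
    """Rescale a price series so that it starts at 1."""
    first = prices[0]
    return [p / first for p in prices]


def squared_distance(a, b):
    """Sum of squared differences between two normalised price series."""
    return sum((x - y) ** 2 for x, y in zip(normalise(a), normalise(b)))


def select_pairs(formation_prices, top_n=20):
    """Rank every pair by squared distance over the formation period.

    formation_prices: dict mapping ticker -> list of prices (equal lengths).
    top_n: hypothetical cut-off for how many pairs to keep.
    """
    scored = [
        (squared_distance(formation_prices[s1], formation_prices[s2]), s1, s2)
        for s1, s2 in combinations(sorted(formation_prices), 2)
    ]
    return [(s1, s2) for _, s1, s2 in sorted(scored)[:top_n]]
```

The pairs returned here would then be traded over the following trading period.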

### Elliott, van der Hoek & Malcolm – “Pairs Trading”

http://stat.wharton.upenn.edu/~steele/Courses/434/434Context/PairsTrading/PairsTradingQFin05.pdf

- The spread is modelled as an Ornstein-Uhlenbeck process
- This allows the expected time to convergence (half-life) to be estimated
- The paper doesn’t show any backtest results. It is purely theoretical.
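
To make the half-life idea concrete, here is a rough sketch. It does not reproduce the paper's estimation procedure; it swaps in a plain AR(1) regression on the observed spread, which is a common shortcut. The function name and the `dt` parameter are assumptions made for illustration.

```python
import math


def ou_half_life(spread, dt=1.0):
    """Estimate a mean-reversion half-life from an AR(1) fit.

    Fits x[t+1] = a + b * x[t] + noise by least squares. For a discretised
    Ornstein-Uhlenbeck process b = exp(-theta * dt), so the half-life is
    ln(2) / theta. `dt` is the spacing between observations.
    """
    x, y = spread[:-1], spread[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    if not 0 < b < 1:
        raise ValueError("no mean reversion detected (need 0 < b < 1)")
    theta = -math.log(b) / dt
    return math.log(2) / theta
```

A spread that halves its distance from the mean every observation gives a half-life of one period.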

### Vidyamurthy – “Pairs Trading: Quantitative Methods and Analysis”

https://www.amazon.com/Pairs-Trading-Quantitative-Methods-Analysis/dp/0471460672

- This book contains some very good descriptions of standard econometric models.
- It introduces the “cointegration framework”, which is described in many blogs, including some of ours
- The cointegration property is used to:
  - identify pairs
  - calculate hedge ratios
  - determine trading signals

- In practice, the cointegration coefficient (which is really just the slope of a regression between the two stocks’ prices) tends to be quite unstable, which makes us wonder whether hanging the entire strategy on it is appropriate
- The book doesn’t provide any backtest results – but does provide some useful insights into the realities of running a pair trading operation.
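
One way to see the instability of the regression slope is to re-estimate it over rolling windows. The sketch below is illustrative only and is not taken from the book; the function names and the `window` parameter are assumptions.

```python
def ols_slope(x, y):
    """Slope of a least-squares regression of y on x (with intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def rolling_hedge_ratios(x, y, window):
    """Hedge ratio (regression slope) re-estimated on each rolling window."""
    return [
        ols_slope(x[i:i + window], y[i:i + window])
        for i in range(len(x) - window + 1)
    ]
```

If the ratios from consecutive windows differ a lot, a spread built with one fixed hedge ratio is unlikely to stay stationary.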

### Lin, McCrae & Gulati – “Loss protection in pairs trading through minimum profit bounds. A cointegration approach”

- The cointegration approach is used to determine a hedge ratio
- Conditions for a trade to deliver minimum profit are derived (covering trading costs)
- A backtest is performed on one pair
- Hoel (2013) subsequently tested this approach on a wider range of universes and found it lost money on the universes tested
- This may be an example of a clever theoretical approach which doesn’t agree with the messy realities of trading.

### Bowen, Hutchinson & O’Sullivan – “High-Frequency Equity Pairs Trading: Transaction Costs, Speed of Execution and Patterns in Returns”

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1611623

- Tests Gatev’s approach on high-frequency data for the UK stock market
- It backtests OK, not spectacularly.
- Returns are highly sensitive to trade timing and trading cost assumptions.

### Hoel – “Statistical Arbitrage Pairs: Can Cointegration Capture Market Neutral Profits?”

- And the answer, at least in this study, is No.
- The study tests the cointegration strategy between 2003 and 2013 using Lin et al.’s approach
- And show consistently negative profits.

### Miao – “High Frequency and Dynamic Pairs Trading Based on Statistical Arbitrage Using a Two-Stage Correlation and Cointegration Approach”

http://www.ccsenet.org/journal/index.php/ijef/article/download/33007/19708

- Study on 15-minute US stock market data
- Selects pairs in a 3-month formation period by sorting on a) correlation coefficient and then b) a cointegration test
- Trades the top candidates for the following month
- Showed very impressive performance over a (small) 12-month out-of-sample period.

### Engelberg, Gao & Jagannathan – “An Anatomy of Pairs Trading: The Role of Idiosyncratic News, Common Information and Liquidity”

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1330689

- This paper is uniquely valuable in attempting to identify the factors that affect pair trading returns
- They find pair trading is sensitive to time to convergence. Return potential decreases significantly with time after divergence, leading to the rule of thumb that you should cut trades that haven’t converged after a set period of time.
- They find profit from pair trading is related to idiosyncratic news. News affecting one of the stocks in isolation is predictive of a divergence that continues, rather than converges. *(As would be expected.)*
- They find that pairs of stocks owned by the same institutional owner don’t tend to be profitable
- They find that pairs of stocks covered by the same analyst don’t tend to be profitable.
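
The time-based rule of thumb can be expressed as a simple filter on a position series. This is a hypothetical sketch rather than anything from the paper: it assumes positions are encoded as +1, -1 or 0 per bar, that a trade always passes through 0 before reversing, and `max_holding` is an illustrative parameter.

```python
def apply_time_stop(positions, max_holding):
    """Flatten any trade held for more than `max_holding` bars.

    After a stop, the position stays flat until the original signal
    returns to 0, so the same divergence is not re-entered.
    """
    out, held, stopped = [], 0, False
    for p in positions:
        if p == 0:
            held, stopped = 0, False  # original trade ended: reset state
        elif not stopped:
            held += 1
            if held > max_holding:
                stopped = True
        out.append(0 if stopped else p)
    return out
```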

### Do & Faff – “Does Naive Pairs Trading Still Work?”

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.547.8922&rep=rep1&type=pdf

- Replicates and extends Gatev
- Shows naive pairs trading has decayed to the point it is not profitable from 2003 onwards, at least in this study.

### Stübinger & Bredthauer – “Statistical Arbitrage Pairs Trading with High-Frequency Data”

https://dergipark.org.tr/en/download/article-file/364651

- Analyses a pair trading setup similar to Gatev’s on 1-minute high-frequency data for S&P 500 constituents
- Uses a 10-day formation period and a 5-day trading period
- Looks at Euclidean Distance, Correlation Coefficient and “Fluctuation Behaviour” approaches to pair selection in the formation period
- Trades in the trading period using static, varying and reverting thresholds
- Finds very high performance, especially when trading thresholds are relatively high (trading levels of +/- 2.5 standard deviations).
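
As a rough illustration of the static-threshold variant only, the sketch below standardises the spread using formation-period statistics and opens a trade when it moves beyond the entry level. The exit rule (close when the spread crosses back through the mean) and all parameter names are assumptions for the example, not details taken from the paper.

```python
from statistics import mean, stdev


def threshold_positions(formation_spread, trading_spread, entry=2.5, exit_level=0.0):
    """Static-threshold positions: +1 long the spread, -1 short, 0 flat.

    The z-score uses the mean and standard deviation of the formation
    period, so no trading-period information leaks into the signal.
    """
    mu, sigma = mean(formation_spread), stdev(formation_spread)
    position, positions = 0, []
    for s in trading_spread:
        z = (s - mu) / sigma
        if position == 0:
            if z >= entry:
                position = -1  # spread is rich: short it
            elif z <= -entry:
                position = 1   # spread is cheap: buy it
        elif (position == -1 and z <= exit_level) or (position == 1 and z >= -exit_level):
            position = 0       # spread has reverted: close the trade
        positions.append(position)
    return positions
```

A higher `entry` value produces fewer trades, each starting from a larger divergence.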

## A Common Feature of the Papers

The papers that run a backtest do so on an entirely empirical, testable and reproducible basis.

There tends to be a formation period and a trading period.

- Pairs are selected in the formation period according to some factor
- Those pairs are traded in the trading period (which tends to be shorter)
- Then you do it all again and again and again.
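
The loop above can be sketched as a generator of index windows. This is a minimal illustration that assumes non-overlapping trading periods and rolls forward by one trading period each time; individual papers differ in how they stagger the windows, and the function name is hypothetical.

```python
def walk_forward_windows(n_obs, formation, trading):
    """Yield (formation_range, trading_range) index pairs.

    Each formation window is immediately followed by its trading window,
    and the whole scheme then steps forward by one trading period.
    """
    start = 0
    while start + formation + trading <= n_obs:
        split = start + formation
        yield range(start, split), range(split, split + trading)
        start += trading


# Example: 10 observations, 4-bar formation, 2-bar trading
for form_idx, trade_idx in walk_forward_windows(10, 4, 2):
    print(list(form_idx), "->", list(trade_idx))
```

Pair selection would run on each formation range and the chosen pairs would be traded only on the matching trading range.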

The huge advantage of this approach is that it is completely repeatable and free of nuance.

Do real statistical arbitrage pipelines actually look like that?

Not entirely, in my experience.

There are usually two directions the problem is approached from:

- What things make sense considering their economic drivers?
- What things look good across (multiple) recent formation periods?

*Most setups look at things in both these directions, with a dollop of nuance and discretion…*
