One of the things I’ve noticed from staring at the screen all day for the last few months is that most of the large negative returns in US stock indexes have come overnight.
What do you mean by “overnight”?
The core stock trading session for US stocks is between 9:30 am and 4 pm Eastern Time.
That’s when most stock market transactions take place. When we look at daily OHLC (Open High Low Close) stock data, the open price is the first trade of the core 9:30 am session, and the close price is the price of the auction at the end of the 4 pm core trading session.
However, stocks also trade in the “pre-open” or “early trading session” which starts at 6:30 am and in the “late trading session” which goes until 8 pm. Futures on stock indexes also trade most of the day.
I’m interested to see how overnight returns (the jump from the close to the open) differ from intraday returns – and how that relationship may have changed recently.
Intuitively, we’d probably expect to see higher average returns overnight when the market is closed – because it’s much more difficult to hedge and manage our exposures when the cash market is closed, so we might expect to get paid a premium, on average, for taking that risk.
Let’s have a look…
Getting Data
First, we need some daily OHLC (Open, High, Low, Close) data.
Let’s use SPY, an exchange-traded fund that tracks the S&P 500 index.
If you want daily price data and don’t need to pull too much data at a time, then there are a number of free online sources for this, including:
- AlphaVantage
- Tiingo
We can use the alphavantager and riingo packages to pull data from Alpha Vantage and Tiingo respectively.
Let’s use Alpha Vantage. Go here and sign up for a free API key.
We’re going to use alphavantager to pull daily adjusted time series data for SPY and hold it in an R data frame called SPY.
To make this work, you’ll need to tell alphavantager about your API key by running the command av_api_key(MY_API_KEY), replacing MY_API_KEY with your actual API key.
library(alphavantager)

# MY_API_KEY is a placeholder: substitute your own Alpha Vantage key
av_api_key(MY_API_KEY)
SPY <- av_get(symbol = 'SPY', av_fun = 'TIME_SERIES_DAILY_ADJUSTED', outputsize = 'full')
Here’s what our data looks like:
library(dplyr)
library(knitr)       # kable()
library(kableExtra)  # kable_styling(), scroll_box()

SPY %>%
  head(20) %>%
  kable() %>%
  kable_styling(full_width = FALSE, position = 'center') %>%
  scroll_box(width = '800px', height = '300px')
timestamp | open | high | low | close | adjusted_close | volume | dividend_amount | split_coefficient |
---|---|---|---|---|---|---|---|---|
2000-04-10 | 151.7500 | 153.1093 | 150.3125 | 150.8437 | 103.3035 | 9624200 | 0 | 1 |
2000-04-11 | 150.0000 | 151.6250 | 148.3750 | 150.4062 | 103.0038 | 9006400 | 0 | 1 |
2000-04-12 | 150.3750 | 151.1562 | 146.1562 | 146.2812 | 100.1789 | 10779200 | 0 | 1 |
2000-04-13 | 147.4687 | 148.1562 | 143.7812 | 144.2500 | 98.7878 | 12225800 | 0 | 1 |
2000-04-14 | 142.6250 | 142.8125 | 133.5000 | 136.0000 | 93.1379 | 29604000 | 0 | 1 |
2000-04-17 | 135.1875 | 140.7500 | 134.6875 | 140.7500 | 96.3909 | 23918200 | 0 | 1 |
2000-04-18 | 140.5625 | 144.4687 | 139.7812 | 144.4687 | 98.9376 | 11069200 | 0 | 1 |
2000-04-19 | 144.5000 | 145.1250 | 142.5312 | 143.1250 | 98.0174 | 6553700 | 0 | 1 |
2000-04-20 | 143.5625 | 143.9375 | 142.3750 | 143.8125 | 98.4882 | 8537600 | 0 | 1 |
2000-04-24 | 141.5000 | 143.3125 | 140.5000 | 142.2500 | 97.4182 | 12893100 | 0 | 1 |
2000-04-25 | 144.6250 | 148.1562 | 144.4375 | 148.1562 | 101.4630 | 14102000 | 0 | 1 |
2000-04-26 | 147.9687 | 148.7500 | 146.0000 | 146.4843 | 100.3180 | 7711100 | 0 | 1 |
2000-04-27 | 143.0000 | 147.3437 | 143.0000 | 146.0000 | 99.9863 | 15595300 | 0 | 1 |
2000-04-28 | 147.0000 | 147.8593 | 145.0625 | 145.0937 | 99.3656 | 8743400 | 0 | 1 |
2000-05-01 | 146.5625 | 148.4843 | 145.8437 | 147.0625 | 100.7139 | 7328300 | 0 | 1 |
2000-05-02 | 145.5000 | 147.1250 | 144.1250 | 144.1250 | 98.7022 | 9411900 | 0 | 1 |
2000-05-03 | 144.0000 | 144.0000 | 139.7812 | 140.7500 | 96.3909 | 12630700 | 0 | 1 |
2000-05-04 | 142.0000 | 142.3593 | 140.7500 | 141.8125 | 97.1185 | 5963600 | 0 | 1 |
2000-05-05 | 141.0625 | 144.0000 | 140.9375 | 143.5312 | 98.2956 | 7862400 | 0 | 1 |
2000-05-08 | 142.7500 | 143.3750 | 141.8437 | 142.4531 | 97.5572 | 5064100 | 0 | 1 |
Calculate overnight and intraday returns
Now we calculate:
- overnight returns as the % change from the close price to the next day’s open
- intraday returns as the % change from the open to the close of the same day.
(I’ve also calculated close to close returns, which don’t get used in this analysis.)
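SPY_returns <- SPY %>%
  # Ratio that converts raw prices to dividend-adjusted prices
  mutate(adjfactor = adjusted_close / close) %>%
  # Each expression below simplifies to adjfactor * raw price
  mutate(
    open = adjfactor * (open - close) + adjusted_close,
    high = adjfactor * (high - close) + adjusted_close,
    low = adjfactor * (low - close) + adjusted_close,
    close = close * adjfactor
  ) %>%
  mutate(c2c = close / lag(close) - 1) %>%        # previous close to today's close
  mutate(intraday = close / open - 1) %>%         # today's open to today's close
  mutate(overnight = lead(open) / close - 1) %>%  # today's close to the next open
  mutate(overnightpremium = overnight - intraday) %>%
  # Drop rows with missing values: the first (no previous close) and the last (no next open)
  na.omit()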
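To make the alignment of these returns concrete, here is a minimal base R sketch on made-up prices (the numbers are invented for illustration and are not SPY data). It assumes rows are sorted oldest first, as in the table above, and checks that the overnight return into a day, compounded with that day's intraday return, recovers the close to close return.

```r
# Toy prices, invented purely for illustration (not real SPY data)
px <- data.frame(
  open  = c(100, 102, 101),
  close = c(101, 100, 103)
)
n <- nrow(px)

# Intraday: today's open to today's close
intraday <- px$close / px$open - 1

# Overnight: today's close to the next day's open (NA on the last row)
overnight <- c(px$open[-1], NA) / px$close - 1

# Close to close: previous close to today's close (NA on the first row)
c2c <- px$close / c(NA, px$close[-n]) - 1

# The overnight return from the previous row, compounded with today's
# intraday return, should equal today's close to close return
recombined <- c(NA, 1 + overnight[-n]) * (1 + intraday) - 1
print(data.frame(intraday, overnight, c2c, recombined))
```

Each row's overnight value describes the night that follows that row's close, which is why the final row has no overnight return.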
Cumulative overnight and intraday returns
Now let’s plot the cumulative performance of two strategies:
- “intraday” goes long at the open, and holds until the end of the day, and is flat overnight
- “overnight” goes long at the close, holds overnight, and closes on the open the next day
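library(tidyr)    # pivot_longer()
library(ggplot2)

SPY_returns %>%
  # One row per date and period, so each strategy can be compounded separately
  pivot_longer(c(intraday, overnight), names_to = 'period', values_to = 'returns') %>%
  group_by(period) %>%
  # Growth of 1 unit of capital invested in each strategy
  mutate(cumreturns = cumprod(1 + returns)) %>%
  ggplot(aes(x = timestamp, y = cumreturns, color = period)) +
  geom_line() +
  ggtitle('Intraday and Overnight Cumulative SPY Returns')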
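A quick way to sanity-check the cumprod(1 + returns) step is to note that, ignoring transaction costs, the two cumulative curves should multiply back to a simple buy-and-hold position measured from open to open. The sketch below shows this on made-up prices (invented for illustration only); the variable names are mine, not part of the analysis above.

```r
# Toy prices, invented purely for illustration (not real SPY data)
open  <- c(100, 102, 101, 104)
close <- c(101, 100, 103, 105)
n <- length(open)

# Returns for days 1..n-1, and for the nights that follow each of those days
intraday  <- (close / open - 1)[-n]
overnight <- open[-1] / close[-n] - 1

# Growth of 1 unit of capital in each strategy
cum_intraday  <- cumprod(1 + intraday)
cum_overnight <- cumprod(1 + overnight)

# Buy at the first open and mark to each subsequent open
buy_and_hold <- open[-1] / open[1]

# The product of the two curves equals buy-and-hold, up to floating point error
print(cum_intraday * cum_overnight)
print(buy_and_hold)
```

So if one curve ends below 1 while the index has risen over the period, the other curve must account for more than the whole of the gain.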
What do we see?
We see that most of our returns over the full cycle come from holding stock exposure overnight (the green line). In fact, the total return of SPY intraday since 2000 has actually been negative. This is quite remarkable.
We also see that most of the losses in the last two weeks have come overnight, and that most of the gains in the last few weeks have come intraday.
Rolling average difference in intraday and overnight returns
Let’s plot the rolling difference between overnight and intraday returns.
We use the slide_dbl function from the slider package to achieve this.
library(slider)

SPY_returns %>%
  mutate(diff = intraday - overnight) %>%
  # .before = 19 plus the current row gives a 20-observation window
  mutate(diff20 = slide_dbl(diff, mean, .before = 19, .complete = TRUE)) %>%
  na.omit() %>%
  ggplot(aes(x = timestamp, y = diff20)) +
  geom_line() +
  ggtitle('20 day moving average of difference between SPY intraday and overnight returns')
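A note on window length: in slider, .before counts the elements before the current one, so a window of k observations corresponds to .before = k - 1. The sketch below illustrates a trailing mean on a toy vector, using base R's stats::filter as a stand-in for slide_dbl so that it runs without extra packages; the vector and window size are made up for illustration.

```r
# Toy series, invented purely for illustration
x <- c(1, 2, 3, 4, 5, 6)
window <- 3  # number of observations in each trailing window

# Trailing (one-sided) moving average with equal weights.
# sides = 1 uses only the current and past values, so the first
# window - 1 entries are NA, much like .complete = TRUE in slide_dbl.
roll <- as.numeric(stats::filter(x, rep(1 / window, window), sides = 1))

# First two entries are NA, then the means of (1,2,3), (2,3,4), ...
print(roll)
```

With this convention, the first non-missing value appears at position 3, the first point where a full 3-observation window is available.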
You can see how historically anomalous the recent behaviour has been.
Would we bet on that continuing for a while? I probably wouldn’t. I’d just expect this behaviour to revert to normal.
Maybe we can get a little more insight by looking at the 5-day average since 2019:
SPY_returns %>%
  mutate(diff = intraday - overnight) %>%
  # .before = 4 plus the current row gives a 5-observation window
  mutate(diff5 = slide_dbl(diff, mean, .before = 4, .complete = TRUE)) %>%
  filter(timestamp >= '2019-01-01') %>%
  na.omit() %>%
  ggplot(aes(x = timestamp, y = diff5)) +
  geom_line() +
  ggtitle('5 day moving average of difference between SPY intraday and overnight returns')
And that’s exactly what we seem to see. When viewed over a shorter window length, the average difference in overnight and intraday returns does seem to be reverting to its mean.
The overnight drift
This paper looks at the returns from equity index futures and suggests that nearly 100% of those returns have come in one hour between 2 am and 3 am.
This is an insane result, and something well worth looking into…
The overnight drift paper looks very interesting. I wonder if other markets in Asia have these 2 am to 3 am (eastern time) effects too.
is an insightful article…thanks
This is awesome, I just did see it and loved it.
So I started to dive into it, with all the skills you did teach on the FX intraday bootcamp.
Thank you James.
Nice one! Great to hear you’re getting stuck in, Jan. Glad to hear the Bootcamp was helpful too.
Hello James, I’ve written a couple of posts with similar considerations too! The fall in overnight returns last March was really huge! http://www.nightlypatterns.blog
Hi Marco, I just saw this page and was going to tell you about it….but I see you beat me to it! I’ve seen a few sites that attempt to apply your pattern trading concepts here, some rather successfully, but they cost a lot of money.
Nice one, Marco! Thanks for sharing your blog. Really interesting work.
Your analysis is wrong because you need to use adjusted prices.
For example, if you run your program on GASL, you will see a dramatic divergence of two lines, but it’s actually not the case.
If adjusted prices were used, wouldn’t adjusted open prices need to be used as well as adjusted close prices? If that is the case, the adjusted open and adjusted close prices would be adjusted using the same factor, so the analysis should persist.
Yeah, that’s right… This analysis does actually use dividend-adjusted prices – but you see this effect whether you account for dividends or not. In fact, not adjusting for dividends would have the impact of actually reducing overnight returns.
Very nice, I found this trade can be almost perfectly executed, with the exception of the corona crash.
There seems to be a broad effect there, but there are signals shortly before market close, between 6 pm and 9 pm CET, which indicate the most lucrative nights.