What’s the chance that a market effect is real? Monte Carlo permutation tests in Excel

Let’s say you observe some effect in the market and quantify it with simple data analysis.

A good question is, “What are the chances I’d see this effect solely due to chance?”

And using simple Excel tools, we can answer this question without doing any formal statistics.

Before we get into it, it’s worth noting that while answering this can give you some insight into the effect, it relies on you not torturing your data too much.

Imagine you curve fit the hell out of something. By design, you’ve zeroed in on something that is genuinely unusual, but it’s overwhelmingly due to chance because your research process was lacking.

So you want to be careful in your analysis:

  • Keeping things simple will prevent you from doing too much damage.
  • Be honest with yourself about things like cherry-picking a result that fits your hypothesis.  

Even if you find that the effect likely isn’t due to chance, there’s no guarantee that you’ll make money trading it:

  • Maybe it’s a real effect that won’t overcome trading costs.
  • Maybe it’s a real effect that did overcome trading costs in the past, but now everyone knows about it, and it’s decayed.

I’m not trying to dissuade you from using this approach – it’s genuinely useful in understanding your edge – but there’s a broader context to consider.

Day of the week seasonality in crypto

Let’s see how this works for a simple day-of-the-week seasonality effect in crypto.

Here’s the process in a nutshell:

  1. Observe an interesting effect in the market.
    • In this case, we observe that BTC has historically shown anomalously high returns on Tuesdays.
  2. Understand where the actual effect sits within the context of what could happen by chance using simulation:
    • Take the returns and jumble them up randomly – typically, you use a random number generator to do this. This preserves the statistical properties of your data but destroys any seasonal effect.
    • Take the jumbled returns, group these by their randomly assigned day of the week, and aggregate. Record the results.
    • Repeat a few thousand times.
    • Compare the effect to the distribution of random results.

The goal here is to see if any single day’s returns are especially high compared to other days—and then check, by randomisation, how likely it is that we’d observe that effect by chance.

Setting Up Your Excel Workbook

Grab the complete workbook here. It’s quite large (300 MB) so might take a few seconds to download.

I’ll explain how it works.

The first tab, called “data”, contains raw price data for BTC and some calculated fields: Log_Return, Day_of_Week, and Year.

Day of the Week Seasonality

The second tab, called “analysis”, contains a pivot table that calculates the mean return to each day of the week.

Notice that Tuesday (day 2) shows an anomalously high return compared to the other days of the week:

Could this be a real market effect, or is it just a random artefact that is as likely to disappear as continue?  

One thing that would really help answer that question is having a reason for the effect to exist. Is there anything special about Tuesday that would drive outsized BTC returns?

Maybe there is, but I can’t think of anything. So I’m pretty sceptical from the outset. But I’m also open-minded. And crypto is weird.

Consistency through Time

Another thing we can do is to see how consistent the effect is over time. If it’s relatively consistent, I’d have more confidence.

The tab named “yearly” shows another pivot table.

This time, we group our data by day of the week and year and end up with a plot that shows the average return to each day of the week in each year from 2019 through 2025 (noting 2025 contains almost no data).

Interesting. The effet was concentrated in 2020 and 2022 (even though the Tuesday bar for 2022 is low in absolute terms, it’s higher than all the other 2022 bars).

Tuesday’s return was also positive in 2019, 2023 and 2024 – but not uniquely so. It was negative in 2021 only.

This isn’t exactly a slam dunk (nothing ever is). But at this point, I’m still interested enough to ask “What are the chances that we see is due to chance?

We’ll do that next.

Doing statistics the Easy Way

Answering this question requires generating a number of randomised but representative observations that we can compare our actual observations against – a few thousand would be great, but we’ll stick to a few hundred here to make things easy.

We create randomised realisations of our actual returns data – aka a Monte Carlo simulation or a permutation test. It preserves the aggregate statistical properties but removes any time-series effects, allowing us to see how often the seasonal patterns we observe in the actual data arise due to chance.

Now, admittedly, if you know some Python or R, this is easier than doing it in Excel (check out the R version here). But spreadsheets are handy and nearly everyone can drive one.

Here’s how I performed the Monte Carlo test in Excel:

On the tab named “MC_returns” I created 500 realisations of randomised BTC returns. In the first cell of the Series_1 column (cell C2), we enter the following formula:

=SORTBY(data!$D$2:$D$1977, RANDARRAY(ROWS(data!D2:$D$1977)))

This tells Excel to sort the BTC returns data (referenced by data!$D$2:$D$1977 in the function above) by an array of randomly generated numbers the same length as the returns data.

This creates one column of randomised returns.

We can then copy that first cell (no need to copy the entire column) across the first row of each series – and just like that, we have 500 randomly shuffled realisations of our BTC returns.

You end up with this:

It’s a good idea to select all of that random data and do an in-place Copy/Paste Special (Values) operation. This will replace the formulas with the data that was just generated. Without this, the random data will be refreshed every time you change a cell anywhere in the notebook, which will quickly get annoying.

It’s also a good idea to check that this all worked out as expected.

Since we’re just shuffling our data, the log returns should all add up to the same terminal cumulative return – they’ll simply take different paths to get there.

So we can calculate the sum of our randomised returns and plot a few to make sure they look as expected. This is done on the sheet named MC_cumulative_returns.

Plotting a few cumulative returns series looks as expected:

An interesting thing here is that, while the cumulative return plots are all up and to the right, you don’t see much evidence of the trendiness that we observe in the actual BTC price series.

The up moves in the actual data tend to be clustered together, but in the randomised data, they don’t.

Without doing any actual analysis, that’s suggestive that the trendiness we do observe is unlikely due to randomness.

Now that we have 500 realisations of randomised data, let’s look at how our actual Tuesday return compares.

Let’s consider the chances of any day of the week having a return as large as our Tuesday return. Since we have no good reason to think that Tuesday is special, this feels like a more relevant question.

To do this, we plot a histogram of the randomised day-of-week returns, and note where our actual Tuesday return lies on this distribution.

This answers the question “what is the chance of the return we observed happening on any day due to chance?”

This is done on the sheet named MC_analysis. We get the following histogram:

The histogram shows the frequency with which certain average day-of-the-week returns appear in our randomised data. We want to know where our actual Tuesday average return sits on this histogram.

I don’t love Excel’s default histogram. The x-axis is hard to decipher, and it’s non-trivial to add a line that denotes where our actual Tuesday return lies. We’ll come to a better visualisation shortly.

But first, we can calculate our actual Tuesday return’s position with the PERCENTRANK function as follows:

=PERCENTRANK(C2:SH8, GETPIVOTDATA("Log_Return",analysis!$A$3,"Day_of_Week",2))

This takes all the average daily returns from our random data (referenced by C2:SH8 in the formula above) and calculates where the value we’re interested in (which we get with the GETPIVOTDATA function) lies in this distribution.

The result is 0.964, which tells us that our actual Tuesday return is greater than 96.4% of the randomised realisations. Most people would consider this significant.

To see this visually, we use the empirical cumulative distribution function (on the sheet “ECDF”) which shows the same information as the histogram, but plots it in a cumulative fashion.

The x-axis shows the average day-of-the-week return value, and the y-axis shows the percentage of values that are smaller than the value on the x-axis.

Our actual Tuesday return is plotted in orange.

You can see that our actual return is larger than about 96% of random day-of-week returns.

Comparison with Classical Statistics

The t-test looks at two samples and computes the likelihood that they are taken from two different populations.

We’ll set up a sheet that performs a t-test on the Tueday and non-Tuesday returns to see if they’re significantly different.

In the sheet named “t-test”, copy across the date, day-of-the-week, and log return data. Create a new column, is_season, and set the values according to an if statement. The first cell looks like this:

=IF(C2=2, 1, 0)

We then copy that formula down the entire is_season column.

Then, we can do a t-test on the Tuesday vs non-Tuesday returns by filtering our returns on the is_season column:

=T.TEST(FILTER(B:B, D:D=0),FILTER(B:B, D:D=1),2,3)`

This results in a p-value of 0.1, indicating weak evidence against the null hypothesis (that the Tuesday and non-Tuesday returns come from the same distribution). It does not meet the conventional 5% significance threshold, but some might consider it suggestive.

I prefer the empirical Monte Carlo approach over this statistical parametric test because:

  • The empirical test is intuitively much easier to understand than the parametric test.
  • Without training in statistical methods, you don’t really know what the t-test is doing.
  • On the other hand, it’s obvious what our empirical Monte Carlo approach is doing.

In addition, the t-test has certain assumptions baked in, such as normality. It’s not clear how much a violation of this assumption matters, but the empirical Monte Carlo approach has no assumptions about the distribution of the data.

That said, the Monte Carlo approach does assume that the data are independent and identically distributed (i.i.d.). Autocorrelation (aka trendiness, which I believe is present in BTC returns) would invalidate this assumption – but since it is likely mild, and we are exploring broad seasonal patterns, the Monte Carlo approach is likely reasonably robust.

Summary

Monte Carlo can help answer the question “What are the chances I’d see this effect by chance?”

Of course, it comes with the usual caveats around overfitting and cherry-picking results to fit your hypothesis. It’s also worth thinking about the structure of your data – for instance, is autocorrelation present?

Remember that analysis of market effects requires gathering and weighing evidence from many different angles. Monte Carlo can be considered one of many such windows into an effect.

In this case, we see weak evidence that BTC returns on Tuesdays might be higher than on other days. The Monte Carlo test wasn’t overwhelmingly in favour of the effect. There’s also no strong structural or risk-based reason for this to hold. Plus, the effect isn’t consistent over time.

Without a solid foundation to back it up, I’m shelving this idea for now.

Remember that even if you find that an effect likely isn’t due to chance, there’s no guarantee that you’ll make money trading it:

  • Maybe it’s real but too small to overcome trading costs.
  • Maybe it’s a real effect, but has decayed because it’s well-known.

Often, the art in applying this technique is in thinking about the question that you want answered and framing the problem sensibly. Sometimes it will be obvious, as in our example here, and other times less so.

But Monte Carlo is a useful tool to add to your data analysis toolkit, allowing you to make meaningful statistical comparisons without resorting to classical statistics.

Excel can be used to perform such analysis without the use of specialised statistical software. In this example, we:

  • Computed daily returns
  • Grouped by day of week
  • Measured average returns
  • Ran basic randomisation tests
  • Performed a classical t-test

Feel free to adapt or extend this Excel method to investigate other time-based seasonality, or any other hypothesis you want to check with a randomisation test.

Leave a Comment