From Potential to Proven: Why AI is Taking Off in the Finance World

This article is a departure from the quantitative research that usually appears on the Robot Wealth blog. Until recently, I was working as a machine learning consultant to financial services organizations and trading firms in Australia and the Asia Pacific region. A few months ago, I left that world behind to join an ex-client’s proprietary trading firm. I thought I’d jot down a few thoughts about what I saw during my consulting time because I witnessed some interesting changes in the industry in a relatively short period of time that I think you might find interesting too. Enjoy!

Perceptions around Artificial Intelligence (AI) in the finance industry have changed significantly, as scepticism gives way to a rising Fear of Missing Out (FOMO) among asset managers and trading houses.

Big Data and AI Strategies – Machine Learning and Alternative Data Approaches to Investing, JP Morgan’s 280-page report on the future of machine learning in the finance industry, paints a picture of a future in which alpha is generated from data sources such as social media, satellite imagery, and machine-classified company filings and news releases.

Well, that future is already here.

Amongst value managers, I saw scepticism replaced by a sense of anxiety over being late to the party. The first question I was asked by nearly every value manager I met over the last year or so was: “what is everyone else doing with machine learning?”

This sense of FOMO is arising now because general knowledge of the potential of machine learning has reached a critical mass amongst the decision makers and management across the industry.

Despite the seclusion inherent in our industry, where ‘secret sauce’ is closely guarded, the fruits of the labour of the early adopters are gaining ever-increasing public exposure, shifting the perception of the technology from ‘potential’ to ‘proven’.

In short, finance is catching up to the many other industries where this technology is already in common use.

Shifting attitudes within the quant community

When my consulting company first started applying and recommending machine learning solutions to financial problems, we encountered mixed attitudes from the industry. While a few were enthusiastic adopters who could see the potential, the attitude that machine learning was less than useful – even dangerous – and dismissals of the technology as ‘voodoo science’ were incredibly common.

Surprisingly, these attitudes often came from other quant researchers. 

Within the quant community, I’ve witnessed first-hand this attitude gradually giving way to one of recognition of machine learning as a useful tool. I’ve even noted some folks who decried the approach now calling themselves ‘machine learning experts’ on their business cards and LinkedIn profiles. Times really have changed, and they changed in an astonishingly short space of time.

More recently, I’ve seen an even more significant change, as participants increasingly recognise machine learning as the key to unlocking the next generation of alpha. Suddenly, it feels like the prevailing attitude towards machine learning and AI is one of excited and enthusiastic adoption, as opposed to reluctance and scepticism.

Amid the growing consensus that alpha is discoverable in alternative data, our own work and the work of others suggest that alpha from such sources may be uncorrelated with traditional factors like value and momentum. Perhaps, for the time being at least, they can coexist and even provide new dimensions of diversification.

Changing the way we look for alpha 

Alpha generation has always been about information advantage – either having access to uncommon insights gained through ingenuity or common insights acted upon before everyone else.

Machine learning and artificial intelligence are simply the modern evolution of a repeating historical pattern in the context of today’s big data world. For example, interpreting satellite imagery of a retailer’s car park reveals insight about its sales figures before they are released to the market. Deriving sentiment from Twitter or Weibo and relating it to an asset’s returns provides an uncommon insight gained through ingenuity.

Artificial intelligence excels at tasks like these to the point that such AI is rapidly becoming a commodity.

As the pool of data (be it alternative, big, structured or unstructured) continues its exponential growth, machine learning and artificial intelligence tools will increasingly be adopted for processing and unravelling it – simply because they are the best tools for the job.

JP Morgan believes there will come a time when they are the only tools for the job.

My experience tells me that that time has already arrived – fund managers who are slow to the party would do well to get on board to not only build competitive advantage, but to maintain what they’ve already got.


Have you witnessed a shift in the way machine learning and artificial intelligence are viewed and used in the finance industry? I’d love to hear about other people’s experiences in the comments.

Getting Started with Neural Networks for Algorithmic Trading

This article is adapted from one of the units of Advanced Algorithmic Trading. If you like what you see, check out the entire curriculum here. Find out what Robot Wealth is all about here.

If you’re interested in using artificial neural networks (ANNs) for algorithmic trading, but don’t know where to start, then this article is for you. Normally if you want to learn about neural networks, you need to be reasonably well versed in matrix and vector operations – the world of linear algebra. This article is different. I’ve attempted to provide a starting point that doesn’t involve any linear algebra and have deliberately left out all references to vectors and matrices. If you’re not strong on linear algebra, but are curious about neural networks, then I think you’ll enjoy this introduction. In addition, if you decide to take your study of neural networks further, when you do inevitably start using linear algebra, it will probably make a lot more sense as you’ll have something of a head start.

The best place to start learning about neural networks is the perceptron. The perceptron is the simplest possible artificial neural network, consisting of just a single neuron and capable of learning a certain class of binary classification problems.1 Perceptrons are the perfect introduction to ANNs and if you can understand how they work, the leap to more complex networks and their attendant issues will not be nearly as far. So we will explore their history, what they do, how they learn, where they fail. We’ll build our own perceptron from scratch and train it to perform different classification tasks which will provide insight into where they can perform well, and where they are hopelessly outgunned. Lastly, we’ll explore one way we might apply a perceptron in a trading system.2

A Brief History of the Perceptron

The perceptron has a long history, dating back to at least the mid 1950s. Following its discovery, the New York Times ran an article that claimed that the perceptron was the basis of an artificial intelligence (AI) that would be able to walk, talk, see and even demonstrate consciousness. Soon after, this was proven to be hyperbole on a staggering scale, when the perceptron was shown to be wholly incapable of classifying certain types of problems. The disillusionment that followed essentially led to the first AI winter, and since then we have seen a repeating pattern of hyperbole followed by disappointment in relation to artificial intelligence.3

Still, the perceptron remains a useful tool for some classification problems and is the perfect place to start if you’re interested in learning more about neural networks. Before we demonstrate it in a trading application, let’s find out a little more about it.

Artificial Neural Networks: Modelling Nature

Algorithms modelled on biology are a fascinating area of computer science. Undoubtedly you’ve heard of the genetic algorithm, which is a powerful optimization tool modelled on evolutionary processes. Nature has been used as a model for other optimization algorithms, as well as the basis for various design innovations. In this same vein, ANNs attempt to learn relationships and patterns using a somewhat loose model of neurons in the brain. The perceptron is a model of a single neuron.4

In an ANN, neurons receive a number of inputs, weight each of those inputs, sum the weighted inputs, and then transform that sum using a special function called an activation function, of which there are many possible types. The output of that activation function is then either used as the prediction (in a single neuron model) or is combined with the outputs of other neurons for further use in more complex models, which we’ll get to in another article.

Here’s a sketch of that process in an ANN consisting of a single neuron:

Here, \(x_1, x_2, \ldots\) are the inputs. \(b\) is called the bias term; think of it like the intercept term in a linear model \(y = mx + b\). \(w_1, w_2, \ldots\) are the weights applied to each input. The neuron first sums the weighted inputs (and the bias term), represented by \(S\) in the sketch above. Then, \(S\) is passed to the activation function, which simply transforms \(S\) in some way. The output of the activation function, \(z\), is then the output of the neuron.

The idea behind ANNs is that by selecting good values for the weight parameters (and the bias), the ANN can model the relationships between the inputs and some target. In the sketch above, \(z\) is the ANN’s prediction of the target given the input variables.
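To make the single-neuron calculation concrete, here is a minimal sketch in Python (not the R used later in this article). The input, weight and bias values are made up for illustration, and the `identity` activation is just a placeholder assumption so that the weighted sum is visible in the output.

```python
def neuron_output(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus the bias, passed through an activation."""
    s = bias
    for x, w in zip(inputs, weights):
        s += w * x            # weight each input and accumulate the sum S
    return activation(s)      # z, the output of the neuron


def identity(s):
    """Placeholder activation that returns S unchanged (an assumption)."""
    return s


# Hypothetical values for a neuron with four inputs
x = [1.0, 2.0, 3.0, 4.0]
w = [0.5, -1.0, 0.25, 0.0]
b = 0.5

z = neuron_output(x, w, b, identity)
print(z)
```

Swapping `identity` for a different function changes how the same weighted sum is turned into an output, which is all an activation function does.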

In the sketch, we have a single neuron with four weights and a bias parameter to learn. It isn’t uncommon for modern neural networks to consist of hundreds of neurons across multiple layers, where the output of each neuron in one layer is input to all the neurons in the next layer. Such a fully connected network architecture can easily result in many thousands of weight parameters. This enables ANNs to approximate any arbitrary function, linear or nonlinear.

The perceptron consists of just a single neuron, like in our sketch above. This greatly simplifies the problem of learning the best weights, but it also has implications for the class of problems that a perceptron can solve.

What’s an Activation Function?

The purpose of the activation function is to take the input signal (that’s the weighted sum of the inputs and the bias) and turn it into an output signal. There are many different activation functions that convert an input signal in a slightly different way, depending on the purpose of the neuron.

Recall that the perceptron is a binary classifier. That is, it predicts either one or zero, on or off, up or down, etc. It follows then that our activation function needs to convert the input signal (which can be any real-valued number) into either a one or a zero5 corresponding to the predicted class.

In biological terms, think of this activation function as firing (activating) the neuron (telling it to pass the signal on to the next neuron) when it returns 1, and doing nothing when it returns 0.

What sort of function accomplishes this? It’s called a step function, and its mathematical expression looks like this:

\[f(z) = \begin{cases} 1, & \text{if } z > 0 \\ 0, & \text{otherwise} \end{cases}\]

And when plotted, it looks like this:

This function takes any weighted sum of the inputs (\(S\)) and converts it into a binary output (either 1 or 0). The trick to making this useful is finding (learning) a set of weights, \(w\), that lead to good predictions using this activation function.
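The step function is short enough to write out directly. This is a minimal Python sketch of the expression above; the name `step` and the sample inputs are illustrative choices only.

```python
def step(s):
    """Step activation: 1 if the weighted sum is positive, otherwise 0."""
    return 1 if s > 0 else 0


# A negative sum, a sum of exactly zero, and a positive sum
outputs = [step(s) for s in (-2.0, 0.0, 0.3)]
print(outputs)
```

Note that a sum of exactly zero falls into the "otherwise" branch and so returns 0.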

How Does a Perceptron Learn?

We already know that the inputs to a neuron get multiplied by some weight value particular to each individual input. The sum of these weighted inputs is then transformed into an output via an activation function. In order to find the best values for our weights, we start by assigning them random values and then start feeding observations from our training data to the perceptron, one by one. Each output of the perceptron is compared with the actual target value for that observation, and, if the prediction was incorrect, the weights are adjusted so that the prediction would have been closer to the actual target. This is repeated until the weights converge.

In perceptron learning, the weight update function is simple: when a target is misclassified, we simply take the sign of the error and then add the inputs that led to the misclassification to the existing weights, or subtract them.

If that target was -1 and we predicted 1, the error is \(-1 - 1 = -2\). We would then subtract each input value from the current weights (that is, \(w_i = w_i - x_i\)). If the target was 1 and we predicted -1, the error is \(1 - (-1) = 2\), so we then add the inputs to the current weights (that is, \(w_i = w_i + x_i\)).6

This has the effect of moving the classifier’s decision boundary (which we will see below) in the direction that would have helped it classify the last observation correctly. In this way, weights are gradually updated until they converge. Sometimes (in fact, often) we’ll need to iterate through each of our training observations more than once in order to get the weights to converge. Each sweep through the training data is called an epoch.
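The update rule for a single misclassified observation can be sketched as follows, in Python rather than R. The text above only describes the weight update; treating the bias as a weight on a constant input of 1, and updating it the same way, is an assumption made here. The numbers are made up.

```python
def update_weights(weights, bias, inputs, target, prediction):
    """Apply one perceptron update, with targets and predictions coded as -1 or 1."""
    error = target - prediction            # -2, 0 or 2
    if error == 0:
        return weights, bias               # correct prediction: leave weights alone
    sign = 1 if error > 0 else -1          # only the sign of the error is used
    new_weights = [w + sign * x for w, x in zip(weights, inputs)]
    new_bias = bias + sign                 # assumption: bias acts on a constant input of 1
    return new_weights, new_bias


# Target was -1 but we predicted 1, so the inputs are subtracted from the weights
w, b = update_weights([0.5, -1.0], 0.0, [2.0, 1.0], target=-1, prediction=1)
print(w, b)
```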

Implementing a Perceptron from Scratch

Next, we’ll code our own perceptron learning algorithm from scratch using R. We’ll train it to classify a subset of the iris data set.7

In the full iris data set, there are three species. However, perceptrons are for binary classification (that is, for distinguishing between two possible outcomes). Therefore, for the purpose of this exercise, we remove all observations of one of the species (here, virginica), and train a perceptron to distinguish between the remaining two. We also need to convert the species classification into a binary variable: here we use 1 for the first species, and -1 for the other. Further, there are four variables in addition to the species classification: petal length, petal width, sepal length and sepal width.  For the purposes of illustration, we’ll train our perceptron using only petal length and width and drop the other two measurements. These data transformations result in the following plot of the remaining two species in the two-dimensional feature space of petal length and petal width:

The plot suggests that petal length and petal width are strong predictors of species – at least in our training data set. Can a perceptron learn to tell them apart?

Training our perceptron is simply a matter of initializing the weights (here we initialize them to zero) and then implementing the perceptron learning rule, which just updates the weights based on the error of each observation with the current weights. We do that in a for() loop which iterates over each observation, making a prediction based on the values of petal length and petal width of each observation, calculating the error of that prediction and then updating the weights accordingly.

In this example we perform five sweeps through the entire data set, that is, we train the perceptron for five epochs. At the end of each epoch, we calculate the total number of misclassified training observations, which we hope will decrease as training progresses. Here’s the code:
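As a stand-in sketch of the training loop just described, here is a minimal version in Python rather than R. It runs on a handful of made-up petal measurements, not the actual iris data, so the error counts it prints are for this toy set only. The function names and the bias update are illustrative assumptions.

```python
def predict(weights, bias, x):
    """Return 1 if the weighted sum is positive, otherwise -1."""
    s = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if s > 0 else -1


def train_perceptron(X, y, epochs):
    """Train with the perceptron rule; return weights, bias and errors per epoch."""
    weights = [0.0] * len(X[0])            # initialize weights to zero
    bias = 0.0
    errors_per_epoch = []
    for _ in range(epochs):
        for xi, target in zip(X, y):
            if predict(weights, bias, xi) != target:
                # add the inputs when the target is 1, subtract when it is -1
                weights = [w + target * v for w, v in zip(weights, xi)]
                bias += target             # assumption: bias updated like a weight
        # count misclassified training observations at the end of the epoch
        errors = sum(predict(weights, bias, xi) != t for xi, t in zip(X, y))
        errors_per_epoch.append(errors)
    return weights, bias, errors_per_epoch


# Made-up (petal length, petal width) pairs: first class coded 1, second coded -1
X = [[1.4, 0.2], [1.3, 0.2], [1.5, 0.3], [1.6, 0.2],
     [4.7, 1.4], [4.5, 1.5], [4.0, 1.3], [4.9, 1.5]]
y = [1, 1, 1, 1, -1, -1, -1, -1]

weights, bias, errors = train_perceptron(X, y, epochs=5)
print(errors)
```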

Here’s the plot of the error rate:

We can see that it took two epochs to train the perceptron to correctly classify the entire dataset. After the first epoch, the weights hadn’t been sufficiently updated. In fact, after epoch 1, the perceptron predicted the same class for every observation! Therefore it misclassified 50 out of the 100 observations (there are 50 observations of each species in the data set). However after two epochs, the perceptron was able to correctly classify the entire data set by learning appropriate weights.

Another, perhaps more intuitive, way to view the weights that the perceptron learns is in terms of its decision boundary. In geometric terms, for the two-dimensional feature space in this example, the decision boundary is a straight line separating the perceptron’s predictions. On one side of the line, the perceptron always predicts -1, and on the other, it always predicts 1.8

We can derive the decision boundary from the perceptron’s activation function:

\[f(z) = \begin{cases} 1, & \text{if } z > 0 \\ 0, & \text{otherwise} \end{cases}\]

where \(z = w_1x_1 + w_2x_2 + b\)

The decision boundary is simply the line that defines the location of the step in the activation function. That step occurs at \(z=0\), so our decision boundary is given by

\[w_1x_1 + w_2x_2 + b = 0\]

Equivalently \[x_2 = -\frac{w_1}{w_2}x_1 - \frac{b}{w_2}\]

which defines a straight line in \(x_1, x_2\) feature space.
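Turning a set of learned weights into a line you can plot is a one-line calculation. Here is a minimal Python sketch of the rearrangement above; the weights passed in are made up for illustration, and the guard against a zero \(w_2\) is an added assumption for the case where the boundary is vertical.

```python
def decision_boundary(w1, w2, b):
    """Return (slope, intercept) of the line w1*x1 + w2*x2 + b = 0, solved for x2."""
    if w2 == 0:
        raise ValueError("w2 is zero: the boundary is vertical and has no slope")
    return -w1 / w2, -b / w2


# Hypothetical learned weights
slope, intercept = decision_boundary(w1=-0.5, w2=-1.0, b=2.0)
print(slope, intercept)
```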

In our iris example, the perceptron learned the following decision boundary:

Here’s the complete code for training this perceptron and producing the plots shown above:


Congratulations! You just built and trained your first neural network.

Let’s now ask our perceptron to learn a slightly more difficult problem. Using the same iris data set, this time we remove the setosa species and train a perceptron to classify virginica and versicolor on the basis of their petal lengths and petal widths. When we plot these species in their feature space, we get this:

This looks like a slightly more difficult problem, as this time the difference between the two classifications is not as clear cut. Let’s see how our perceptron performs on this data set.

This time, we introduce the concept of the learning rate, which is important to understand if you decide to pursue neural networks beyond the perceptron. The learning rate controls the speed with which weights are adjusted during training. We simply scale the adjustment by the learning rate: a high learning rate means that weights are subject to bigger adjustments. Sometimes this is a good thing, for example when the weights are far from their optimal values. But sometimes this can cause the weights to oscillate back and forth between two high-error states without ever finding a better solution. In that case, a smaller learning rate is desirable, which can be thought of as fine tuning of the weights.

Finding the best learning rate is largely a trial and error process, but a useful approach is to reduce the learning rate as training proceeds. In the example below, we do that by scaling the learning rate by the inverse of the epoch number.
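The decaying learning rate can be sketched in a few lines of Python (rather than R). This assumes epochs are counted from 1 and that the rate simply multiplies the usual add-or-subtract update; the function names and numbers are illustrative.

```python
def learning_rate(initial_rate, epoch):
    """Learning rate for a given epoch (counted from 1), scaled by 1/epoch."""
    return initial_rate / epoch


def scaled_update(weights, inputs, target, rate):
    """Perceptron update for a misclassified observation, scaled by the rate."""
    return [w + rate * target * x for w, x in zip(weights, inputs)]


rates = [learning_rate(1.0, epoch) for epoch in (1, 2, 4)]
print(rates)

# By epoch 4 the same misclassification moves the weights a quarter as far
new_weights = scaled_update([0.0, 0.0], [2.0, 4.0], target=-1, rate=rates[2])
print(new_weights)
```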

Here’s a plot of our error rate after training in this manner for 400 epochs:

You can see that training proceeds much less smoothly and takes a lot longer than last time, which is a consequence of the classification problem being more difficult. Also note that the error rate is never reduced to zero, that is, the perceptron is never able to perfectly classify this data set. Here’s a plot of the decision boundary, which demonstrates where the perceptron makes the wrong predictions:

Here’s the code for this perceptron:

Where Do Perceptrons Fail?

In the first example above, we saw that our versicolor and setosa iris species could be perfectly separated by a straight line (the decision boundary) in their feature space. Such a classification problem is said to be linearly separable and (spoiler alert) is where perceptrons excel. In the second example, we saw that versicolor and virginica were almost linearly separable, and our perceptron did a reasonable job, but could never perfectly classify the whole data set. In this next example, we’ll see how they perform on a problem that isn’t linearly separable at all.

Using the same iris data set, this time we classify our iris species as either versicolor or other (that is setosa and virginica get the same classification) on the basis of their petal lengths and petal widths. When we plot these species in their feature space, we get this:

This time, there is no straight line that can perfectly separate the two species. Let’s see how our perceptron performs now. Here’s the error rate over 400 epochs and the decision boundary:

We can see that the perceptron fails to distinguish between the two classes. This is typical of the performance of the perceptron on any problem that isn’t linearly separable. Hence my comment at the start of this unit (see footnote 2) that I’m skeptical that perceptrons can find practical application in trading. Maybe you can find a use case in trading, but even if not, they provide an excellent foundation for exploring more complex networks which can model more complex relationships.
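A tiny one-dimensional toy problem shows the same failure. In this Python sketch (made-up numbers, not the iris data), the class coded 1 is sandwiched between two groups coded -1, much as versicolor sits between the other two species. No single threshold can separate them, so at least two observations are misclassified at the end of every epoch, however long we train.

```python
def predict(w, b, x):
    """Single-input perceptron prediction, coded as -1 or 1."""
    return 1 if w * x + b > 0 else -1


def errors_by_epoch(xs, ys, epochs):
    """Train with the perceptron rule and record misclassifications per epoch."""
    w, b = 0.0, 0.0
    history = []
    for _ in range(epochs):
        for x, t in zip(xs, ys):
            if predict(w, b, x) != t:
                w += t * x
                b += t                     # assumption: bias updated like a weight
        history.append(sum(predict(w, b, x) != t for x, t in zip(xs, ys)))
    return history


# The middle group (coded 1) lies between two groups coded -1
xs = [1.0, 2.0, 4.0, 5.0, 7.0, 8.0]
ys = [-1, -1, 1, 1, -1, -1]

history = errors_by_epoch(xs, ys, epochs=50)
print(min(history))
```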

A Perceptron Implementation for Algorithmic Trading

The Zorro trading automation platform includes a flexible perceptron implementation. If you haven’t heard of Zorro, it is a fast, accurate and powerful backtesting/execution platform that abstracts a lot of tedious programming tasks so that the user is empowered to concentrate on efficient research. It uses a simple C-based scripting language that takes almost no time to learn if you already know C, and a week or two if you don’t (although of course mastery can take much longer). This makes it an excellent choice for independent traders and those getting started with algorithmic trading. While the software sacrifices little for the abstraction that enables efficient research, experienced quant developers or those with an abundance of spare time might take issue with that aspect of the software, as it’s not open source, so it isn’t for everyone. But it’s a great choice for beginners and DIY traders who maintain a day job. If you want to learn to use Zorro, even if you’re not a programmer, we can help.

Zorro’s perceptron implementation allows us to define any features we think are pertinent, and to specify any target we like, which Zorro automatically converts to a binary variable (by default, positive values are given one class; negative values the other). After training, Zorro’s perceptron predicts either a positive or negative value corresponding to the positive and negative classes respectively.

Here’s the Zorro code for implementing a perceptron that tries to predict whether the 5-day price change in the EUR/USD exchange rate will be greater than 200 pips, based on recent returns and volatility, whose predictions are tested under a walk-forward framework:

Zorro firstly outputs a trained perceptron for predicting long and short 5-day price moves greater than 200 pips for each walk-forward period, and then tests their out-of-sample predictions.
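To illustrate the kind of target described above, here is a minimal Python sketch (not Zorro’s lite-C, and not Zorro’s own method) that labels each day according to whether the price rose by more than 200 pips over the following five days. It covers the long side only. The pip size of 0.0001, the function name and the closing prices are all assumptions made for illustration.

```python
PIP = 0.0001  # assumption: one pip for EUR/USD quoted to four decimal places


def label_moves(closes, horizon=5, threshold_pips=200):
    """Label each day 1 if the close `horizon` days later is more than
    `threshold_pips` higher, otherwise -1. The last `horizon` days get no label."""
    labels = []
    for i in range(len(closes) - horizon):
        change_pips = (closes[i + horizon] - closes[i]) / PIP
        labels.append(1 if change_pips > threshold_pips else -1)
    return labels


# Made-up daily closes
closes = [1.1000, 1.1050, 1.1100, 1.1150, 1.1200, 1.1250, 1.1150, 1.1100]
labels = label_moves(closes)
print(labels)
```

The final five observations have no label because their five-day outcome is not yet known, which is also why a walk-forward test must keep training and test periods separate.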

Here’s the walk-forward equity curve of our example perceptron trading strategy:

I find this result particularly interesting because I expected the perceptron to perform poorly on market data, which I find hard to imagine falling into the linearly separable category. However, sometimes simplicity is not a bad thing, it seems.


I hope this article not only whetted your appetite for further exploration of neural networks, but also facilitated your understanding of the basic concepts, without getting too hung up on the math.

I intended for this article to be an introduction to neural networks where the perceptron was to be nothing more than a learning aid. However, given the surprising walk-forward result from our simple trading model, I’m now going to experiment with this approach a little further. If this interests you too, some ideas you might consider include extending the backtest, experimenting with different signals and targets, testing the algorithm on other markets and of course considering data mining bias. I’d love to hear about your results in the comments.

Thanks for reading!

We Love Free Data: Replacing Yahoo Finance Market Data

In keeping with our recent theme of providing useful tidbits of algo trading practicalities, here’s an elegant solution that resolves Yahoo’s unceremonious exit from the free financial data space.

Regular readers would know that I use various tools in my algo trading stack, but the one I keep coming back to, particularly when I’m ready to start running serious simulations, is Zorro. Not only is it a fast, accurate, and powerful backtester and execution engine, the development team is clearly committed to solving issues and adding features that really matter, from a practical perspective. This is another example of the speedy and elegant resolution of a serious issue – namely, the loss of free access to good quality, properly adjusted equities data, thanks to Yahoo’s exit.

Zorro version 1.60 is currently undergoing its final stages of beta testing and will likely be released publicly in the coming days. The latest version includes integration with Alpha Vantage’s API, providing access to free, high quality, properly adjusted stock and ETF price data. All you need to do to use it is sign up at Alpha Vantage for a free API key, then enter your key in Zorro’s initialization file. Then, you can use Zorro’s assetHistory() function with the arguments Name and FROM_AV to download data for the stock or ETF denoted by Name directly from the Alpha Vantage database.

From my testing, this can be used as a direct replacement in any Zorro script that previously used Yahoo data, and it seems to load even faster than Yahoo data did. Here’s a simple example that downloads daily data for a portfolio of ETFs, saves that data locally, then plots returns series for each component and a price chart for one:

And here’s the resulting plot:

Looks good! Now I just need to replace Yahoo Finance with Alpha Vantage in all those scripts….

For readers who are yet to take Zorro for a test drive, I firmly believe that it is the most useful simulation and trading tool for its price available on the market today (and the free version packs a mighty punch as well). I’ve watched the platform go from strength to strength as new and better features are added with almost clockwork regularity. It will undoubtedly become ubiquitous as more and more people find out about it.

To become a Zorro ninja and learn to leverage features like the one shown here, join the Robot Wealth community, where you can access courses that will get you up to speed with robust strategy development, even if you’re a novice coder. 

How to Run Trading Algorithms on Google Cloud Platform in 6 Easy Steps

Earlier this year, I attended the Google Next conference in San Francisco and gained some first-hand perspective into what’s possible with Google’s cloud infrastructure. Since then, I’ve been leaning on Google Cloud Platform (GCP) to run my trading algorithms (and more) and it has become an important tool in my workflow.

In this post, I’m going to show you how to set up a GCP cloud compute instance to act as a server for hosting a trading algorithm. I’ll also discuss why such a setup can be a good option and when it might pay to consider alternatives. But cloud compute instances are just a tiny fraction of the whole GCP ecosystem, so before we go any further, let’s take a high level overview of the various components that make up GCP.

What is Google Cloud Platform?

GCP consists of a suite of cloud storage, compute, analytics and development infrastructure and services. Google says that GCP runs on the very same infrastructure that Google uses for its own products, such as Google Search. This suite of services and infrastructure goes well beyond simple cloud storage and compute resources, providing some very handy and affordable machine learning, big data, and analytics tools.

GCP consists of:

  • Google Compute Engine: on-demand virtual machines and an application development platform.
  • Google Storage: scalable object storage; like an (almost) infinite disk drive in the cloud.
  • BigTable and Cloud SQL: scalable NoSQL and SQL databases hosted in the cloud.
  • Big Data Tools:
    • BigQuery: big data warehouse geared up for analytics
    • DataFlow: data processing management
    • DataProc: managed Spark and Hadoop service
    • DataLab: analytics and visualization platform, like a Jupyter notebook in the cloud.
    • Data Studio: for turning data into nice visualizations and reports
  • Cloud Machine Learning: train your own models in the cloud, or access Google’s pre-trained neural network models for video intelligence, image classification, speech recognition, text processing and language translation.
  • Cloud Pub/Sub: send and receive messages between independent applications.
  • Management and Developer Tools: monitoring, logging, alerting and performance analytics, plus command line/powershell tools, hosted git repositories, and other tools for application development.
  • More that I haven’t mentioned here!

The services and infrastructure generally play nicely with each other and with the standard open source tools of development and analytics. For example, DataLab integrates with BigQuery and Cloud Machine Learning and runs Python code. Google have tried to make GCP a self-contained, one-stop-shop for development, analytics, and hosting. And from what I have seen, they are succeeding.

Using Google Compute Engine to Host a Trading Algorithm


Google Compute Engine (GCE) provides virtual machines (VMs) that run on hardware located in Google’s global network of data centres (a VM is simply an emulation of a computer system that provides the functionality of a physical computer). You can essentially use a VM just like you would a normal computer, without actually owning the requisite hardware. In the example below, I used a VM instance to:

  • Host and run some software applications (Zorro and R) that execute the code for the trading system.
  • Connect to a broker to receive market data and execute trades (in this case, using the Interactive Brokers IB Gateway software).

GCE allows you to quickly launch an instance using predefined CPU, RAM and storage specifications, as well as to create your own custom machine. You can also select from several pre-defined ‘images’, which consist of the operating system (both Linux and Windows options are available), its configuration and some standard software. What’s really nice is that GCE enables you to create your own custom image that includes the software and tools specific to your use case. This means that you don’t have to upload your software and trading infrastructure each time you want to launch a new instance – you can simply create an instance from an image that you saved previously.

Before we get into a walk-through of setting up an instance and running a trading algorithm, I will touch on the advantages and disadvantages of GCE for this use case, as well as the cost.

Pros and Cons of Running Trading Algorithms on Google Compute Engine

There’s a lot to like about using GCE for managing one’s trading infrastructure. But of course, there will always be edge cases where other solutions will be more suitable. I can only think of one (see below), but if you come up with more, I’d love to hear about them in the comments.


  • GCE abstracts the need to maintain or update infrastructure, which allows the trader to focus on high-value tasks instead, like performance monitoring and further research.
  • The cost of a cloud compute instance capable of running a trading algorithm is very reasonable (I’ll go into specifics below). In addition, you only pay for what you use, but can always increase the available resources if needed.
  • Imaging: it is possible to create an ‘image’ of your operating system configuration and any applications/packages necessary to run your algorithm. You can start a new compute instance with that image without having to manually install applications and configure the operating system. This is a big time-saver.
  • Scalability: if you find that you need more compute resources, you can add them easily, however this will interrupt your algorithm.
  • Security: Google claim to have excellent security and employ a team of over 750 experts in that field, and take measures to protect the physical security of their data centres and the cybersecurity of their servers and software.
  • Uptime: Google commits to providing 99.95% uptime for GCE. If that level of uptime isn’t met in any particular month, Google issues credit against future billing cycles.
  • Access to other services: since the GCP services all play nicely together, you can easily access storage, data management, and analytical tools to complement or extend a compute instance, or indeed to build a bigger workflow on GCP that incorporates data management, research and analytics.
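
To put the 99.95% uptime figure in perspective, here is a quick back-of-the-envelope calculation in Python. It assumes a 30-day month, which is a simplification for illustration and not something taken from Google’s terms.

```python
# Downtime that a monthly uptime commitment still permits.
UPTIME_COMMITMENT = 0.9995  # the 99.95% figure quoted above
DAYS_IN_MONTH = 30          # assumption: a 30-day billing month


def allowed_downtime_minutes(uptime: float, days: int = DAYS_IN_MONTH) -> float:
    """Return the minutes of downtime per month that still satisfy `uptime`."""
    total_minutes = days * 24 * 60
    return total_minutes * (1.0 - uptime)


print(f"{allowed_downtime_minutes(UPTIME_COMMITMENT):.1f} minutes per month")
```

Under that assumption, the commitment leaves room for a little over 20 minutes of downtime in a month, which is worth keeping in mind if your algorithm holds positions that need active management.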


Cons

  • If your trading algorithm is latency sensitive, GCE may not be the best solution. While you do have some choice over where your algorithm is physically hosted, this won’t be ideal if latency is a significant concern. For the vast majority of independent traders, this is unlikely to be a deal-breaker, but it is certainly worth mentioning.

I almost listed security as a disadvantage, since it is easy to assume that security not handled in-house is a potential issue. However, Google can presumably do security much better than any individual could (at least, that’s what you’d think after reading Google’s spiel on security), so it makes sense to count the outsourcing of security as an advantage. This might get a little more complicated for a trading firm that prefers to keep security in-house, but for most individuals it probably makes sense to outsource it to an expert.


GCE is surprisingly affordable. The cost of hosting and running my algorithm is approximately 7.6 cents per hour, which works out to around $55 per month (if I leave the instance running 24/7) including a sustained use discount, which is applied automatically. Factoring in the $300 of free credit I received for signing up for GCP, the first year’s operation will cost me about $360.
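
The arithmetic behind those figures can be sketched in a few lines of Python. This assumes that the quoted 7.6 cents per hour already reflects the sustained use discount and that a month averages 730 hours (8,760 hours in a year divided by 12). Both are assumptions made for the illustration.

```python
HOURS_PER_MONTH = 730  # assumption: average month (8,760 hours / 12)


def monthly_cost(hourly_rate_usd: float, hours: float = HOURS_PER_MONTH) -> float:
    """Cost of leaving the instance running 24/7 for an average month."""
    return hourly_rate_usd * hours


def first_year_cost(hourly_rate_usd: float, free_credit_usd: float) -> float:
    """Twelve months of 24/7 running, less the sign-up credit."""
    return 12 * monthly_cost(hourly_rate_usd) - free_credit_usd


rate = 0.076    # quoted hourly rate, assumed to include the discount
credit = 300.0  # sign-up credit mentioned above

print(f"monthly: ${monthly_cost(rate):.2f}")
print(f"first year after credit: ${first_year_cost(rate, credit):.2f}")
```

This comes to roughly $55 per month and about $366 for the first year. The small gap from the $360 quoted above arises because that figure rounds the monthly cost to $55 before multiplying.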

This price could come down significantly, depending on the infrastructure I use, as I’ll explain below.

I used an n1-standard-1 machine from GCE’s list of standard machine types. This machine type utilizes a single CPU and allocates 3.75 GB of memory, and I attached a 50GB persistent disk. This was enough to run my trading algorithm via the Zorro trading automation software (which requires Windows), executed through Interactive Brokers via the IB Gateway. The algorithm in question generates trading signals (for a portfolio of three futures markets) by processing hourly data with callouts to a feedforward neural network written as an R script, and it monitors tick-wise price data for micro-management of individual trades. The machine type I chose handled this job reasonably well, despite recommendations from Google’s automated monitoring that I assign some additional resources. These recommendations generally arise as a result of retraining my neural network, a task that proved to be more resource intensive than the actual trading. Thankfully, this only happens periodically and I have so far chosen to ignore Google’s recommendations without apparent negative consequence.
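
If you did decide to follow Google’s recommendation and assign more resources, the change can be scripted. The sketch below only assembles and prints the gcloud commands and does not execute them. The instance name, zone and target machine type are hypothetical placeholders. The instance has to be stopped before its machine type can be changed, which is why resizing interrupts a running algorithm.

```python
import shlex
from typing import List


def resize_commands(instance: str, zone: str, machine_type: str) -> List[List[str]]:
    """Build the stop / set-machine-type / start sequence for one instance."""
    base = ["gcloud", "compute", "instances"]
    zone_flag = f"--zone={zone}"
    return [
        base + ["stop", instance, zone_flag],
        base + ["set-machine-type", instance, zone_flag,
                f"--machine-type={machine_type}"],
        base + ["start", instance, zone_flag],
    ]


# Hypothetical names: substitute your own instance, zone and machine type.
for cmd in resize_commands("trading-vm", "us-east1-b", "n1-standard-2"):
    print(shlex.join(cmd))
```

Printing the commands first makes it easy to review them before pasting them into a terminal where gcloud is installed and authenticated.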

I used a Windows Server 2016 image (since my trading application runs on Windows only) and a 50GB persistent disk, which is the minimum required to run such an image. The Windows Server image accounts for the lion’s share of the cost – approximately $29 per month.

A scaled down version running Linux (Ubuntu 17.04) with a smaller persistent disk runs at less than half this cost: 3.4 cents per hour or $24.67 per month with a sustained use discount. Clearly there are big savings to be made if you can move away from Windows-based applications for your trading infrastructure.

Also worth mentioning is that you are only charged for what you use. If you need to stop your algorithm in the middle of the month, you’ll only be charged for the time that your instance was actually running. Most of the niche providers of private trading servers will charge you at best for the full month, regardless of when you stop running your algorithm.

How to Run a Trading Algorithm on GCE

As you can see from the previous descriptions, GCP consists of a LOT of different services. Finding one’s way around for the first time can be a bit tricky. This part of the article consists of a walk-through on setting up and running a trading algorithm on GCE, aimed at the new GCP user.

Step 1: Sign up for GCP

Go to the GCP website and log in to your Google account (or sign up for an account if you don’t have one). Note the $300 in free credits you receive for use within the first 12 months.

Step 2: Navigate to Google Compute Engine

Firstly, from the GCP homepage, navigate to your GCP Console via one of the options shown below:

Then, navigate to the Compute Engine dashboard like so:

Step 3: Create a new VM instance

Simply click on Create in the VM Instances screen on your GCE dashboard, like so:

Then fill out the specs for your new instance. The specs I used look like this (you can see the cost estimate on the right):

I used one of GCP’s US east-coast data centres since IB’s servers are located in the New York area. My algorithm isn’t latency sensitive, but every little bit helps.

After clicking Create, the instance will take a few moments to spin up, then it will appear in your VM dashboard like so:

Step 4: Set up access control

Next, you need to set up a password for accessing the instance. Click the arrow next to RDP and select Set Windows password like so:

Follow the prompts, and then copy the password and keep it safe. Now you’re ready to connect to your instance!
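
For reference, Steps 3 and 4 can also be expressed as gcloud commands. The following is a minimal sketch that assembles and prints them without running anything. The instance name, zone and Windows username are hypothetical. The image family and project shown are what I would expect for Windows Server 2016 and may have changed, so check them against the current list of public images before use.

```python
import shlex
from typing import List


def create_instance_cmd(name: str, zone: str) -> List[str]:
    """Command to create a VM matching the specs described in Step 3."""
    return [
        "gcloud", "compute", "instances", "create", name,
        f"--zone={zone}",
        "--machine-type=n1-standard-1",
        "--image-family=windows-2016",    # assumption: verify the family name
        "--image-project=windows-cloud",  # assumption: verify the project name
        "--boot-disk-size=50GB",
    ]


def reset_password_cmd(name: str, zone: str, user: str) -> List[str]:
    """Command to generate a Windows password for `user`, as in Step 4."""
    return [
        "gcloud", "compute", "reset-windows-password", name,
        f"--zone={zone}", f"--user={user}",
    ]


# Hypothetical values, for illustration only.
print(shlex.join(create_instance_cmd("trading-vm", "us-east1-b")))
print(shlex.join(reset_password_cmd("trading-vm", "us-east1-b", "trader")))
```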

Step 5: Connect and test

You can connect directly from the VM dashboard using Google’s remote desktop application for Chrome by clicking RDP (ensuring the correct VM is selected), or download an RDP file and open it with the Windows Remote Desktop client:

Once connected to the instance, it should look like a normal (although somewhat Spartan) Windows desktop. To test that it can connect to Interactive Brokers (IB), we are going to connect to IB’s test page. But first, we have to adjust some default internet settings. To do this, open Internet Explorer. Select the Settings cog in the top right of the browser then Internet Options, then Security, then Trusted Sites. Click the Sites button and add to the list of trusted sites. Then save the changes. Here’s a visual from my instance:

Now, connect to IB’s test page to check that your instance can communicate with IB’s servers. Simply open the page in Internet Explorer. If the instance is connecting properly, you should see a page that looks like this:

You can now upload your trading software and algorithm to your instance by simply copying and pasting from your home computer, or download any required software from the net. Note that to copy and paste from your home computer, you will need to access the instance using Windows RDP, not Chrome RDP (this may change with future updates to Chrome RDP).
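
Once your software is installed, the browser test above can be complemented with a scripted check that a given host and port accept TCP connections. This is a generic sketch that assumes Python is available on the instance. The host and port in the example are placeholders: the port shown is an assumption about where a locally running IB Gateway listens, so confirm the real value in your Gateway’s API settings.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Placeholder target: a Gateway assumed to be listening on this machine.
print(can_connect("127.0.0.1", 4002))
```

A result of False only tells you that nothing accepted the connection. It does not distinguish between a firewall rule, a stopped application and a wrong port number.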

Gotcha: changing permissions of root directory for Windows Server:

I found that I wasn’t able to install R packages from script due to restrictions on accessing certain parts of the Windows file structure. To resolve this, I followed these steps:

  • In Windows Explorer, navigate to the R installation directory and right-click it, then choose Properties.
  • Go to the Security tab.
  • Click Advanced, then Change Permissions.
  • Highlight your username, and click Edit.
  • Choose This folder, subfolders and files under Applies to:
  • Choose Full Control under Basic Permissions.
  • Click OK.
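
To confirm whether a permissions problem like this is what you are hitting (and whether the fix worked), you can test if a directory is actually writable by trying to create a temporary file in it. This is a small diagnostic sketch that assumes Python is installed on the instance. The path in the example is a hypothetical placeholder for your R installation’s library folder.

```python
import tempfile


def is_writable(directory: str) -> bool:
    """Return True if a file can actually be created inside `directory`."""
    try:
        # The temporary file is removed automatically when the block exits.
        with tempfile.TemporaryFile(dir=directory):
            return True
    except OSError:
        # Raised for missing directories as well as permission errors.
        return False


# Hypothetical path: replace with the library folder of your R installation.
print(is_writable(r"C:\Program Files\R\R-x.y.z\library"))
```

Attempting a real write is more reliable on Windows than inspecting permission flags, because access there is governed by the access control lists edited in the steps above.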

Step 6: Don’t forget to stop the instance!

If you need to stop trading your algorithm, it is usually a good idea to stop the instance so that you aren’t charged for compute resources that you aren’t using. Do so from the VM dashboard:

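So long as you don’t delete the instance, you can always restart it from the same state at which it was stopped, meaning you don’t have to re-upload your software and scripts. You are not billed for compute time while an instance is stopped, although resources that remain attached to it, such as the persistent disk, continue to incur storage charges.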

On the other hand, if you delete your instance and later want to restart, you will have to create a whole new instance and re-upload all your trading infrastructure. That’s where images come in handy: you can save an image of your setup, and then start an identical instance from the console. I’ll show you how to do that in another post.
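
In the meantime, here is a rough sketch of what saving and reusing an image looks like with gcloud. It assembles and prints the commands without running them. All names are hypothetical, and it assumes the boot disk carries the same name as the instance, which is the default when a disk is created along with a VM. Stopping the instance before imaging its disk is a sensible precaution.

```python
import shlex
from typing import List


def create_image_cmd(image: str, source_disk: str, zone: str) -> List[str]:
    """Command to save an image from an existing persistent disk."""
    return [
        "gcloud", "compute", "images", "create", image,
        f"--source-disk={source_disk}",
        f"--source-disk-zone={zone}",
    ]


def instance_from_image_cmd(name: str, image: str, zone: str) -> List[str]:
    """Command to launch a new VM from a previously saved custom image."""
    return [
        "gcloud", "compute", "instances", "create", name,
        f"--zone={zone}",
        "--machine-type=n1-standard-1",
        f"--image={image}",
    ]


# Hypothetical names, for illustration only.
print(shlex.join(create_image_cmd("trading-image", "trading-vm", "us-east1-b")))
print(shlex.join(instance_from_image_cmd("trading-vm-2", "trading-image", "us-east1-b")))
```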

Concluding Remarks

GCP on the Command Line

In this post I’ve demonstrated how to set up and run instances using the GCP Console. The same can be achieved using the gcloud command-line tool, which is worth learning if you start using GCP extensively, since familiarity with it brings a real boost in productivity.
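
As a taste of what that looks like, gcloud can report your instances as JSON (for example via gcloud compute instances list --format=json), which is easy to process in a script. The sketch below parses such output into a short summary. The sample text is hand-written for illustration and is not real output, and the project and instance names in it are hypothetical.

```python
import json
from typing import Dict, List


def summarise_instances(json_text: str) -> List[Dict[str, str]]:
    """Extract name, status and zone from gcloud's JSON instance listing."""
    return [
        {
            "name": inst["name"],
            "status": inst["status"],
            # The zone is reported as a full URL; keep only the last segment.
            "zone": inst["zone"].rsplit("/", 1)[-1],
        }
        for inst in json.loads(json_text)
    ]


# Illustrative sample only, trimmed to the fields used above.
SAMPLE = """[
  {"name": "trading-vm",
   "status": "RUNNING",
   "zone": "https://www.googleapis.com/compute/v1/projects/my-project/zones/us-east1-b"}
]"""

for row in summarise_instances(SAMPLE):
    print(row["name"], row["status"], row["zone"])
```

A check like this could run on a schedule to flag an instance that is still running when you expected it to be stopped.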

Going Further

There’s a lot that can be done on GCP, including big data analytics and machine learning. We can also apply some simpler workflows to make our lives easier, such as creating custom images as mentioned above, or integrating with cloud storage infrastructure for managing historical data and using Data Studio for monitoring performance via attractive dashboard-style interfaces. I’m in a good position to show you the ropes on how to use these tools in your trading workflow, so if there is something in particular that you would like me to showcase, let me know in the comments.

Happy Googling!