Posted on May 12, 2020 by Robot James

In this post, we are going to construct snapshots of historic S&P 500 index constituents from freely available data on the internet. Why? Well, one of the biggest challenges in looking for opportunities amongst a broad universe of stocks is choosing which stock "universe" to look at. One approach is to pick the stocks that are currently in the S&P 500 index. Unfortunately, the stocks that are currently in the S&P 500 index weren't all there last year. A third of them weren't there ten years ago... If we create a historical data set by picking current S&P 500 index constituents, then we will be including historical data for smaller stocks that weren't in the index at that time. These are all going to be stocks that did very well, historically, or else they wouldn't have gotten into the index! So this universe selection technique biases our stock returns higher. The average past returns of current SPX constituents are higher than the average past returns of historic SPX constituents, due to this upward bias. It's easy...
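
To make the bias in the excerpt above concrete, here is a minimal toy sketch in Python. The tickers, returns, and membership flags are all invented for illustration (they are not real index data), and the helper name average_past_return is hypothetical. It simply compares the average past return of "current" members against that of the members that were actually in the index at the start of the period.

```python
from statistics import mean

# Hypothetical toy universe (NOT real data): each stock has a return over some
# past period, plus flags for index membership at the start of that period
# ("member_then") and today ("member_now").
stocks = {
    "AAA": {"past_return": 0.10, "member_then": True, "member_now": True},
    "BBB": {"past_return": -0.40, "member_then": True, "member_now": False},  # fell out of the index
    "CCC": {"past_return": 0.05, "member_then": True, "member_now": True},
    "DDD": {"past_return": 1.50, "member_then": False, "member_now": True},  # rallied into the index
}


def average_past_return(universe, membership_flag):
    """Average past return over the stocks where the given membership flag is True."""
    return mean(
        stock["past_return"]
        for stock in universe.values()
        if stock[membership_flag]
    )


# Universe built from today's constituents: includes the winner, drops the loser.
biased = average_past_return(stocks, "member_now")

# Universe built from constituents as they stood at the start of the period.
point_in_time = average_past_return(stocks, "member_then")

print(f"current constituents:  {biased:.4f}")
print(f"historic constituents: {point_in_time:.4f}")
```

In this toy case the current-constituent universe looks far better than the point-in-time one, purely because of which stocks were allowed into the sample.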

Posted on Jun 09, 2019 by Kris Longmore

I'm a bit late to the party with this one, but I was recently introduced to the feather format for working with tabular data. And let me tell you, as far as reading and writing data goes, it's fast. Really fast. Not only has it provided a decent productivity boost, but the motivation for its development really resonates with me, so I figured I'd briefly share my experiences for any other latecomers to the feather party. What is feather? It's a binary file format for storing data frames - the near-universal data container of choice for data science. Why should you care? Have I already mentioned that reading and writing feather files is fast? Check this out. Here I've created a pandas data frame with one million rows and ten columns. Here's how long it took to write that data frame to disk using both feather and gzip: Yes, you read that correctly: 94 milliseconds for feather versus 33 seconds for gzip! Here's the read time for each format: Platform agnostic The other thing I like about...

Posted on Jun 11, 2018 by Kris Longmore

Cryptocompare is a platform providing data and insights on pretty much everything in the crypto-sphere, from market data for cryptocurrencies to comparisons of the various crypto-exchanges, to recommendations for where to spend your crypto assets. The user-experience is quite pleasant, as you can see from the screenshot of their real-time coin comparison table: As nice as the user-interface is, what I really like about Cryptocompare is its API, which provides programmatic access to a wealth of crypto-related data. It is possible to drill down and extract information from individual exchanges, and even to take aggregated price feeds from all the exchanges that Cryptocompare is plugged into - and there are quite a few! Interacting with the Cryptocompare API When it comes to interacting with Cryptocompare's API, there are already some nice Python libraries that take care of most of the heavy lifting for us. For this post, I decided to use a library called cryptocompare. Check it out on GitHub here. You can install the current stable release by doing pip install cryptocompare, but I installed the latest development...

Posted on Jul 29, 2017 by Kris Longmore

In keeping with our recent theme of providing useful tidbits of algo trading practicalities, here's an elegant solution that resolves Yahoo's unceremonious exit from the free financial data space. Regular readers would know that I use various tools in my algo trading stack, but the one I keep coming back to, particularly when I'm ready to start running serious simulations, is Zorro. Not only is it a fast, accurate, and powerful backtester and execution engine, but the development team is also clearly committed to solving issues and adding features that really matter from a practical perspective. This is another example of the speedy and elegant resolution of a serious issue - namely, the loss of free access to good quality, properly adjusted equities data, thanks to Yahoo's exit. Zorro version 1.60 is currently undergoing its final stages of beta testing and will likely be released publicly in the coming days. The latest version includes integration with Alpha Vantage's API, providing access to free, high quality, properly adjusted stock and ETF price data. All you need to do to use it is sign...

Posted on May 21, 2017 by Kris Longmore

Recently, Yahoo Finance - a popular source of free end-of-day price data - made some changes to their server which wreaked a little havoc on anyone relying on it for their algos or simulations. Specifically, Yahoo Finance switched from HTTP to HTTPS and changed the data download URLs. No doubt this is a huge source of frustration, as many backtesting and trading scripts that relied on such data will no longer work. Users of the excellent R package quantmod, however, are in luck! The package's author, Joshua Ulrich, has already addressed the change in a development version of quantmod. You can update your quantmod package to the development version that addresses this issue using this command in R: devtools::install_github("joshuaulrich/quantmod", ref="157_yahoo_502") Of course, you need the devtools package installed, so do install.packages("devtools") first if you don't already have it. Once the package updates, quantmod::getSymbols(src = "yahoo") should work just as it did prior to the updates on the Yahoo Finance server. I can verify that this worked for me. Of course, if you don't want to update quantmod to a version that lives on...