Posts about Data Manipulation in R with dplyr

Data Analysis and Edge Extraction for Traders

March 28, 2023 / March 23, 2023 by Robot James

Towards the end of last year, we ran a couple of free Zoom webinars. Here are the recordings: Basics of Edge Extraction, and Data Analysis for Traders. The Colab research notebook for the second session can be found here. (To make sense of it, you’ll want to watch the video.)

More Intuitive Joins in dplyr 1.1.0 – how to do an asof join on trades and quotes data

July 8, 2025 / March 16, 2023 by Kris Longmore

dplyr 1.1.0 was a significant release that makes several common data operations more syntactically intuitive. The biggest changes relate to joins and to grouping/aggregating operations. In this post we’ll look at the changes to joins. First, install and load the latest version of dplyr: install.packages("dplyr"); library(dplyr). A new approach to joins: The best way to …
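To give a flavour of the asof join named in the title, here is a minimal sketch using join_by() and closest() from dplyr 1.1.0. The trades and quotes data frames, their column names, and all values are made up for illustration and are not taken from the post.

```r
library(dplyr)  # needs dplyr >= 1.1.0 for join_by() and closest()

# Hypothetical trades; all values are made up for illustration
trades <- tibble(
  ticker = c("ABC", "ABC", "XYZ"),
  trade_time = as.POSIXct(
    c("2023-03-16 09:30:05", "2023-03-16 09:30:12", "2023-03-16 09:30:07"),
    tz = "UTC"
  ),
  price = c(10.01, 10.03, 55.20)
)

# Hypothetical quotes for the same tickers
quotes <- tibble(
  ticker = c("ABC", "ABC", "ABC", "XYZ", "XYZ"),
  quote_time = as.POSIXct(
    c("2023-03-16 09:30:00", "2023-03-16 09:30:10", "2023-03-16 09:30:20",
      "2023-03-16 09:30:01", "2023-03-16 09:30:06"),
    tz = "UTC"
  ),
  bid = c(10.00, 10.02, 10.04, 55.10, 55.15),
  ask = c(10.02, 10.04, 10.06, 55.30, 55.25)
)

# For each trade, keep only the most recent quote at or before the trade time
trades_with_quotes <- trades %>%
  left_join(quotes, by = join_by(ticker, closest(trade_time >= quote_time)))
```

The equality condition on ticker and the closest() condition on the timestamps sit side by side in one join_by() call, which is what makes this kind of join read more naturally than a join followed by a filter.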

Performant R Programming: Chunking a Problem into Smaller Pieces

July 8, 2025 / May 28, 2020 by Kris Longmore

When data is too big to fit into memory, one approach is to break it into smaller pieces, operate on each piece, and then join the results back together. Here’s how to do that to calculate rolling mean pairwise correlations of a large stock universe. Background: We’ve been using the problem of calculating mean rolling …
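The split, operate, and recombine pattern can be sketched roughly as follows. This is a generic illustration on a toy per-ticker summary, not the post's correlation calculation; the prices data, chunk_size, and summarise_chunk() are hypothetical names introduced for the example.

```r
library(dplyr)
library(purrr)

# Hypothetical returns for ten tickers; values are random, for illustration only
set.seed(42)
prices <- tibble(
  ticker = rep(paste0("T", 1:10), each = 5),
  day = rep(1:5, times = 10),
  ret = rnorm(50)
)

# Break the ticker universe into chunks of at most chunk_size tickers
chunk_size <- 4
tickers <- unique(prices$ticker)
chunks <- split(tickers, ceiling(seq_along(tickers) / chunk_size))

# Operate on one chunk at a time (here: a simple per-ticker mean)
summarise_chunk <- function(chunk_tickers) {
  prices %>%
    filter(ticker %in% chunk_tickers) %>%
    group_by(ticker) %>%
    summarise(mean_ret = mean(ret), .groups = "drop")
}

# Join the per-chunk results back together
results <- map(chunks, summarise_chunk) %>% bind_rows()
```

Chunking by ticker only works when each result depends on a single ticker. For pairwise calculations, where every stock is compared with every other, the chunks would need to be cut along a dimension that keeps each pair together, such as time.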

Handling a Large Universe of Stock Price Data in R: Profiling with profvis

July 8, 2025 / May 22, 2020 by Kris Longmore

Recently, we wrote about calculating mean rolling pairwise correlations between the constituent stocks of an ETF. The tidyverse tools dplyr and slider solve this somewhat painful data wrangling operation about as elegantly and intuitively as possible. Why did you want to do that? We’re building a statistical arbitrage strategy that relies on indexation-driven trading in …

How to Wrangle JSON Data in R with jsonlite, purrr and dplyr

July 8, 2025 / May 20, 2020 by Kris Longmore

Working with modern APIs, you will often have to wrangle data in JSON format. This article presents some tools and recipes for working with JSON data with R in the tidyverse. We’ll use purrr::map functions to extract and transform our JSON data. And we’ll provide intuitive examples of the cross-overs and differences between purrr …
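As a small taste of the approach, here is a minimal sketch that parses a JSON string with jsonlite and pulls fields out of the resulting list with purrr's typed map functions. The JSON payload and its field names are invented for the example.

```r
library(jsonlite)
library(purrr)
library(dplyr)

# Hypothetical API response; the structure and values are made up
json_txt <- '[
  {"symbol": "ABC", "price": 10.5, "tags": ["energy", "large-cap"]},
  {"symbol": "XYZ", "price": 55.2, "tags": ["tech"]}
]'

# Keep the nested list structure rather than letting jsonlite simplify it
records <- fromJSON(json_txt, simplifyVector = FALSE)

# Extract one field per record into typed vectors, then assemble a tibble
quotes <- tibble(
  symbol = map_chr(records, "symbol"),
  price = map_dbl(records, "price"),
  n_tags = map_int(records, ~ length(.x$tags))
)
```

Passing a string such as "symbol" to a map function extracts the element of that name from each record, while a formula such as ~ length(.x$tags) handles anything that needs a computation.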

How to Calculate Rolling Pairwise Correlations in the Tidyverse

July 8, 2025 / May 18, 2020 by Kris Longmore

How might we calculate rolling correlations between constituents of an ETF, given a dataframe of prices? For problems like this, the tidyverse really shines. There are a number of ways to solve this problem … read on for our solution, and let us know if you’d approach it differently! First, we load some packages and …
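One possible way to approach this, which may differ from the post's own solution, is to join the returns to themselves by date to form pairs and then roll a correlation over each pair with slider. The returns data, the 10-day window, and the column names below are assumptions made for the sketch.

```r
library(dplyr)   # relationship = "many-to-many" needs dplyr >= 1.1.0
library(slider)

# Hypothetical daily returns for three tickers; values are random
set.seed(1)
returns <- tibble(
  date = rep(seq(as.Date("2020-01-01"), by = "day", length.out = 30), times = 3),
  ticker = rep(c("AAA", "BBB", "CCC"), each = 30),
  ret = rnorm(90)
)

# Pair every stock with every other stock on the same date,
# keeping each unordered pair once
pairs <- returns %>%
  inner_join(
    returns,
    by = "date",
    suffix = c("_1", "_2"),
    relationship = "many-to-many"
  ) %>%
  filter(ticker_1 < ticker_2)

# Rolling 10-day correlation within each pair; incomplete windows give NA
rolling_cor <- pairs %>%
  group_by(ticker_1, ticker_2) %>%
  arrange(date, .by_group = TRUE) %>%
  mutate(
    cor_10d = slide2_dbl(ret_1, ret_2, ~ cor(.x, .y), .before = 9, .complete = TRUE)
  ) %>%
  ungroup()

# Average across pairs to get one mean pairwise correlation per date
mean_cor <- rolling_cor %>%
  group_by(date) %>%
  summarise(mean_cor = mean(cor_10d), .groups = "drop")
```

The self-join grows with the square of the number of tickers, so this sketch suits a small universe better than a large one.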

Financial Data Manipulation in dplyr for Quant Traders

July 8, 2025 / May 14, 2020 by Robot James

In this post, we’re going to show how a quant trader can manipulate stock price data using the dplyr R package. Getting set up and loading data: Load the dplyr package via the tidyverse package: if (!require('tidyverse')) install.packages('tidyverse'); library(tidyverse). First, load some price data. energystockprices.RDS contains a data frame of daily price observations for 3 …
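To illustrate the kind of manipulation involved, here is a minimal sketch that computes daily returns per ticker and then summarises them. It uses a small made-up prices data frame rather than the energystockprices.RDS file, whose contents are not shown here; the column names are assumptions.

```r
library(dplyr)

# Hypothetical daily closing prices for two tickers
prices <- tibble(
  ticker = rep(c("AAA", "BBB"), each = 4),
  date = rep(as.Date("2020-05-11") + 0:3, times = 2),
  close = c(100, 102, 101, 104, 50, 49, 51, 52)
)

# Simple daily returns within each ticker, in date order
returns <- prices %>%
  group_by(ticker) %>%
  arrange(date, .by_group = TRUE) %>%
  mutate(ret = close / lag(close) - 1) %>%
  ungroup()

# Mean return and observation count per ticker,
# dropping the first day, which has no prior close
summary_stats <- returns %>%
  filter(!is.na(ret)) %>%
  group_by(ticker) %>%
  summarise(mean_ret = mean(ret), n_obs = n(), .groups = "drop")
```

Grouping before calling lag() matters: without it, the first return of one ticker would be computed against the last close of the previous ticker.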

How To Get Historical S&P 500 Constituents Data For Free

July 8, 2025 / May 12, 2020 by Robot James

In this post, we are going to construct snapshots of historic S&P 500 index constituents, from freely available data on the internet. Why? Well, one of the biggest challenges in looking for opportunities amongst a broad universe of stocks is choosing what stock “universe” to look at. One approach to dealing with this is to …

The VIX Futures Basis

July 15, 2025 / April 30, 2020 by Robot James

In the eye of the recent storm, with VIX up over 50, many traders were looking to “short the VIX” using products like TVIX. “Surely it’s going to come back down?” Well yeah, it will, eventually, but that doesn’t mean that you can profitably short VIX products. First, some basics… What is VIX? VIX is …

Revenge of the Stock Pickers

July 10, 2025 / March 31, 2020 by Kris Longmore

To say we’re living through extraordinary times would be an understatement. We saw the best part of 40% wiped off stock indexes in a matter of weeks, unprecedented co-ordinated central bank intervention on a global scale, and an unfolding health crisis that for many has already turned into a tragedy. As an investor or trader, …
