All Models Are Wrong. Some Are Random.
Updated: Feb 9
Disclaimer: This post is for those who really like to geek out on the inner workings of Monte Carlo Simulations. If that is not you, hopefully you will find our other blog posts more to your liking!
Have you ever wondered why we choose to implement Monte Carlo Simulations (MCS) the way we do in ActionableAgile™️ Analytics (AA)?
Before we get too deep into answering that question, it is worthwhile to take a step back and talk about the one big assumption that all Monte Carlo Simulations in AA make: the future we are trying to predict roughly looks like the past we have data for. For example, in North America, is it reasonable to use December's data to forecast what can be done in January? Maybe not. In Europe, can we use August's data to predict what can be done in September? Again, probably not. The trick, then, is to find a time period in the past that we believe will accurately reflect what will happen in the future we want to forecast. If you don't account for this assumption, then any Monte Carlo Simulation you run will be invalid.
The big assumption: The future we are trying to predict roughly looks like the past we have data for.
Let's say we do account for this assumption, and we have a set of historical data that we are confident plugging into our simulation. The way AA works, then, is to say that ANY day in the past data can look like ANY day in the future that we are trying to forecast. So, we randomly sample data from a day in the past (we treat each day in the past as equally likely) and assign that data value to a day in the future. We do this sampling thousands of times to understand the risk associated with all the outcomes that show up in our MCS results.
Each day in the past is treated as equally likely (to happen in the future).
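For those who like to see the idea in code, here is a minimal Python sketch of that kind of sampling. It is not AA's actual implementation. It assumes the "data" for each past day is a count of items finished that day, and the names `simulate_totals` and `percentile`, the sample history, and the 10-day horizon are all made up for illustration.

```python
import random


def simulate_totals(daily_history, days_ahead, runs=10_000, seed=None):
    """Run `runs` simulated futures and return the total for each one.

    Every future day gets the value of one past day drawn uniformly at
    random (with replacement), so each past day is equally likely.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(runs):
        total = sum(rng.choice(daily_history) for _ in range(days_ahead))
        totals.append(total)
    return totals


def percentile(values, pct):
    """Nearest-rank style percentile, good enough for a sketch."""
    ordered = sorted(values)
    index = int(round(pct / 100 * (len(ordered) - 1)))
    return ordered[index]


# Hypothetical history: items finished on each of the last 14 days.
history = [0, 2, 1, 3, 0, 0, 4, 2, 1, 0, 3, 2, 0, 0]

totals = simulate_totals(history, days_ahead=10, runs=5000, seed=42)

# 85% of the simulated futures finished at least this many items.
print("85% confidence, at least:", percentile(totals, 15))
print("50% confidence, at least:", percentile(totals, 50))
```

Reading a low percentile of the simulated totals is one common way to turn the spread of outcomes into a statement about risk; the specific percentiles printed here are only examples.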
But let's think about this for a second. We are assigning a random day in the past to a random day in the future. Doesn't that violate our big assumption that we just talked about? In other words, if any day from the past can look like any day in the future, then we could presumably (and almost certainly do) use data from a past Monday and assign it to a future Saturday. Or we use data from a past Sunday and assign it to a future Wednesday. Surely, Mondays in the past don't look like Saturdays in the future, and Sundays in the past don't look like Wednesdays in the future, right?
Doesn't this mean that we should refine our sampling algorithm and make it a bit more sophisticated in order to eliminate these obvious mistakes? I.e., shouldn't we have an algorithm that only assigns past Mondays to future Mondays or past Sundays to future Sundays? Or even just assign past weekdays to future weekdays and past weekends to future weekends?
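To make those alternatives concrete, here is a rough Python sketch of what "more sophisticated" sampling rules could look like. It is an illustration only: it is not Prateek's experiment and not AA's code, and the function names, the toy history, and the dates are all hypothetical. Each rule decides which past days are allowed to stand in for a given future day.

```python
import random
from datetime import date, timedelta


def is_weekend(day):
    return day.weekday() >= 5  # Saturday is 5, Sunday is 6


# Three matching rules: which past days may stand in for a future day?
def any_day(past_day, future_day):
    return True  # any past day can be any future day


def same_day_type(past_day, future_day):
    return is_weekend(past_day) == is_weekend(future_day)


def same_weekday(past_day, future_day):
    return past_day.weekday() == future_day.weekday()


def simulate_matched(history, start, days_ahead, matches, runs=10_000, seed=None):
    """Simulate totals, sampling each future day only from matching past days.

    `history` maps a past date to that day's value (assumed here to be a
    count of finished items).
    """
    rng = random.Random(seed)
    future_days = [start + timedelta(days=i) for i in range(days_ahead)]
    # One pool of candidate values per future day.
    pools = [
        [value for past_day, value in history.items() if matches(past_day, day)]
        for day in future_days
    ]
    if any(not pool for pool in pools):
        raise ValueError("no matching past day for at least one future day")
    return [sum(rng.choice(pool) for pool in pools) for _ in range(runs)]


# Toy history: four weeks starting Monday 2024-01-01, nothing done on weekends.
weekday_values = [3, 1, 2, 4, 2]
toy_history = {}
for offset in range(28):
    day = date(2024, 1, 1) + timedelta(days=offset)
    toy_history[day] = 0 if is_weekend(day) else weekday_values[day.weekday()]

for rule in (any_day, same_day_type, same_weekday):
    results = simulate_matched(
        toy_history, date(2024, 1, 29), 14, rule, runs=2000, seed=7
    )
    print(rule.__name__, "min:", min(results), "max:", max(results))
```

Note that the stricter the rule, the fewer past days each future day can draw from, which is one trade-off to keep in mind when comparing approaches like these.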
Well, Prateek Singh did just that when he tried different sampling algorithms for different simulations, and the results may surprise you. I highly encourage you to read his blog post on the subject, as it gives the more scientific justification for why we use the sampling algorithm that we do in AA. I don't want to ruin the surprise for you but (spoiler alert) with AA, we chose the best one.
About Daniel Vacanti, Guest Writer
Daniel Vacanti is the author of the highly-praised books "When will it be done?" and "Actionable Agile Metrics for Predictability" and the original mind behind the ActionableAgile™️ Analytics Tool. Recently, he co-founded ProKanban.org, an inclusive community where everyone can learn about Professional Kanban, and he co-authored their Kanban Guide.
When he is not playing tennis in the Florida sunshine or whisky tasting in Scotland, Daniel can be found speaking on the international conference circuit, teaching classes, and creating amazing content for people like us.