In our first post in this series, we set the stage for estimating scenarios with Monte Carlo simulation and defined the problem we are attacking. Today, we discuss how we go about it.
Forecasting how many people will be deployed, using historical Open FEMA deployment data, depends on three key random variables:
First, we need to understand how many people are deployed for each event. Events requiring deployment have an initial date of declaration. We bundle these declarations on a weekly basis, then look at how many people are deployed for events happening each week. In other words, we aren't counting events, but rather people deployed for each week's events. The histogram of these counts can be seen in Figure 1.
Given the distribution of people deployed for each week, we then need to know how long people wait until they are deployed. Sometimes it takes time to get people out into the field; at other times, estimates of needs change. The histogram of time spent awaiting deployment can be seen in Figure 2.
Finally, once an individual is deployed, that person may be deployed for a varying amount of time. The histogram of time deployed can be seen in Figure 3.
Taken together, these three variables tell us how many people are deployed concurrently.
Figure 1. Histogram of Weekly Deployed individuals. The horizontal axis shows the number of people deployed within a week. The vertical axis, on a log scale, shows the number of weeks of historical data that had a similar deployment size.
Figure 2. Histogram of days between an event and a person's deployment, plotted on a log scale. Note that there are a few dozen individuals deployed before an event.
Figure 3. Histogram of days a person is deployed, plotted on a log scale.
Now, when we actually perform a Monte Carlo simulation, we draw a value from the probability distribution of each of these variables. Our three draws are: the number of deployments needed weekly (a count) and, for each person deployed, the number of days until deployment following an event and the number of days deployed. So far, we have assumed that the number of days before deployment and the number of days deployed are independent, meaning there is no relationship between the two. Lucky for us, this turns out to be true, and it will be discussed in more depth in the third post of this series.
Additionally, we break out the number of deployments needed for events starting in a week by each quarter. This way seasonal weekly needs, such as deployments for hurricanes, can be more accurately captured.
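As a rough illustration of the quarterly breakout, the sketch below buckets weekly deployment counts by the calendar quarter of the week in which the events started. The function names and the sample numbers are made up for illustration and are not taken from the Open FEMA data.

```python
from collections import defaultdict
from datetime import date


def quarter_of(d: date) -> int:
    """Return the calendar quarter (1-4) that a date falls in."""
    return (d.month - 1) // 3 + 1


def counts_by_quarter(weekly_counts):
    """Group (week_start, people_deployed) pairs by calendar quarter."""
    buckets = defaultdict(list)
    for week_start, people in weekly_counts:
        buckets[quarter_of(week_start)].append(people)
    return dict(buckets)


# Hypothetical history: (start of week, people deployed for that week's events).
history = [
    (date(2019, 1, 7), 12),
    (date(2019, 4, 1), 30),
    (date(2019, 8, 26), 410),
    (date(2019, 9, 2), 950),
    (date(2019, 11, 4), 25),
]

by_quarter = counts_by_quarter(history)
```

Drawing the weekly count for a simulated week from `by_quarter[q]` for that week's quarter `q`, rather than from the pooled history, is one way to keep hurricane-season weeks from being averaged away.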
Each Monte Carlo simulation iteration proceeds as follows:
1. Randomly choose a number of deployments within a given week. We choose from the 5% to 95% range to give us a good baseline view without including very high or low outliers.
2. For each deployment, randomly choose weeks before deployment and deployed weeks. Similar to (1), we choose from a 5% to 95% range.
3. Count the number of people deployed in a week.
After iterating a specified number of times (we use 30,000), we end the iterations and aggregate across the simulated deployment cohorts to establish expected outcomes.
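The procedure above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the code behind the figures: it assumes "choosing from the 5% to 95% range" means resampling observed values after dropping the lowest and highest 5%, it measures waits and durations in whole weeks, and it ignores the quarterly breakout. All names and sample values are hypothetical.

```python
import random
from collections import Counter


def trimmed(values, lo=0.05, hi=0.95):
    """Drop the lowest and highest tails of the observed values.

    Uses a simple index-based cut (an assumption, one of several
    reasonable quantile conventions). Falls back to all values if the
    cut would leave nothing.
    """
    ordered = sorted(values)
    n = len(ordered)
    kept = ordered[int(n * lo):int(n * hi)]
    return kept or ordered


def simulate_once(weekly_counts, wait_weeks, duration_weeks, n_weeks, rng):
    """Run one iteration; return Counter of week index -> people deployed."""
    counts = trimmed(weekly_counts)
    waits = trimmed(wait_weeks)
    durations = trimmed(duration_weeks)
    concurrent = Counter()
    for event_week in range(n_weeks):
        # Step 1: draw the number of deployments for this week's events.
        for _ in range(rng.choice(counts)):
            # Step 2: draw wait and duration independently for each person.
            start = event_week + rng.choice(waits)
            length = rng.choice(durations)
            # Step 3: count the person in every week they are in the field.
            for week in range(start, start + length):
                concurrent[week] += 1
    return concurrent


def expected_concurrent(n_iter, seed=0, **inputs):
    """Average concurrent deployments per week over n_iter iterations."""
    rng = random.Random(seed)
    totals = Counter()
    for _ in range(n_iter):
        totals.update(simulate_once(rng=rng, **inputs))
    return {week: total / n_iter for week, total in sorted(totals.items())}


# Hypothetical inputs; 200 iterations keeps the example fast (the post uses 30,000).
result = expected_concurrent(
    n_iter=200,
    weekly_counts=[0, 2, 3, 5, 8, 40],
    wait_weeks=[0, 1, 1, 2, 3],
    duration_weeks=[1, 2, 2, 4, 6],
    n_weeks=52,
)
```

Averaging is only one possible aggregate. Keeping each iteration's `Counter` instead of summing them would also allow percentiles of concurrent deployments to be reported per week.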
We can cut the simulation time period down to a specific time period, such as a simulated year, or we can let a time period play out (decay) to the end after we stop taking draws. The first approach can answer questions related to expected new costs or resource needs in a fixed cycle, such as for budgeting purposes. The second approach helps answer questions around total expected resource needs.
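The difference between the two approaches can be illustrated with a small sketch. It assumes a simulation has already produced a mapping from week index to people concurrently deployed; the function names and numbers below are hypothetical.

```python
def fixed_window(concurrent, n_weeks):
    """Keep only weeks inside the simulated period (e.g. a budget year)."""
    return {w: c for w, c in concurrent.items() if 0 <= w < n_weeks}


def person_weeks(concurrent):
    """Total weeks of deployment summed over all people."""
    return sum(concurrent.values())


# Hypothetical tail of a one-year simulation: deployments that began late
# in the year are still running in weeks 52-54, after draws have stopped.
concurrent = {50: 4, 51: 6, 52: 5, 53: 2, 54: 1}

in_year = fixed_window(concurrent, n_weeks=52)  # first approach: cut at one year
budget_need = person_weeks(in_year)             # 10 person-weeks inside the year
total_need = person_weeks(concurrent)           # 18 person-weeks once fully decayed
```

The gap between the two totals is the deployment time that spills past the cutoff, which the fixed-cycle view leaves out by design.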
An example of how this can be created is shown in Figure 4. This single Monte Carlo draw combines four iterations (208 weeks, or four years' worth), then allows the draws to fully exhaust after the four years. Note that in this example, there is an extreme week near the beginning of year 3 (150 deployed weeks) that causes a large increase in deployments.
Figure 4. Single example Monte Carlo draw. Concurrent deployments per week are shown on the Y axis. Coloring is defined by the total number of deployments per quarter of events, with deep green being the highest values.
Where to next?
Even though we have a "result" in Figure 4, we should be careful to not draw conclusions based on a single iteration. Instead, we will want to repeat the process many, many times.
Stay tuned for the next post, which shows the conclusion of the Monte Carlo simulation, where this can be extended, and limitations to consider!
By Tom Roderick, PhD from Flamelit