All data originated from Pro Football Reference and nflFastR.
So, the model did not do so well in week 1. To match its performance against historical data and have any chance of outperforming Vegas, I need to predict between 10 and 11 winners a week. Last week I got only 8 of 16 right, which means I might as well have flipped a coin. The good news is that, on average, my point spreads were closer to the actual outcomes than the Vegas spreads were, so I’m not too worried just yet.
I actually made some last-minute updates to the model before the season started, which caused it to change two game winners: Atlanta over Seattle and Cincinnati over the Chargers. My gut told me those were wrong calls, but I’m testing the model, not my instincts. Still, had I not changed anything, I would have gotten my 10 wins. It happens.
The changes I made to the model were to revise my rolling average methodology. If you recall, I am using a historical rolling average to determine the predicted efficiency numbers for each team’s upcoming game. That math was a simple average, equally weighting the weeks included. My original intent was to give higher weights to more recent weeks, but I couldn’t figure out a good way to code for that.
Eventually, I managed to cobble some code together and I tested 4 additional weighting algorithms: linear, exponential, logarithmic and sigmoidal.
This chart is an example using a 20-week look-back, and the black line (evenly weighted) is what my first model used. I thought that any of the other weight curves should perform better, as recent weeks’ performance is more likely to describe a team’s ability than older games. And I was right.
All of those weighting schemes improved model performance, and in the end I used the linear weight curve (red line) with a 19-week trailing period. Here is a performance comparison of my original model against the revised one.
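For anyone curious what the weighting looks like in code, here is a minimal sketch, not my actual model code. The function names (`lookback_weights`, `weighted_trailing_average`) and the exact shape of the exponential, logarithmic and sigmoidal curves are assumptions made for illustration; only the idea of weighting recent weeks more heavily comes from the description above.

```python
import math


def lookback_weights(n_weeks, scheme="linear"):
    """Return n_weeks weights, oldest week first, normalized to sum to 1.

    The curve parameters below are illustrative assumptions, not the
    values used in the actual model.
    """
    if n_weeks < 1:
        raise ValueError("n_weeks must be at least 1")
    steps = range(1, n_weeks + 1)  # 1 = oldest week, n_weeks = most recent
    if scheme == "even":
        raw = [1.0 for _ in steps]
    elif scheme == "linear":
        raw = [float(t) for t in steps]
    elif scheme == "exponential":
        raw = [math.exp(t / n_weeks) for t in steps]
    elif scheme == "logarithmic":
        raw = [math.log(1.0 + t) for t in steps]
    elif scheme == "sigmoidal":
        midpoint = (n_weeks + 1) / 2.0
        scale = n_weeks / 10.0
        raw = [1.0 / (1.0 + math.exp(-(t - midpoint) / scale)) for t in steps]
    else:
        raise ValueError("unknown scheme: " + scheme)
    total = sum(raw)
    return [r / total for r in raw]


def weighted_trailing_average(values, n_weeks=19, scheme="linear"):
    """Weighted average of the last n_weeks entries of values (oldest first)."""
    recent = list(values)[-n_weeks:]
    if not recent:
        raise ValueError("values must not be empty")
    weights = lookback_weights(len(recent), scheme)
    return sum(v * w for v, w in zip(recent, weights))
```

With the `"even"` scheme this reduces to the simple average the first model used. With the others, the same trailing window leans toward the most recent games.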
The linear weighting not only has a higher peak (just 0.3% behind Vegas), but it also is a much smoother curve. Sudden spikes in the curves are likely false patterns in the data and not repeatable outside the model. A smoother curve should be more reliable.
Another change I made was to use QB trailing efficiency instead of team passing efficiency for teams that saw a QB change in 2020. This includes 4 teams that signed new QBs (Colts, Bucs, Pats and Panthers) as well as 2 teams that had QBs returning from injuries (Steelers, Lions). For example, for the Colts, I used Philip Rivers’ last 19 games as opposed to the Colts’ last 19 games (Brissett/Hoyer/Luck).
Joe Burrow obviously has no history and Teddy Bridgewater’s last 19 games were 5 years ago, so the Bengals and the Chargers just used historical team passing data.
With week 1 in the books, I updated the data, calculated new rolling averages and cranked out my week 2 predictions. This probably won’t get published before the Thursday night game, so you’ll just have to trust me on that one.
As of now, Vegas and I agree on 15 out of 16 predicted winners, with the sole difference being my model predicting New England to beat Seattle. I don’t know why my model hates the Seahawks so much, but I’ll have to re-check the math because, again, I think this is the wrong call.
The spread differentials between my model and Vegas are quite small, averaging about 1.6 points in absolute terms. That’s really crazy when you think about it.
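To be clear about what "absolute" means here, this is a minimal sketch of the calculation. The function name and the spreads in the usage note are made up for illustration and are not my week 2 numbers.

```python
def mean_abs_spread_diff(model_spreads, vegas_spreads):
    """Average absolute gap between two lists of point spreads.

    Both lists are assumed to be in the same game order and to use the
    same sign convention (for example, negative = home team favored).
    """
    if len(model_spreads) != len(vegas_spreads):
        raise ValueError("spread lists must be the same length")
    if not model_spreads:
        raise ValueError("spread lists must not be empty")
    gaps = [abs(m - v) for m, v in zip(model_spreads, vegas_spreads)]
    return sum(gaps) / len(gaps)
```

For example, `mean_abs_spread_diff([-3.0, 2.5, 7.0], [-4.5, 3.0, 6.0])` averages gaps of 1.5, 0.5 and 1.0 points. The same function could compare either set of spreads against actual game margins.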
Tracking just the Colts’ season, I am now 0 – 1 (as are most people, I presume). The following table shows 3 different season predictions using the varying inputs of passing efficiency that I described in my week 1 article. This is basically a low, medium, and high forecast alongside my actual predictions.
The low scenario assumes the Colts’ 2020 passing will have the same efficiency as the team’s 19 weeks prior to the season. The medium scenario assumes 2020 will look like Rivers’ 19 weeks prior to 2020, and the high scenario assumes Rivers puts up his 2017 – 2019 averages.
The medium and high views predict 14 wins, which is obviously far too high and would likely be different if I were predicting those week by week. When predicting wins from a static point at the beginning of the season, it is best to use the expected win probability totals, which are a much more realistic 9.6 and 10.6 games.
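The gap between those two numbers comes from how they are counted: a predicted win counts every game where the team is favored as a full win, while expected wins add up the win probabilities themselves. Here is a minimal sketch of the difference. The function name and the probabilities in the usage note are hypothetical, not the model’s actual outputs.

```python
def season_win_totals(win_probs):
    """Return (predicted_wins, expected_wins) for a list of per-game win probabilities.

    predicted_wins counts games where the team is favored (probability > 0.5).
    expected_wins is the sum of the probabilities.
    """
    predicted_wins = sum(1 for p in win_probs if p > 0.5)
    expected_wins = sum(win_probs)
    return predicted_wins, expected_wins
```

A team that is a slight favorite in 14 of 16 games, say `season_win_totals([0.55] * 14 + [0.45] * 2)`, gets 14 predicted wins but only about 8.6 expected wins, because a lot of narrow edges do not add up to a lot of sure things.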
The last pair of columns are the numbers I predicted in week 1 and also my predictions for the rest of the season with updated information from week 1. Again, 13 wins is not a realistic number, but the 9.4 expected wins seems about right to me, given what I saw Sunday. We’ll see how this tracks.