I was bored, so I tried to see if I could come up with a better system for predicting medals. Places like GraceNote essentially take the top three contenders in each event and give them gold, silver and bronze. That causes situations where a nation with a 50% chance to win in each of, say, 12 events is predicted to win all 12, when on average it would only win about six. This is usually masked in the overall medal count, but it becomes apparent when you look at the details for individual sports.
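To put numbers on the 12-event example, here is a minimal sketch of the arithmetic. It assumes the events are independent of each other, which is a simplification, and the function names are made up for illustration.

```python
from math import comb

def favourite_tally(probs, threshold=0.5):
    """Pick-the-favourite count: every event where the nation is at least
    as likely as not to win is booked as a full medal."""
    return sum(1 for p in probs if p >= threshold)

def expected_medals(probs):
    """Expected number of medals: the sum of the per-event probabilities."""
    return sum(probs)

def prob_exactly(k, n, p):
    """Binomial chance of winning exactly k of n independent events,
    each with win probability p (independence is an assumption)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

probs = [0.5] * 12                   # 50% chance in each of 12 events
print(favourite_tally(probs))        # 12
print(expected_medals(probs))        # 6.0
print(prob_exactly(12, 12, 0.5))     # 1/4096, roughly 0.00024
```

So under these assumptions, sweeping all 12 events would happen about once in 4,000 tries, even though the favourite-based tally predicts exactly that.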
My model is still early in development: it doesn't yet take into account the maximum number of athletes per nation in a specific event, or the chance that an athlete fails to post a result (DNS/DSQ/DNF etc.). It also currently only works for events decided by a score, time or measurement (placing and head-to-head events will have to wait).
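For anyone curious how a score/time based approach can work in general, here is a rough Monte Carlo sketch. It is not my actual model: it assumes each athlete's time is normally distributed around a mean with some spread, and the athletes, nation codes and numbers below are all hypothetical. Like the model described above, it ignores per-nation entry limits and DNS/DSQ/DNF.

```python
import random
from collections import defaultdict

def simulate_event(athletes, n_sims=10_000, seed=0):
    """Estimate each nation's chance of gold, silver and bronze.

    athletes: list of (name, nation, mean_time, sd); lower time is better.
    Returns {nation: [p_gold, p_silver, p_bronze]}.
    Nations that never reach the podium are left out of the result.
    """
    rng = random.Random(seed)
    counts = defaultdict(lambda: [0, 0, 0])
    for _ in range(n_sims):
        # Draw one time per athlete (normal distribution is an assumption).
        results = sorted(
            (rng.gauss(mean, sd), nation) for _, nation, mean, sd in athletes
        )
        # Credit the three fastest times to their nations.
        for rank, (_, nation) in enumerate(results[:3]):
            counts[nation][rank] += 1
    return {nation: [c / n_sims for c in tally] for nation, tally in counts.items()}

# Hypothetical field, for illustration only.
field = [
    ("Athlete 1", "AAA", 9.80, 0.05),
    ("Athlete 2", "AAA", 9.95, 0.05),
    ("Athlete 3", "BBB", 10.00, 0.05),
    ("Athlete 4", "CCC", 10.02, 0.05),
    ("Athlete 5", "CCC", 10.05, 0.05),
]
for nation, (gold, silver, bronze) in sorted(simulate_event(field).items()):
    print(f"{nation}: gold {gold:.1%}, silver {silver:.1%}, bronze {bronze:.1%}")
```

Since every simulated race hands out exactly one medal of each colour, the gold percentages across nations add up to 100%, and likewise for silver and bronze.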
Here's a sample of the men's 100m (top 5) that I was using as my test data.
Men's 100m
Gold
- 76.4%
- 17.8%
- 1.9%
- 1.5%
- 0.9%
Silver
- 60.6%
- 17.8%
- 5.8%
- 4.4%
- 4.2%
Bronze
- 46.0%
- 14.7%
- 9.9%
- 7.7%
- 6.3%
As mentioned above, the United States is currently a bit inflated because more than three of its athletes contribute to its chances (manually removing the extras and redistributing their share would give it 73.3%, 53.5% and 38.6% for gold, silver and bronze respectively).
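One way that manual correction could be automated for the gold column is sketched below. This is only an assumption about how the redistribution might be done (drop everyone beyond a nation's top three, then rescale the remaining win chances proportionally); the function name and the example field are hypothetical. Silver and bronze would need the event re-simulated rather than a simple rescale.

```python
from collections import defaultdict

def cap_and_renormalise(win_probs, max_per_nation=3):
    """Keep each nation's strongest entries and rescale the rest.

    win_probs: list of (athlete, nation, p_win).
    Returns the kept entries with p_win scaled so the total probability
    matches the original total (proportional redistribution is an assumption).
    """
    by_nation = defaultdict(list)
    for entry in win_probs:
        by_nation[entry[1]].append(entry)

    kept = []
    for entries in by_nation.values():
        entries.sort(key=lambda e: e[2], reverse=True)
        kept.extend(entries[:max_per_nation])

    total_before = sum(p for _, _, p in win_probs)
    total_after = sum(p for _, _, p in kept)
    scale = total_before / total_after
    return [(athlete, nation, p * scale) for athlete, nation, p in kept]

# Hypothetical field: nation AAA has four contenders, one more than allowed.
field = [
    ("A1", "AAA", 0.4), ("A2", "AAA", 0.2), ("A3", "AAA", 0.1), ("A4", "AAA", 0.1),
    ("B1", "BBB", 0.2),
]
capped = cap_and_renormalise(field)
aaa_before = sum(p for _, n, p in field if n == "AAA")
aaa_after = sum(p for _, n, p in capped if n == "AAA")
print(f"AAA gold chance: {aaa_before:.1%} before, {aaa_after:.1%} after the cap")
```

In this made-up field the over-represented nation loses a little and everyone else gains a little, which is the same direction as the United States correction above.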