Riddler Express – 11/16/2018

The latest Riddler is out. I worked on the express puzzle this week.

This week's Riddler Express was inspired by the World Chess Championship, which is currently in progress. In the setup, one player is slightly stronger than the other: the stronger player wins 20% of the games, 65% of the games are drawn, and the weaker player wins the remaining 15%. The two players face off in a championship of 12 games, where a win is worth one point and a draw is worth half a point to each player.

The Solution

The first question asks for the probability that the stronger player wins the 12-game championship. The answer is 0.5195. I calculated this by approximating the number of points won with a normal distribution with mean 6.30 and standard deviation 1.02. Winning the championship requires at least 6.5 points, so with a continuity correction the probability of victory is the probability of a score higher than 6.25 under that normal distribution. I calculated the probability using Excel.

The mean and standard deviation for the above normal distribution are calculated using the mean and standard deviation of the random variable for the points from one game. The calculations for both of those are below.

$\mu_1 = 0.2 \cdot 1 + 0.65 \cdot 0.5 + 0.15 \cdot 0 = 0.525$

$\sigma_1 = \sqrt{0.2 \cdot 1^2 + 0.65 \cdot 0.5^2 + 0.15 \cdot 0^2 - 0.525^2} \approx 0.295$

The mean and standard deviation for the points from twelve games come from the following calculations.

$\mu_{12} = 12 \cdot \mu_1 = 6.3$ and $\sigma_{12} = \sqrt{12 \cdot \sigma_1^2} \approx 1.02$
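The arithmetic above can be cross-checked in a few lines of Python. This is only a sketch of the same normal approximation, not the Excel workbook used for the post; the names `OUTCOMES`, `game_stats`, and `win_probability` are made up for illustration, and the quarter-point continuity correction is taken from the 6.25 threshold described earlier.

```python
from math import sqrt
from statistics import NormalDist

# Per-game outcomes for the stronger player as (points, probability).
OUTCOMES = [(1.0, 0.20), (0.5, 0.65), (0.0, 0.15)]


def game_stats(outcomes):
    """Return the mean and standard deviation of points from one game."""
    mean = sum(x * p for x, p in outcomes)
    variance = sum(x * x * p for x, p in outcomes) - mean ** 2
    return mean, sqrt(variance)


def win_probability(n_games, outcomes=OUTCOMES):
    """Normal approximation to the chance of scoring more than n_games / 2."""
    mu1, sigma1 = game_stats(outcomes)
    mu_n = n_games * mu1
    sigma_n = sqrt(n_games) * sigma1
    # Scores move in half-point steps, so the continuity-corrected
    # threshold sits a quarter point above an even split.
    threshold = n_games / 2 + 0.25
    return 1.0 - NormalDist(mu_n, sigma_n).cdf(threshold)


mu1, sigma1 = game_stats(OUTCOMES)
print(f"mu_1 = {mu1:.3f}, sigma_1 = {sigma1:.3f}")
print(f"P(stronger player wins 12 games) = {win_probability(12):.4f}")
```

The printed values should agree with the figures quoted in this section to the precision shown there.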

(I think that a better way to do the calculation may be to use a t-distribution with 11 degrees of freedom. When I do that, I get a probability of victory of 0.5191.)

The second part of the problem asked how many games are required for the stronger player to win 75%, 90%, or 99% of the championships. The answers are 83, 248, and 773 games, respectively. These are the values 82.02, 247.89, and 772.12 from the formula derived below, rounded up to whole games.

To solve this problem, I started by calculating the mean and standard deviation of the point distribution for n games to be as follows.

$\mu_n = n \cdot \mu_1$ and $\sigma_n = \sqrt{n \cdot \sigma_1^2}$

For a desired probability of victory $p$, let $z_p$ be the z-score satisfying $P(Z < z_p) = p$ for a standard normal variable $Z$. The number of points required to win is $v_n = \frac{n+1}{2}$. Because scores move in half-point steps, the continuity correction places the threshold for the normal distribution a quarter point below that, at $x_n = \frac{2n+1}{4}$.

To ensure that the stronger player wins the championship with probability at least $p$, the threshold $x_n$ must sit at least $z_p$ standard deviations below the mean $\mu_n$. That is, the following inequality must be true.

$\frac{\mu_n - x_n}{\sigma_n} > z_p$

After substituting the definitions of $x_n$, $\mu_n$, and $\sigma_n$, we get the following.

$\frac{0.525n - \frac{2n+1}{4}}{\sqrt{n \sigma_1^2}} > z_p$

With lots of tedious algebra, we arrive at the solution below.
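For readers who want the intermediate steps, here is one way the algebra can go. It is a sketch, not necessarily the route taken for the post. It reads the requirement as the mean exceeding the threshold by at least $z_p$ standard deviations, $\frac{\mu_n - x_n}{\sigma_n} > z_p$, and introduces $u = \sqrt{n}$ as a helper variable.

Since $\mu_n - x_n = 0.525n - \frac{2n+1}{4} = 0.025n - 0.25$, the requirement becomes

$0.025u^2 - z_p \sigma_1 u - 0.25 > 0$

This is a quadratic in $u$, and $u$ must exceed its positive root.

$u > \frac{z_p \sigma_1 + \sqrt{z_p^2 \sigma_1^2 + 0.025}}{0.05} = 20 \left( z_p \sigma_1 + \sqrt{z_p^2 \sigma_1^2 + 0.025} \right)$

Squaring both sides and using $n = u^2$ gives

$n > 10 + 800 z_p^2 \sigma_1^2 + 800 z_p \sigma_1 \sqrt{z_p^2 \sigma_1^2 + 0.025}$

Finally, rewriting $800\sqrt{a}$ as $100\sqrt{64a}$ turns the last term into the form shown below.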

$n > 10 + 800 z_p^2 \sigma_1^2 + 100 z_p \sigma_1 \sqrt{1.6 + 64 z_p^2 \sigma_1^2}$

or

$n > 10 + 69.5z_p^2 + 29.5 z_p \sqrt{ 1.6 + 5.56z_p^2}$

Substituting $z_{0.75} = 0.67449$, $z_{0.9} = 1.28155$, and $z_{0.99} = 2.32635$ gives the numbers of games listed at the beginning of this part.
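The substitution can also be checked numerically. The sketch below evaluates the closed-form bound above in Python, with `games_needed` as an illustrative name and the z-scores taken from the standard library's `NormalDist.inv_cdf` rather than typed in by hand.

```python
from math import ceil, sqrt
from statistics import NormalDist

# Variance of the points from a single game (sigma_1 squared).
SIGMA1_SQ = 0.2 * 1**2 + 0.65 * 0.5**2 + 0.15 * 0**2 - 0.525**2


def games_needed(p):
    """Evaluate 10 + 800 z^2 s^2 + 100 z s sqrt(1.6 + 64 z^2 s^2) at z = z_p."""
    z = NormalDist().inv_cdf(p)
    s = sqrt(SIGMA1_SQ)
    return 10 + 800 * z**2 * SIGMA1_SQ + 100 * z * s * sqrt(1.6 + 64 * z**2 * SIGMA1_SQ)


for p in (0.75, 0.90, 0.99):
    n = games_needed(p)
    # Round up, since a championship needs a whole number of games.
    print(f"p = {p:.2f}: n > {n:.2f}, so {ceil(n)} games")
```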

Sanity Check

While I am confident in my answers based on experiments in Excel, I wanted a sanity check so that, if I was wrong, I would at least know that my answers were close. For this check, I thought it would be fun to experiment with the statistics program R.

The check involved simulating 25,000 championships for each of several championship lengths. For each length, I compared the simulated mean, standard deviation, and probability of winning to my predicted values. The predictions were spot on.
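For anyone who wants to repeat a check along these lines without R, a rough Python equivalent is sketched here. It is not the R code used for the post: the function name `simulate_championships`, the fixed seed, and the choice of championship lengths in the loop are all assumptions made for illustration.

```python
import random
from statistics import mean, pstdev


def simulate_championships(n_games, n_trials=25_000, seed=0):
    """Simulate championships and summarize the stronger player's points.

    Returns (mean points, standard deviation of points, fraction won).
    """
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    points = (1.0, 0.5, 0.0)  # win, draw, loss for the stronger player
    weights = (0.20, 0.65, 0.15)
    totals = [
        sum(rng.choices(points, weights=weights, k=n_games))
        for _ in range(n_trials)
    ]
    # The stronger player wins outright with more than half the points.
    wins = sum(1 for total in totals if total > n_games / 2)
    return mean(totals), pstdev(totals), wins / n_trials


for n_games in (12, 83):
    avg, sd, win_rate = simulate_championships(n_games)
    print(f"{n_games} games: mean {avg:.2f}, sd {sd:.2f}, win rate {win_rate:.3f}")
```

With 25,000 trials the simulated win rate carries a sampling error of roughly 0.003, so agreement with the normal approximation to two decimal places is about as much as this kind of check can show.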

The final plot from my experiments is below.

The blue circles are the probability of victory from the simulation, and the red curve is the probability of winning based on my calculations above.

The source for creating the graph is below. There is some vestigial code from some other tests I performed in checking the correctness of my statistical calculations from above.

A Quick Note on the R Learning Curve

I am very impressed by how easy R is to learn. I went from downloading R and RStudio to producing this graph in about 3 hours. Most of the credit goes to this video.