
The Chow & Robbins Problem: Stop at h=5 t=3

Julian D. A. Wiseman

Contents: Introduction, Whether to stop at h=5 t=3, Failure to find a better estimate of the value of the game, Afterword.


Publication history: only here. Usual disclaimer and copyright terms apply.


1. The Chow & Robbins problem asks for the fair value of a game in which one tosses a coin, at least once, until choosing to stop, at which point one scores the proportion of tosses that are heads. An essay of late 2005, The expected value of sn/n ≈ 0.79295350640…, improved the best known estimate of the value of the game to ≈11½ decimal places. But it did not address any outstanding questions about possible stopping states.

Luis A. Medina & Doron Zeilberger, in ‘An Experimental Mathematics Perspective on the Old, and still Open, Question of When To Stop?’, arXiv:0907.0032v1, ask:

If currently you have five heads and three tails, should you stop?

If you stop, you can definitely collect 5/8 = 0.625, whereas if you keep going, your expected gain is > 0.6235, but no one currently knows how to prove that it would not eventually exceed 5/8 (even though this seems very unlikely, judging by numerical heuristics).

The author provides strong evidence that at h=5 t=4 the value of the game is ≈ 0.580572488…, and hence the value of continuing from h=5 t=3 is the average of this and 6÷9, that average being ≈ 0.623619577, which is significantly less than 0.625. Further, it seems likely that the error in the estimate of EV(h=5, t=4) is less than 10^−10, so h=5 t=3 really is a stopping state. These further heuristics are consistent with the existing “numerical heuristics” cited by Medina & Zeilberger.

2. The previous code was constrained by memory. The code has been rewritten to remove this constraint. Extending the calculation by a factor of 4 has not improved the accuracy to which the value of the game is known. Current constraints definitely include CPU time, and perhaps also the accuracy of the hardware’s floating-point arithmetic.

1. Whether to stop at h=5 t=3

With five heads and three tails, should one stop (scoring ⅝ = 0.625), or should one continue? Rephrased, is 0.625 ≥ ½(EV(6,3) + EV(5,4))? For n as large as has been tested, h=5 t=3 is a stopping state, so it is safe to assume that, with an extra head, so at h=6 t=3, one should definitely stop, there scoring 6/(6+3) = ⅔ ≈ 0.666666. So the question is equivalent to asking whether EV(5,4) ≤ 7/12 ≈ 0.583333.

The same technique previously used for the value of the whole game can be used to value the game at h=5 t=4. Results are output in the table that follows (new code available from; calculations run on a 1.83 GHz Intel Core 2 Duo processor under Mac OS X 10.6.4 using Xcode 3.2.3).

[Table of results: i; n = 2^i; EVn(h=5, t=4); Diff EV; Ratio of Diff EV; Estimated limit = EV + Diff ÷ (1⁄Ratio − 1); Δ Estimated limit; Δ Estimated limit × 2^i.]

This table resembles those in the 2005 essay, columns having meanings as follows.

  1. The first column contains i, a row counter,
  2. and the second contains n = 2^i, being max_t;
  3. The third contains the computed value of EVn(5,4), to fifteen decimal places. These increase with i, as expected.
  4. The fourth column shows the increase between adjacent rows, these increases being positive, but decreasing.
  5. The fifth column has the ratio between successive increases. This appears to be tending to a constant limit, unsurprisingly very close to the limit of the ratios for EVn(0,0) shown in the second table.
  6. If this ratio does indeed reach a constant limit, then the limiting value can be estimated as the current bound, plus the increase divided by one less than the reciprocal of the ratio, which is shown in the sixth column. This appears to be tending to 0.580572488…, the next digit probably being 4 or 5.
  7. The seventh column contains the differences between the estimates of the limit, which are of generally decreasing magnitude but non-constant sign.
  8. The eighth shows the same difference, × n. (Reminder: n = 2^i.) The absolute sizes of the changes in the estimates of the limit are a small and decreasing proportion of 1/n.

More on the eighth column. The absolute differences between consecutive estimates of the limit are a (mostly) decreasing proportion of 2^−i. Let’s take them at their largest: assume that they are all positive, and that they are all equal to 2^−i. This sets to unity a coefficient which is already < 10^−3, and generally of decreasing magnitude, with non-constant sign. Then these assumed future changes, over all not-yet-computed rows, total only 2^(−i_max) = 1/n_max. Despite these overestimates of the possible error, this total is much smaller than the gap between 7/12 and the best estimate of the limit.

Number of ‘standard deviations’ to reach 7/12

As an alternative, let us assume that the changes in the estimate of the limit are happening pseudo-randomly, changes being centred on 0, and with standard deviation something like the absolute value of the change in the estimate of the limit. (It doesn’t matter if this doesn’t hold for all rows: holding for some rows suffices.) Also assume that this will diminish by only a factor of 2 with each doubling of n, even though the eighth column suggests that the diminishment is slightly faster. Hence, summed over all future rows, the standard deviation of the remaining distance of travel is of a similar order of magnitude to this most recent change. The chart on the right shows the distance to 7/12 as a multiple of this assumed σ. And behold! The distance is a huge and increasing number of standard deviations. (A one-billion standard deviation move happens, in a normal world, one time in ≈ 10^(2×10^17).) So, relative to the distance that needs to be travelled, the changes in the estimated limit are just too small and diminishing too fast: it just isn’t happening.

Note that, for i ≥ 12, (7/12 − Estimated limit) ≈ 0.002761, so this chart quickly becomes proportional to the reciprocal of the change in the estimated limit. In some sense, the whole point is the immobility of (7/12 − Estimated limit) relative to the magnitude of changes in EVn(5,4).

2. Failure to find a better estimate of the value of the game

When executing the 2005 version of the code, memory was a slightly tighter constraint than CPU time. This problem was fixable, as then acknowledged:

It is possible to rewrite the program such that the memory usage is linear in the widest gap between the red and grey lines (rather than being linear in n), albeit at the price of a loss of readability. This recoding has not yet been attempted.

This problem has been resolved in the new version of the code, reducing memory usage by a large factor. The following table shows results for n = 2^i.

[Table of results: i; n = 2^i; EVn(0,0); Diff EV; Ratio of Diff EV; Estimated limit = EV + Diff ÷ (1⁄Ratio − 1); max j; approx t at which j is maximal.]

The first six columns have meaning similar to those in the h=5 t=3 table.

  1. For each t, calculations are performed only over ‘useful’ h. The seventh column shows, for some n, an approximation to the largest number of useful h, that is, the largest value of j.
  2. The eighth column shows, again approximately, the t with this maximal j.

These columns might be of interest to those attempting further optimisations of the code.

Alas, the results have not improved the estimate of the expected value of the whole game. The author also suspects that the accuracy of the C double might be insufficient, or nearly so.


3. Afterword

With five heads and three tails, stop, thereby scoring 0.625. Choosing to continue lowers the expected payoff to 0.62361957757…, with the next digit probably being close to ‘3’.

Julian D. A. Wiseman
September 2010


In January 2012 this page was cited (albeit with a punctuation problem in the URL): Olle Häggström and Johan Wästlund, ‘Rigorous computer analysis of the Chow-Robbins game’, arXiv:1201.0626v1.
