Test Summary
- Time control: 10+0.1 (10 s base plus 0.1 s increment per move, i.e. a short time control)
- Engines tested: Revolution (baseline) vs Revolution1 (new version)
- Games played: 2178
- Book: UHO_2024_8mvs_+085_+094.pgn (UHO = Unbalanced Human Openings, 8 moves deep; the file name suggests an evaluation edge of about +0.85 to +0.94 for one side)
- Hash: 32 MB (kept deliberately small for throughput)
- Threads: 1
✅ Interpreting Results
- Score %: 47.54% for Revolution (baseline) → means Revolution1 scored higher (52.46%).
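- Elo difference: -17.08 ± 7.97 → Revolution1 is about 17 Elo stronger, and the interval excludes zero, so the gap is statistically significant.
- nElo: -31.33 ± 14.59 → normalized Elo, which rescales the score difference by the spread of the results, puts the gap at about +31 for Revolution1.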
- LOS (Likelihood of Superiority): 0.00% → from the chosen perspective, Revolution did not outperform; Revolution1 was consistently better.
- Draw ratio: 49.59% — fairly typical for this TC and book.
- Win/Loss ratio: 483W / 590L → Revolution1 wins ~22% more decisive games than it loses.
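- Pentanomial vector: [26, 294, 540, 219, 10] → counts of game pairs (each opening played once with each colour) in which Revolution scored 0, 0.5, 1, 1.5 and 2 points; the 1089 pairs add up to the 2178 games.
- LLR (Log-Likelihood Ratio): -1.16 → the SPRT evidence leans against the hypothesis that Revolution is stronger, i.e. in favour of Revolution1. Whether the test has actually terminated depends on the configured SPRT bounds, which are not shown here.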
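The figures above can be cross-checked from the pentanomial vector alone. The following is a minimal sketch, not the tester's actual code: the function name `pentanomial_stats` is hypothetical, and it assumes a logistic Elo model, a normal approximation, a population variance over game pairs, and z = 1.96 for the error margin.

```python
import math

def pentanomial_stats(pairs, z=1.96):
    """Elo, error margin, nElo and LOS from pentanomial pair counts.

    pairs[i] is the number of game pairs in which the first engine
    scored i/2 points out of 2 (i = 0..4). Assumed conventions only.
    """
    n = sum(pairs)
    scores = [0.0, 0.25, 0.5, 0.75, 1.0]  # pair score rescaled to [0, 1]
    mean = sum(c * s for c, s in zip(pairs, scores)) / n
    var = sum(c * (s - mean) ** 2 for c, s in zip(pairs, scores)) / n
    std = math.sqrt(var)
    se = std / math.sqrt(n)  # standard error of the mean pair score

    def to_elo(score):
        # Logistic Elo model: score = 1 / (1 + 10^(-elo/400))
        return -400.0 * math.log10(1.0 / score - 1.0)

    elo = to_elo(mean)
    # Half-width of the interval obtained by shifting the score by z * se
    margin = (to_elo(mean + z * se) - to_elo(mean - z * se)) / 2.0
    # Normalized Elo: score difference divided by the per-game spread
    nelo = (mean - 0.5) / (math.sqrt(2.0) * std) * 800.0 / math.log(10.0)
    # LOS: probability that the true score exceeds 50% (normal approximation)
    los = 0.5 * (1.0 + math.erf((mean - 0.5) / (se * math.sqrt(2.0))))
    return elo, margin, nelo, los

elo, margin, nelo, los = pentanomial_stats([26, 294, 540, 219, 10])
print(f"Elo {elo:.2f} +/- {margin:.2f}, nElo {nelo:.2f}, LOS {los:.2%}")
```

Under these assumptions the sketch reproduces the reported Elo, nElo and LOS to two decimals, with the ± margin matching when z = 1.96 is used.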
🧾 Bottom Line
- Revolution1 gained roughly +17 Elo over Revolution in this test set (about +31 in normalized Elo).
- With 2178 games, the confidence interval is already quite tight (±8 Elo).
- The LOS of 0% means we can confidently say Revolution1 is not weaker—it is stronger.
I'll use the standard rule that the Elo uncertainty ($\sigma$) falls as $1/\sqrt{N}$, assuming the same TC, book and draw rate.
Current data:
- $N_0 = 2178$ games
- Current error: $\sigma_0 = 7.97$ Elo
Projection formula: $N_{\text{target}} = N_0 \left(\frac{\sigma_0}{\sigma_{\text{target}}}\right)^2$
Case A) You want ±5 Elo at the same confidence level as the reported ±7.97
$N_A = 2178 \left(\frac{7.97}{5}\right)^2 \approx \mathbf{5534}$ games
Additional games ≈ 5534 − 2178 = 3356.
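Case B) You want ±5 Elo at 95%, reading the reported ±7.97 as 1σ
(95% ≈ $\pm 1.96\sigma$ ⇒ $\sigma_{\text{target}} = 5/1.96 \approx 2.55$)
$N_B = 2178 \left(\frac{7.97}{5/1.96}\right)^2 \approx \mathbf{21259}$ games
Additional games ≈ 21259 − 2178 = 19081.
Caveat: the pentanomial counts above imply a standard error of about 4.07 Elo, which is 7.97 / 1.96, so the reported ±7.97 looks like a 95% margin already. In that reading Case A is the ±5 Elo at 95% answer, and Case B corresponds to the much stricter target of about ±2.55 Elo at 95%.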
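The two projections can be reproduced with a few lines of Python. This is a minimal sketch of the $1/\sqrt{N}$ scaling rule only; the helper name `games_needed` is hypothetical, and the second call follows Case B in treating the reported ±7.97 as 1σ.

```python
def games_needed(n0, margin0, target_margin):
    """Games required to shrink an error margin from margin0 to target_margin.

    Assumes the margin scales as 1/sqrt(N) under unchanged test conditions.
    """
    return round(n0 * (margin0 / target_margin) ** 2)

n0, margin0 = 2178, 7.97

case_a = games_needed(n0, margin0, 5.0)         # margin 7.97 -> 5
case_b = games_needed(n0, margin0, 5.0 / 1.96)  # margin 7.97 -> 5/1.96

print(case_a, case_a - n0)  # 5534 3356
print(case_b, case_b - n0)  # 21259 19081
```

Because the required sample grows with the square of the tightening factor, halving the margin costs about four times as many games.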
Quick Notes
- These figures assume the same conditions (10+0.1, 1 thread, 32 MB, same book and draw ratio ~50%).
- If the draw rate changes, the per-game variance changes with it (more draws means lower variance, so a slightly smaller error for the same number of games), but the $1/\sqrt{N}$ rule of thumb is still a good guide for planning when to stop.