Test SPRT revolution 2.0 versus baseline

Test Summary

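  • Time control: 10+0.1 (base time plus a small per-move increment; in engine-testing tools this notation normally means 10 s + 0.1 s, a short time control)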
  • Engines tested: Revolution (baseline) vs Revolution1 (new version)
  • Games played: 2178
  • Book: UHO_2024_8mvs_+085_+094.pgn (Unbalanced Human Openings, 8 moves deep; the file name indicates an opening evaluation of about +0.85 to +0.94)
  • Hash: 32 MB (kept deliberately small for throughput)
  • Threads: 1

✅ Interpreting Results

  • Score %: 47.54% for Revolution (baseline) → means Revolution1 scored higher (52.46%).
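  • Elo difference:
    • -17.08 ± 7.97 (from Revolution's perspective) → Revolution1 is about 17 Elo stronger, and the interval excludes zero.
    • nElo -31.33 ± 14.59 → normalized Elo, which rescales the score difference by the observed variance, puts the gap at about +31 for Revolution1. It is on a different scale, so it is not directly comparable with the ordinary Elo figure.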
  • LOS (Likelihood of Superiority): 0.00% → from Revolution's perspective, the estimated probability that it is the stronger engine rounds to zero; Revolution1 is the stronger one.
  • Draw ratio: 49.59% → the share of game pairs split 1-1 (540 of the 1089 pairs in the pentanomial vector); fairly typical for this TC and book.
  • Win/Loss ratio: 483W / 590L → Revolution1 wins ~22% more decisive games than it loses.
  • Pentanomial vector: [26, 294, 540, 219, 10] → number of game pairs (each opening played with both colours) in which Revolution scored 0, 0.5, 1, 1.5 and 2 points; 1089 pairs in total.
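  • LLR (Log-Likelihood Ratio): -1.16 → a negative LLR means the data so far lean against the hypothesis that Revolution is stronger, so Revolution1 is favoured. The SPRT bounds are not listed here; with the common α = β = 0.05 they would be about ±2.94, so -1.16 on its own has not crossed a stopping bound.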
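
The figures in this list can be cross-checked from the pentanomial vector alone. The following is a minimal sketch, not the testing tool's own code: it assumes the vector counts game pairs scoring 0, 0.5, 1, 1.5 and 2 points for Revolution, uses a normal approximation, and takes z = 1.96 for the ± values. The function name `pentanomial_stats` is made up for illustration.

```python
import math


def pentanomial_stats(pairs, z=1.96):
    """Summarise game-pair counts [0, 0.5, 1, 1.5, 2 points] for one engine.

    Assumption: z = 1.96, i.e. the +/- values are treated as 95% margins.
    """
    n_pairs = sum(pairs)
    # Per-game score attached to each pair outcome: 0, 0.25, 0.5, 0.75, 1.
    values = [i / 4 for i in range(5)]
    mean = sum(v * c for v, c in zip(values, pairs)) / n_pairs
    var = sum(c * (v - mean) ** 2 for v, c in zip(values, pairs)) / n_pairs
    se = math.sqrt(var / n_pairs)  # standard error of the mean score

    def elo(score):
        # Logistic Elo difference implied by an expected score.
        return -400 * math.log10(1 / score - 1)

    # Half-width of the Elo interval after mapping score +/- z*se to Elo.
    margin = (elo(mean + z * se) - elo(mean - z * se)) / 2

    # Normalized Elo: score offset in units of the per-game standard deviation.
    # A pair averages two games, so per-game variance is taken as 2 * var.
    scale = 800 / math.log(10)
    sd_game = math.sqrt(2 * var)
    nelo = (mean - 0.5) / sd_game * scale
    nelo_margin = z * se / sd_game * scale

    # LOS: normal approximation to P(true score > 50%).
    los = 0.5 * (1 + math.erf((mean - 0.5) / (se * math.sqrt(2))))

    return {
        "score_pct": 100 * mean,
        "elo": elo(mean),
        "elo_margin": margin,
        "nelo": nelo,
        "nelo_margin": nelo_margin,
        "draw_ratio_pct": 100 * pairs[2] / n_pairs,
        "los_pct": 100 * los,
    }


stats = pentanomial_stats([26, 294, 540, 219, 10])
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

With these counts the sketch gives a score of 47.54%, an Elo difference of -17.08 ± 7.97, nElo of -31.33 ± 14.59 and a pair draw ratio of 49.59%. That the ± values match with z = 1.96 suggests they are 95% margins rather than single standard deviations.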

🧾 Bottom Line

  • Revolution1 gained about +17 Elo over Revolution in this test (roughly +9 to +25 given the ±7.97 margin), or about +31 in normalized Elo.
  • With 2178 games, the confidence interval is already quite tight (±8 Elo).
  • The LOS of 0% means we can confidently say Revolution1 is not weaker—it is stronger.

To estimate how many more games a tighter interval would need, I'll use the standard rule that the Elo uncertainty ($\sigma$) falls as $1/\sqrt{N}$, provided the TC, book and draw rate stay the same.

Current data:

  • $N_0 = 2178$ games
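  • Current error: $\sigma_0 = 7.97$ Elo, taking the reported ± margin as $1\sigma$. The pentanomial counts listed earlier reproduce ±7.97 as a 95% margin (1.96 standard errors); on that reading, Case A below already gives the ±5 Elo figure at 95%.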

Projection formula: $N_{\text{target}} = N_0 \left(\frac{\sigma_0}{\sigma_{\text{target}}}\right)^2$
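
The projection follows directly from the $1/\sqrt{N}$ rule. As a short derivation, assuming the error can be written with a constant $c$ that depends only on the test conditions (the symbol $c$ is introduced here for illustration):

```latex
\sigma(N) = \frac{c}{\sqrt{N}}
\quad\Rightarrow\quad
\sigma_0 \sqrt{N_0} = \sigma_{\text{target}} \sqrt{N_{\text{target}}} = c
\quad\Rightarrow\quad
N_{\text{target}} = N_0 \left( \frac{\sigma_0}{\sigma_{\text{target}}} \right)^2
```

For example, halving the error ($\sigma_{\text{target}} = \sigma_0 / 2$) requires four times as many games.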

Case A) You want ±5 Elo as 1σ

$N_A = 2178 \left(\frac{7.97}{5}\right)^2 \approx \mathbf{5534}$ games

Additional games needed ≈ 5534 − 2178 = 3356.

Case B) You want ±5 Elo at 95%

(95% ≈ $\pm 1.96\sigma$ ⇒ $\sigma_{\text{target}} = 5/1.96 \approx 2.55$)

$N_B = 2178 \left(\frac{7.97}{2.55}\right)^2 \approx \mathbf{21259}$ games

Additional games needed ≈ 21259 − 2178 = 19081.
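
The two cases can be reproduced with a few lines of Python. This is a rough sketch under the same $1/\sqrt{N}$ assumption; the helper name `games_needed` is hypothetical, and Case B uses the unrounded value of 5/1.96.

```python
def games_needed(n0, margin0, target_margin):
    """Scale the game count so an error shrinking as 1/sqrt(N) reaches the target."""
    return n0 * (margin0 / target_margin) ** 2


n0 = 2178      # games played so far
sigma0 = 7.97  # current error in Elo, as used in the projection

case_a = round(games_needed(n0, sigma0, 5.0))         # target of +/-5 Elo
case_b = round(games_needed(n0, sigma0, 5.0 / 1.96))  # target of +/-5 Elo at 1.96 sigma

print(f"Case A: {case_a} games, {case_a - n0} additional")
print(f"Case B: {case_b} games, {case_b - n0} additional")
```

Because the required game count grows with the square of the tightening factor, asking for 1.96 times less error multiplies the total by about 3.84.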


Quick Notes

  • These figures assume the same conditions (10+0.1, 1 thread, 32 MB, same book and draw ratio ~50%).
  • If the draw rate changes, the effective variance changes with it (more draws generally means lower score variance), but the $1/\sqrt{N}$ rule of thumb is still a good guide for planning when to stop.

Download Revolution 2.0
