
revolution 1.0.1 dev vs revolution 1.0

revolution 1.0.1 dev vs revolution 1.0 — SPRT Test Summary

Date: 31 August 2025 · Time control: 10+0.1 · Threads: 1 · Hash: 32 MB · Openings: UHO_2024_8mvs_+085_+094.pgn

We ran 1,792 games comparing revolution 1.0.1 dev against the baseline revolution 1.0. The development build scored 48.44%, which corresponds to an Elo difference of –10.9 ± 8.8 and a Likelihood of Superiority (LOS) of 0.79%. These figures indicate no Elo gain and, in fact, a modest loss under the tested conditions.

Figure 1. Elo distribution (dev − baseline). The mean (red) and ±1σ band show a clear negative shift below zero.
Figure 2. Likelihood of Superiority (LOS). At 0.79%, the dev build is very unlikely to be stronger.
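For readers who want to see how an LOS figure relates to the Elo estimate, here is a minimal sketch using a normal approximation. It assumes the reported ± 8.8 is a 95% confidence margin (so the standard error is the margin divided by 1.96); the post does not state this, and the testing tool may compute LOS differently, so the result should only land close to the reported 0.79%, not match it exactly. The function name `los_from_elo` is hypothetical.

```python
import math

def los_from_elo(elo: float, margin95: float) -> float:
    """Approximate LOS as P(true Elo difference > 0) under a normal model.

    Assumption: margin95 is the half-width of a 95% confidence interval,
    so the standard error is margin95 / 1.96.
    """
    sigma = margin95 / 1.96          # standard error of the Elo estimate
    z = elo / sigma                  # how many standard errors from zero
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # standard normal CDF at z

# Values taken from the summary: Elo -10.9 with a margin of 8.8
print(f"LOS = {los_from_elo(-10.9, 8.8):.2%}")
```

With the rounded inputs from this post the sketch gives a value a little under 1%, in line with the conclusion that the dev build is very unlikely to be stronger.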

Key numbers

  • Games: 1,792
  • Score: 48.44% (868.0 / 1,792)
  • Elo: –10.9 ± 8.8
  • nElo: –19.8 ± 16.1
  • LOS: 0.79%
  • Draw ratio: 49.9%
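The Elo figure can be cross-checked from the score alone. The sketch below uses the standard logistic Elo formula, Elo = -400 · log10(1/s - 1), where s is the score fraction; the helper name `elo_from_score` is illustrative and not part of any testing tool.

```python
import math

def elo_from_score(score: float) -> float:
    """Logistic Elo difference implied by a score fraction strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)

# Points and games from the key numbers above
points, games = 868.0, 1792
score = points / games

print(f"score = {score:.2%}, Elo = {elo_from_score(score):+.1f}")
# prints: score = 48.44%, Elo = -10.9
```

The error margin and nElo cannot be reproduced this way, because they depend on the per-game (or per-pair) result distribution, which the summary does not list.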

Conclusion

Under these conditions, revolution 1.0.1 dev is approximately 11 Elo weaker than the baseline revolution 1.0. An LOS of 0.79%, far below 50%, supports rejecting the change if playing strength is the sole criterion.

Jorge Ruiz
