Blitz Chess Engines Rating List — Methodology & Replication
This Blitz list is built from reproducible round-robin tournaments on identical hardware, with fixed time control, uniform openings, and strict per-engine limits. Ratings are computed with BayesElo and cross-checked with Ordo to ensure stability and consistent error bars.
1) Testbed (hardware, OS, baselines)
- Host: HP ProLiant DL360p Gen8
- OS: Windows 10 Pro Workstation (no foreground load during runs)
- Tablebases: Syzygy 5-piece (`-tb C:\Syzygy -tbpieces 5`)
- Instruction set baseline: SSE3/SSE4/POPCNT builds when available
- Threads & Hash: Threads=1, Hash=16 for every engine (strict fairness)
- Parallelism: `-concurrency 2` (two games in parallel, same host)
Fixing Threads, Hash, TB depth, and ISA baseline removes common noise sources and makes runs comparable across seasons.
2) Tournament format (CuteChess-CLI)
We run round-robin, color-balanced schedules with multiple passes:
- Mode: `-tournament round-robin`
- Time control (Blitz division): `tc=120+0.1` (120 seconds + 0.1s increment)
- Openings: UHO 8-move suites (see below)
- Color balance: `-games 2` with `-repeat 2` (each opening is played twice, once as White and once as Black)
- Depth: `-rounds 7` (multiple cycles for variance reduction)
- Recovery: `-recover` (safe resume after interruptions)
- Exports: `-pgnout ..\Games\nnue_league_2025.pgn` and `-ratinginterval 10`
Exact command (from our Blitz CMD):
@echo off
set opening=UHO_2024_8mvs_big_+100_+119
@cutechess-cli.exe ^
-event "Cup season Ijcrl 2025" -site "HP Proliant DL360P Gen8 Server" ^
-engine conf="stockfish 17.1" ^
-engine conf="berserk_20250606_64_ja_sse4" ^
-engine conf="caissa-1.23-x64-sse4-popcnt" ^
-engine conf="dragon-3.3-64bit" ^
-engine conf="Ethereal-14.25-ssse3" ^
-engine conf="Clover.9.0-old" ^
-engine conf="RubiChess-20240817_x86-64-sse3-popcnt" ^
-engine conf="stormphrax_700_64_ja_sse4" ^
-engine conf="Obsidian 16.14" ^
-tournament round-robin ^
-each tc=120+0.1 option.Hash=16 option.Threads=1 -tb C:\Syzygy -tbpieces 5 ^
-openings file=..\Openings\PGN\%opening%.pgn format=pgn order=random ^
-concurrency 2 -rounds 7 -games 2 ^
-repeat 2 -recover ^
-pgnout ..\Games\nnue_league_2025.pgn ^
-ratinginterval 10
pause
We include the most relevant engines in current rating ecosystems (Stockfish, Berserk, Dragon, Ethereal, etc.) to keep the Blitz table meaningful for practitioners.
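For replicators who prefer to script the run, the command above can be assembled programmatically. The following is a minimal sketch, not our production launcher: the function name `build_cutechess_args` is hypothetical, the `-event` and `-site` labels are omitted for brevity, and the engine names are assumed to exist as entries in your cutechess-cli `engines.json`.

```python
from typing import List


def build_cutechess_args(engines: List[str], opening_file: str, pgn_out: str) -> List[str]:
    """Assemble a cutechess-cli argument list mirroring the Blitz command."""
    args = ["cutechess-cli.exe"]
    for name in engines:
        # conf=NAME selects a preconfigured engine by name.
        args += ["-engine", f"conf={name}"]
    args += ["-tournament", "round-robin"]
    # Global constraints applied to every engine.
    args += ["-each", "tc=120+0.1", "option.Hash=16", "option.Threads=1"]
    args += ["-tb", r"C:\Syzygy", "-tbpieces", "5"]
    args += ["-openings", f"file={opening_file}", "format=pgn", "order=random"]
    args += ["-concurrency", "2", "-rounds", "7", "-games", "2"]
    args += ["-repeat", "2", "-recover"]
    args += ["-pgnout", pgn_out, "-ratinginterval", "10"]
    return args


if __name__ == "__main__":
    demo = build_cutechess_args(
        ["stockfish 17.1", "Obsidian 16.14"],
        r"..\Openings\PGN\UHO_2024_8mvs_big_+100_+119.pgn",
        r"..\Games\nnue_league_2025.pgn",
    )
    # Pass the list to subprocess.run(demo, check=True) to start a run.
    print(" ".join(demo))
```

Building the arguments as a list rather than one string avoids quoting problems with engine names that contain spaces.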
3) Openings policy (UHO)
We use UHO 8-move books to guarantee broad, repeatable coverage and minimize opening bias:
- Example: `UHO_2024_8mvs_big_+100_+119.pgn`
- Invocation: `-openings file=… format=pgn order=random` (randomizes order, not content)
4) Rating pipeline (BayesElo + Ordo)
We compute ratings using two independent tools:
4.1 BayesElo (primary Elo)
We load the PGN produced by CuteChess and run the standard maximum-likelihood loop. A minimal, reproducible BayesElo session looks like:
bayeselo
readpgn nnue_league_2025.pgn
elo
mm
ratings
x
- `readpgn`: imports all results
- `elo` → `mm`: runs the ML iteration to convergence
- `ratings`: prints the rating table with error bars
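The session above can also be driven non-interactively by piping the same commands to BayesElo's standard input. This is a sketch under assumptions: the executable is taken to be on `PATH` as `bayeselo`, the helper names are hypothetical, and a second `x` is sent on the assumption that it quits the top-level prompt after the first one leaves the `elo` prompt.

```python
import subprocess


def bayeselo_script(pgn_path: str) -> str:
    """Return the command sequence from the session, one command per line."""
    commands = [
        f"readpgn {pgn_path}",  # import all results
        "elo",                  # enter the rating prompt
        "mm",                   # maximum-likelihood iteration
        "ratings",              # print the rating table
        "x",                    # leave the elo prompt
        "x",                    # assumption: second x exits the program
    ]
    return "\n".join(commands) + "\n"


def run_bayeselo(pgn_path: str, exe: str = "bayeselo") -> str:
    """Run BayesElo with the scripted commands and return its text output."""
    result = subprocess.run(
        [exe],
        input=bayeselo_script(pgn_path),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print(bayeselo_script("nnue_league_2025.pgn"))
```

Keeping the command list in one function makes it easy to diff the exact session used for a given season.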
We publish Elo ± error, games, and sometimes LOS for convenience. We do not mix divisions: Blitz results remain in Blitz.
4.2 Ordo (cross-check)
We also process the same PGN in Ordo to validate relative placement and error magnitudes. Ordo’s logistic model and color-bias correction help confirm that standings are stable. (We keep Ordo’s config aligned with our BayesElo assumptions; details available on request for replicators.)
Using both BayesElo and Ordo reduces the chance that a modeling quirk in one tool skews the Blitz standings.
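One way to carry out such a cross-check is to compare the two rating tables after removing their average, since the two tools may anchor the absolute scale differently. The sketch below is illustrative only: `compare_lists` is a hypothetical helper, and parsing of the actual BayesElo and Ordo output files is left out.

```python
from statistics import mean
from typing import Dict, Tuple


def center(ratings: Dict[str, float]) -> Dict[str, float]:
    """Shift ratings so that their average is zero."""
    avg = mean(ratings.values())
    return {name: value - avg for name, value in ratings.items()}


def compare_lists(a: Dict[str, float], b: Dict[str, float]) -> Tuple[bool, float]:
    """Return (same_order, max_gap) over the engines present in both lists.

    same_order: True if both lists rank the common engines identically.
    max_gap: largest absolute difference between the centered ratings.
    """
    common = sorted(set(a) & set(b))
    ca = center({name: a[name] for name in common})
    cb = center({name: b[name] for name in common})
    order_a = sorted(common, key=lambda name: -ca[name])
    order_b = sorted(common, key=lambda name: -cb[name])
    max_gap = max(abs(ca[name] - cb[name]) for name in common)
    return order_a == order_b, max_gap


if __name__ == "__main__":
    # Engine names and numbers here are placeholders, not published results.
    tool_one = {"A": 3800.0, "B": 3700.0}
    tool_two = {"A": 100.0, "B": 0.0}
    print(compare_lists(tool_one, tool_two))
```

A small maximum gap together with an identical order suggests the standings do not depend on which tool produced them.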
5) Data outputs & downloads
- PGNs: all tournament games are exported to a season PGN (`nnue_league_2025.pgn`).
- Organization: games are stored inside our container, sorted by year/month for traceability.
- Leaderboard: the live Blitz table is maintained here: /blitz-chess-engines-rating-list/.
We periodically publish CSV/JSON snapshots so you can filter by engine/version and reproduce Elo externally.
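Filtering a snapshot by engine can be done with the standard library alone. The sketch below assumes a CSV with at least an `engine` column; the three-column layout of the inline sample is a simplification for illustration, and `filter_snapshot` is a hypothetical helper name.

```python
import csv
import io
import json
from typing import Dict, List

# Inline sample in a simplified layout (assumption); real snapshots may
# carry more columns.
SAMPLE_CSV = """engine,elo,games
stockfish 17.1,3833,560
Obsidian 16.14,3770,560
"""


def filter_snapshot(csv_text: str, engine_substring: str) -> List[Dict[str, str]]:
    """Return the rows whose engine name contains the given substring."""
    reader = csv.DictReader(io.StringIO(csv_text))
    needle = engine_substring.lower()
    return [row for row in reader if needle in row["engine"].lower()]


if __name__ == "__main__":
    rows = filter_snapshot(SAMPLE_CSV, "stockfish")
    # Re-emit the selection as JSON for downstream tools.
    print(json.dumps(rows, indent=2))
```

Values stay as strings here; convert `elo` and `games` to numbers before doing arithmetic on them.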
6) Quality controls
- Identical constraints per engine: Threads=1, Hash=16, same TB depth, same UHO file.
- Color balance and multi-pass scheduling cut variance at Blitz speeds.
- No special book/flags per engine beyond the global constraints above.
- Crash-safe: `-recover` avoids data loss mid-event.
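Color balance can be verified after the fact from the PGN headers. The following sketch counts how often each engine appears in the standard `White` and `Black` tags; the helper names are hypothetical, and an interrupted event may legitimately show a small imbalance until it is resumed.

```python
import re
from collections import Counter
from typing import Dict, List

# Matches [White "name"] and [Black "name"], but not [WhiteElo "..."].
TAG = re.compile(r'^\[(White|Black)\s+"(.*)"\]\s*$', re.MULTILINE)


def color_counts(pgn_text: str) -> Dict[str, Counter]:
    """Count games played as White and as Black for every engine."""
    counts: Dict[str, Counter] = {}
    for color, name in TAG.findall(pgn_text):
        counts.setdefault(name, Counter())[color] += 1
    return counts


def unbalanced(pgn_text: str) -> List[str]:
    """Return engines whose White and Black game counts differ."""
    return sorted(
        name
        for name, c in color_counts(pgn_text).items()
        if c["White"] != c["Black"]
    )


if __name__ == "__main__":
    with open("nnue_league_2025.pgn", encoding="utf-8", errors="replace") as fh:
        print(unbalanced(fh.read()))
```

An empty list means every engine played the same number of games with each color.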
Snapshot (illustrative example)
The live, authoritative standings are on the page’s main table. The following is a compact example (format and columns mirror our published table):
Columns: Elo is BayesElo (primary), error is the 1-σ margin; LOS shown when available; Ptnml = pentanomial distribution counts for WL/DD buckets (helpful for SPRT analyses in other contexts).
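For readers unfamiliar with LOS (likelihood of superiority), a widely used closed-form approximation depends only on wins and losses, because draws do not shift the balance between two engines. The sketch below shows that approximation; it is not necessarily the computation BayesElo performs internally, and the function name `los` is our own.

```python
import math


def los(wins: int, losses: int) -> float:
    """Approximate likelihood of superiority from decisive games only.

    Uses the normal approximation
    LOS = 0.5 * (1 + erf((wins - losses) / sqrt(2 * (wins + losses)))).
    """
    decisive = wins + losses
    if decisive == 0:
        # No decisive games: no evidence either way.
        return 0.5
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * decisive)))


if __name__ == "__main__":
    # Placeholder counts, not taken from the published table.
    print(round(los(20, 10), 3))
```

The value is symmetric: swapping wins and losses gives one minus the original result.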
Reproduce our Blitz list on your machine (summary)
- Run CuteChess-CLI with the command shown above (exact TC, openings, Threads/Hash, TB depth).
- Compute ratings:
  - BayesElo: `readpgn → elo → mm → ratings → x`
  - Ordo: process the same PGN to cross-check placements and error bars.
- Compare your Elo table with ours; differences should be within error margins provided engines, nets, TC, openings, and constraints are identical.
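The final comparison step can be automated with a simple overlap test. This sketch treats two ratings as compatible when their difference does not exceed the sum of both error margins, which is one conservative reading of "within error margins" and an assumption on our part. It also assumes both tables are anchored to the same reference, and `outside_margin` is a hypothetical helper.

```python
from typing import Dict, List, Tuple

Entry = Tuple[float, float]  # (elo, error margin)


def outside_margin(ours: Dict[str, Entry], yours: Dict[str, Entry]) -> List[str]:
    """Return engines whose two ratings differ by more than the combined error."""
    flagged = []
    for name in sorted(set(ours) & set(yours)):
        elo_a, err_a = ours[name]
        elo_b, err_b = yours[name]
        if abs(elo_a - elo_b) > err_a + err_b:
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    # Placeholder names and numbers, not published results.
    published = {"A": (3800.0, 10.0), "B": (3700.0, 10.0)}
    replicated = {"A": (3815.0, 10.0), "B": (3650.0, 10.0)}
    print(outside_margin(published, replicated))
```

Engines that appear in the returned list are worth a second look: check the build, the network file, and the opening suite before rerunning.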
The CSV/JSON snapshot templates use these columns: rank, engine, version, tc, games, elo, error, los, ptnml, pgn_url. Template rows are illustrative placeholders and are replaced with BayesElo/Ordo outputs as updates are published.
180s+1s 070925
Name | Elo | Games | Score | Draws |
---|---|---|---|---|
stockfish 17.1 | 3833 | 560 | 69% | 39% |
Obsidian 16.14 | 3770 | 560 | 60% | 35% |
berserk_20250606 | 3758 | 560 | 58% | 40% |
caissa-1.23 | 3704 | 560 | 49% | 40% |
Clover.9.0-old | 3700 | 560 | 49% | 40% |
stormphrax_700 | 3686 | 560 | 47% | 39% |
dragon-3.3-64bit | 3679 | 560 | 46% | 39% |
RubiChess-20240817 | 3620 | 560 | 37% | 44% |
Ethereal-14.25 (NNUE) | 3620 | 560 | 37% | 38% |
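As a sanity check on the table, the Score column can be approximated from the Elo column with the plain logistic Elo curve, averaging each engine's expected score against the other eight. This is a rough sketch: BayesElo's own model adds draw and White-advantage terms, so exact agreement should not be expected, and the function names are hypothetical.

```python
from typing import Dict

# Elo values copied from the table above.
RATINGS: Dict[str, float] = {
    "stockfish 17.1": 3833,
    "Obsidian 16.14": 3770,
    "berserk_20250606": 3758,
    "caissa-1.23": 3704,
    "Clover.9.0-old": 3700,
    "stormphrax_700": 3686,
    "dragon-3.3-64bit": 3679,
    "RubiChess-20240817": 3620,
    "Ethereal-14.25 (NNUE)": 3620,
}


def expected_score(diff: float) -> float:
    """Logistic Elo curve: expected score for a rating advantage of diff."""
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))


def field_score(name: str, ratings: Dict[str, float]) -> float:
    """Average expected score of one engine against all the others."""
    others = [r for other, r in ratings.items() if other != name]
    return sum(expected_score(ratings[name] - r) for r in others) / len(others)


if __name__ == "__main__":
    for engine in RATINGS:
        print(f"{engine:24s} {field_score(engine, RATINGS):.2f}")
```

For stockfish 17.1 this gives about 0.69, in line with the 69% listed in the Score column.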