Revolution vs HypnoS at 60s+2.0 (Fritz 20, 4 Threads, 512 MB Hash)
A data-driven match report on the UHO_2022_8mvs_big_+100_+129 suite
Executive summary
This report analyses a 100-game head-to-head match between revolution v.2.30 dev-120925, avx (hereafter Revolution) and HypnoS ++ 1.03 popcnt (HypnoS) played in Fritz 20 at 60 seconds + 2.0 seconds increment (60s+2.0), with 4 threads per engine and 512 MB hash for both engines. Openings were drawn from UHO_2022_8mvs_big_+100_+129, with colour-reversed pairs so both engines faced the same line with both colours.
Top-line result: Revolution scored 55.0/100 (55%) to HypnoS’s 45.0/100 (45%). In Elo terms that translates to +35 Elo by ORDO and +19 Elo by BayesElo on this dataset—an edge, but not a commanding one, given the statistical error bars at this sample size.
Crucially, because the UHO suite used was the “big” one (roughly +100 to +129 cp for White after 8 moves), the match exhibits a huge colour effect: White scored 90% overall, with 80 White wins and 20 draws; there were no Black wins in the PGN. That is expected for this particular suite and time control, and it strongly conditions how to read engine performance.
Below you’ll find (i) the full test context and methodology, (ii) results and confidence intervals, (iii) colour-split and opening-family patterns, (iv) game-length distributions, and (v) practical takeaways for users deciding between these engines at fast controls.
Test context & methodology (fairness and reproducibility)
- GUI: ChessBase Fritz 20
- Time control: 60 seconds + 2.0 seconds per move
- Threads: 4 per engine
- Hash: 512 MB per engine
- Ponder: (not specified in the PGN; analysis assumes it was off unless you state otherwise)
- Openings: UHO_2022_8mvs_big_+100_+129; this suite seeds positions 8 moves deep with a White advantage of roughly +100 to +129 cp. The skew is deliberate: it stresses conversion and defensive technique from inferior starts.
- Pairing: Colour-reversed pairs (each engine plays each opening as White and Black).
- Dataset: 100 games.
- Engines:
- Revolution: revolution v.2.30 dev-120925, avx
- HypnoS: HypnoS ++ 1.03 popcnt
- Files analysed: the BayesElo and ORDO summaries and the match PGN provided. Headline figures: BayesElo gives Revolution +19 Elo (55% score, 100 games, 20% draws); the ORDO table gives +35.2 Elo for the same 55–45 points split. The two tools agree on direction but not on magnitude.
What the numbers say
Overall scoreboard
- Revolution: 45 wins, 20 draws, 35 losses → 55.0/100
- HypnoS: 35 wins, 20 draws, 45 losses → 45.0/100
- Draw rate: 20% (20/100). BayesElo summary confirms this draw rate.
Colour split (from the PGN)
- White results: 80 wins, 20 draws, 0 losses → White scored 90%.
- Black results: 0 wins, 20 draws, 80 losses → Black scored 10%.
- Per engine by colour:
- Revolution as White: 45-0-5 (W-L-D) → 45 wins, 5 draws, 0 losses
- Revolution as Black: 0-35-15 → 0 wins, 15 draws, 35 losses
- HypnoS as White: 35-0-15 → 35 wins, 15 draws, 0 losses
- HypnoS as Black: 0-45-5 → 0 wins, 5 draws, 45 losses
This distribution is not an evaluation bug; it simply reflects the suite’s deliberate white bias at this fast control. In other words: the better engine is the one that converts more of those white edges and defends a bit better when it gets the inferior side. Revolution did both, hence the 55–45 margin.
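The colour split above can be tallied from PGN headers with standard-library Python. This is a minimal sketch, not the script used for this report: it assumes each game carries standard White, Black and Result tags, and the three-game `SAMPLE_PGN` and the helper names `iter_games` and `colour_split` are hypothetical.

```python
import re
from collections import Counter

# Hypothetical three-game sample; a real run would read the match PGN from disk.
SAMPLE_PGN = """
[White "Revolution"]
[Black "HypnoS"]
[Result "1-0"]

1. e4 d6 1-0

[White "HypnoS"]
[Black "Revolution"]
[Result "1/2-1/2"]

1. e4 d6 1/2-1/2

[White "HypnoS"]
[Black "Revolution"]
[Result "1-0"]

1. e4 d6 1-0
"""

TAG = re.compile(r'^\[(\w+)\s+"(.*)"\]$')

# Result string -> (outcome for White, outcome for Black)
OUTCOMES = {
    "1-0": ("win", "loss"),
    "0-1": ("loss", "win"),
    "1/2-1/2": ("draw", "draw"),
}


def iter_games(pgn_text):
    """Yield one tag dictionary per game; movetext is skipped."""
    tags, in_movetext = {}, False
    for raw in pgn_text.splitlines():
        line = raw.strip()
        match = TAG.match(line)
        if match:
            if in_movetext:  # first tag after movetext starts a new game
                yield tags
                tags, in_movetext = {}, False
            tags[match.group(1)] = match.group(2)
        elif line and tags:
            in_movetext = True
    if tags:
        yield tags


def colour_split(pgn_text):
    """Count results keyed by (engine, colour, outcome)."""
    tally = Counter()
    for tags in iter_games(pgn_text):
        outcome = OUTCOMES.get(tags.get("Result"))
        if outcome is None:  # unfinished game ("*") or missing tag
            continue
        tally[(tags["White"], "White", outcome[0])] += 1
        tally[(tags["Black"], "Black", outcome[1])] += 1
    return tally


tally = colour_split(SAMPLE_PGN)
for key in sorted(tally):
    print(key, tally[key])
```

Keying on the engine name exactly as it appears in the tags keeps the sketch simple; a real PGN would use the full engine strings, so the keys would need to match those.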
Elo estimates & confidence
- ORDO table: Revolution 3735.2 vs HypnoS 3700.0 → +35 Elo edge by points (55–45).
- BayesElo table: Revolution +19 Elo (± about 30 Elo given the +/– in the file), 55% score, 100 games, 20% draws. This is a small advantage, and the error margin still spans zero.
At 100 games and a 20% draw rate, the sampling error is still large. A rough z-approximation on 55/100 against a 50% null (binomial standard error of 0.05) yields z≈1.0, p≈0.32, which is not statistically conclusive on its own at the two-sided 5% threshold. Put plainly: Revolution leads, but more games are needed before the lead can be called proven.
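The rough z-figure can be checked with a short calculation. The sketch below is an illustration under stated assumptions, not the method used by BayesElo or ORDO: it treats the 100 games as independent (colour-reversed pairs are not strictly independent), and the function name `z_test_vs_even` is hypothetical. It offers both the plain binomial standard error and a win/draw/loss variance that accounts for draws.

```python
import math


def z_test_vs_even(wins, draws, losses, use_draws=True):
    """Two-sided z-test of the match score against a 50% null.

    use_draws=False uses the binomial variance 0.25 per game;
    use_draws=True uses the observed win/draw/loss variance.
    """
    games = wins + draws + losses
    mean = (wins + 0.5 * draws) / games
    if use_draws:
        # E[x^2] - mean^2 with per-game scores 1, 0.5 and 0
        variance = (wins + 0.25 * draws) / games - mean ** 2
    else:
        variance = 0.25
    std_err = math.sqrt(variance / games)
    z = (mean - 0.5) / std_err
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided


z_binom, p_binom = z_test_vs_even(45, 20, 35, use_draws=False)
z_wdl, p_wdl = z_test_vs_even(45, 20, 35, use_draws=True)
print(f"binomial: z={z_binom:.2f}, p={p_binom:.2f}")
print(f"win/draw/loss: z={z_wdl:.2f}, p={p_wdl:.2f}")
```

With the match result of 45 wins, 20 draws and 35 losses, the binomial version gives z of about 1.0 (p about 0.32) and the draw-aware version gives z of about 1.13 (p about 0.26). Either way the result stays short of the 5% threshold.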
How the match unfolded (visuals you can embed)
I generated a set of charts straight from your PGN to make the key patterns obvious. Use either Gutenberg “Image” blocks or HTML blocks with the snippets below (upload the PNGs to your Media Library and replace src with your URLs).
1) Cumulative score (Revolution)
- Download: revolution_cumulative_score.png
- HTML embed (Gutenberg → “Custom HTML” block):
<figure class="wp-block-image size-large" style="text-align:center">
<img src="YOUR-URL/revolution_cumulative_score.png" alt="Cumulative score: Revolution vs HypnoS (100 games, 60s+2.0)" loading="lazy" style="max-width:100%;height:auto" />
<figcaption>Revolution’s cumulative score rises steadily to 55/100.</figcaption>
</figure>
Reading the graph: The steady, near-linear incline indicates Revolution’s edge is persistent rather than dependent on one burst of wins. There’s no long plateau where HypnoS claws back parity; Revolution just keeps nudging ahead.
2) Result profile by engine (wins, draws, losses)
- Download: results_by_engine.png
- HTML embed:
<figure class="wp-block-image size-large" style="text-align:center">
<img src="YOUR-URL/results_by_engine.png" alt="Match results by engine: 100 games, wins/draws/losses" loading="lazy" style="max-width:100%;height:auto" />
<figcaption>Revolution converts more often and also holds slightly better as Black (more draws).</figcaption>
</figure>
Interpretation: Revolution’s +10 decisive results swing (45 wins vs 35) aligns with the 55–45 score; note also the draw differential as Black (Revolution 15 vs HypnoS 5), i.e., Revolution defends the worse side more often.
3) Game-length distribution
- Download: game_length_histogram.png
- HTML embed:
<figure class="wp-block-image size-large" style="text-align:center">
<img src="YOUR-URL/game_length_histogram.png" alt="Histogram of game lengths in moves" loading="lazy" style="max-width:100%;height:auto" />
<figcaption>Most games fall between ~60 and ~120 moves; median = 88 moves.</figcaption>
</figure>
Stats: Mean 92.8 moves, Median 88, Min 41, Max 250 (yes—there are a few very long grinds). The fast base plus increment (60s+2.0) and the suite’s White-favoured starts promote lengthy technical conversions and drawn-out defences.
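Summary statistics of this kind can be derived from per-game ply counts. The following is a small sketch under assumptions: it supposes the ply count of each game is already available (for example from a PlyCount tag, if the GUI writes one), the input list is hypothetical and is not the match data, and the function name `length_summary` is illustrative.

```python
import statistics


def length_summary(ply_counts):
    """Summarise game lengths in full moves from a list of ply counts."""
    # A game of 81 plies ends on White's 41st move, so round up.
    moves = [(plies + 1) // 2 for plies in ply_counts]
    return {
        "mean": statistics.mean(moves),
        "median": statistics.median(moves),
        "min": min(moves),
        "max": max(moves),
    }


# Hypothetical ply counts, for illustration only.
summary = length_summary([81, 120, 200])
print(summary)
```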
4) Opening families (ECO) — frequency and white score
- Downloads: opening_family_frequency.png, white_score_by_opening_family.png
- HTML embeds:
<figure class="wp-block-image size-large" style="text-align:center">
<img src="YOUR-URL/opening_family_frequency.png" alt="Opening family frequency (ECO A–E)" loading="lazy" style="max-width:100%;height:auto" />
<figcaption>Most games fall in ECO B (Semi-Open) and ECO A (Flank/English/Benoni/Old Indian families).</figcaption>
</figure>
<figure class="wp-block-image size-large" style="text-align:center">
<img src="YOUR-URL/white_score_by_opening_family.png" alt="White score by opening family" loading="lazy" style="max-width:100%;height:auto" />
<figcaption>White dominated across families, consistent with the suite’s +100 to +129 cp start.</figcaption>
</figure>
Notes: ECO B was the most frequent bucket here, followed by A and C. White’s scoring is high across all buckets—no surprise given the openings—but be cautious with small samples (e.g., ECO D/E in this run).
Tables you can paste into Gutenberg
I prepared two responsive tables that match your site’s compact style. Paste them into a “Custom HTML” block. You can reuse the CSS class for any future tables.
- Download the ready-made HTML fragment for the main summary: match_summary_table.html
- Download the ready-made HTML fragment for the top ECO table: top_eco_table.html
- CSV exports (if you prefer to post-process): match_summary.csv, colour_breakdown.csv, top_eco_results.csv
A) Match summary table (drop into a Gutenberg HTML block)
| Engine | Games | Wins | Draws | Losses | Points | Score % |
|---|---|---|---|---|---|---|
| HypnoS ++ 1.03 popcnt | 100 | 35 | 20 | 45 | 45.0 | 45.0 |
| revolution v.2.30 dev-120925, avx | 100 | 45 | 20 | 35 | 55.0 | 55.0 |
B) Top ECO lines in this run (frequency and Revolution’s score)
| ECO | Games | Revolution Points | Revolution Score % | White Score % |
|---|---|---|---|---|
| B00 | 8 | 3.5 | 43.8 | 81.2 |
| B08 | 8 | 4.0 | 50.0 | 87.5 |
| B07 | 6 | 2.5 | 41.7 | 91.7 |
| C45 | 4 | 2.0 | 50.0 | 100.0 |
| C01 | 4 | 2.5 | 62.5 | 87.5 |
| B06 | 4 | 2.5 | 62.5 | 87.5 |
| B01 | 4 | 2.5 | 62.5 | 87.5 |
| E90 | 4 | 3.0 | 75.0 | 75.0 |
| A43 | 4 | 2.0 | 50.0 | 75.0 |
| A45 | 2 | 1.0 | 50.0 | 100.0 |
| B41 | 2 | 1.0 | 50.0 | 100.0 |
| E14 | 2 | 1.0 | 50.0 | 100.0 |
(The ECO table is a descriptive snapshot of this run; small-sample rows—e.g., 2 or 4 games—should not be over-interpreted.)
Interpreting performance under a White-biased suite
Because UHO_2022_8mvs_big_+100_+129 gives White a head start (roughly a pawn), the match offers a stress test of two skills:
- Conversion: Can the engine convert good White positions reliably and efficiently at fast time controls?
- Damage limitation: Can the engine save games with Black—by building fortress-like defences, navigating endgames flawlessly, or forcing repetition?
Revolution outperformed HypnoS on both fronts in this run:
- As White: Revolution scored 47.5/50 = 95% (45 wins, 5 draws), while HypnoS scored 42.5/50 = 85% (35 wins, 15 draws).
- As Black: Revolution scraped 7.5/50 = 15% (15 draws, no wins), while HypnoS achieved 2.5/50 = 5% (5 draws, no wins).
The 10-point gap splits evenly between the two colours: Revolution collected 5 points more than HypnoS as White (47.5 vs 42.5) and 5 points more as Black (7.5 vs 2.5). If you use this particular suite to tune defensive parameters, Revolution already looks slightly better calibrated than HypnoS at fast time controls for resource-seeking with the inferior side.
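The per-colour scores follow directly from the win/draw/loss counts in the colour split, with a draw counted as half a point. A minimal sketch of that arithmetic, using the counts reported in this article (the function name `score` and the dictionary layout are illustrative choices):

```python
def score(wins, draws, losses):
    """Return (points, score percentage) with a draw worth half a point."""
    games = wins + draws + losses
    points = wins + 0.5 * draws
    return points, 100.0 * points / games


# (wins, draws, losses) per engine and colour, from the colour split
split = {
    ("Revolution", "White"): (45, 5, 0),
    ("Revolution", "Black"): (0, 15, 35),
    ("HypnoS", "White"): (35, 15, 0),
    ("HypnoS", "Black"): (0, 5, 45),
}

for (engine, colour), wdl in split.items():
    points, pct = score(*wdl)
    print(f"{engine} as {colour}: {points}/50 = {pct:.0f}%")

# Revolution's margin over HypnoS, colour by colour
white_gap = score(45, 5, 0)[0] - score(35, 15, 0)[0]
black_gap = score(0, 15, 35)[0] - score(0, 5, 45)[0]
print(f"gap as White: {white_gap}, gap as Black: {black_gap}")
```

The two gaps sum to the 10-point match margin (55 vs 45).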
Game length & endgame discipline
A median of 88 moves and long tails (max 250 moves) tell us two things:
- Engines needed time to convert even from a good starting eval. This is consistent with increment-driven endurance at 60s+2.0 and with imperfect tablebase access at 1-minute base (engines often nurse edges through many micro-improvements).
- Defensive tasking is meaningful: Although Black never won, 35 of Revolution’s games as Black were lost while 15 were saved; that 30% salvage rate is material at this time control and explains the final spread.
From a tuning viewpoint, consider that late-move pruning, null-move depth reductions, and contempt-like features can swing these endgame grind outcomes. If HypnoS evolves with better endgame management or selective search tweaks for short base times, it might erase this gap quickly.
Opening families and what they (don’t) prove
The ECO-bucket histograms show most games in B and A families here, with White scoring well everywhere—as expected. There is no family where Black suddenly performs well. For hypothesis generation:
- B00/B06/B07/B08 (uncommon replies to 1.e4, the Modern Defence, and the Pirc) appear often. Revolution scored roughly 42–63% in these codes, i.e., in line with its overall edge rather than dominant.
- E90 (King’s Indian main territory) delivered Revolution 75%, but only four games—too few for a claim.
Conclusion: The edge is broad-based, not one opening trick. The White skew of the suite drives outcomes; within that constraint, Revolution simply converts a bit better and defends a bit harder.
How strong is “+19 to +35 Elo” here?
According to the provided BayesElo summary, Revolution is +19 Elo at 55% with 20% draws over 100 games; ORDO translates the same 55–45 points into +35 Elo on its scale. Both point in the same direction (Revolution ahead), but the confidence bounds overlap zero for BayesElo in this sample (±~30 Elo). Verdict: A real but modest edge is likely; if you want research-grade certainty, extend to 400–800 games or run multiple opening suites, including neutral sets (e.g., UHO_2024_8mvs_neutral).
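To see why 100 games is thin and 400–800 is a sensible target, the score percentage can be converted to an Elo difference with the standard logistic formula, together with a rough 95% interval. This is a sketch under assumptions, not a reimplementation of BayesElo or ORDO (both use their own models, which is why BayesElo reports +19 rather than +35): games are treated as independent, and the function names `elo_from_score` and `elo_interval` are hypothetical.

```python
import math


def elo_from_score(score):
    """Logistic Elo difference implied by a score fraction in (0, 1)."""
    return 400.0 * math.log10(score / (1.0 - score))


def elo_interval(score, draw_rate, games, z=1.96):
    """Approximate Elo interval from the win/draw/loss variance.

    Assumes independent games and that both score bounds stay
    strictly between 0 and 1.
    """
    win_rate = score - draw_rate / 2.0
    variance = win_rate + 0.25 * draw_rate - score ** 2
    std_err = math.sqrt(variance / games)
    low, high = score - z * std_err, score + z * std_err
    return elo_from_score(low), elo_from_score(high)


print(f"point estimate: {elo_from_score(0.55):+.1f} Elo")
for n in (100, 400, 800):
    low, high = elo_interval(0.55, 0.20, n)
    print(f"{n} games: {low:+.1f} to {high:+.1f} Elo")
```

At 55% the point estimate is about +35 Elo, matching the ORDO figure. Under these assumptions the 100-game interval runs from about -26 to +98 Elo, so it still includes zero, and the lower bound only clears zero at around 400 games if the 55% score were to hold.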
Practical implications for your own use
- At very fast controls (bullet-ish with increment), Revolution looks slightly preferable if you value consistent conversion from a plus and stouter resource-finding as Black from inferior starts.
- HypnoS remains competitive; it trails largely due to fewer Black saves and slightly lower White conversion.
- If you mostly play balanced/neutral opening suites or longer time controls, the White bias observed here will shrink and the ranking could tighten.
- For engine room operations (e.g., gauntlets that seed many IM/GM-like micro-advantages), Revolution’s defensive draws can be meaningful rating-point savers.
Reproduce and extend
- Keep threads, hash, ponder setting, and tablebase paths strictly identical.
- Stick with colour-reversed pairs; they cancel most of each opening's built-in bias, which matters most in small samples.
- If you want to stress defence, continue with “big” suites; if you want a general rating signal, blend neutral and mild-plus suites.
- For significance, target ≥400 games per head-to-head, or pool results across three suites (e.g., UHO neutral, UHO mild, UHO big).
Conclusion
Revolution v.2.30 dev defeated HypnoS ++ 1.03 by 55–45 at 60s+2.0 in Fritz 20 (4 threads, 512 MB hash) on the UHO_2022_8mvs_big_+100_+129 suite. On this test, Revolution is modestly stronger (+19 Elo by BayesElo, +35 Elo by ORDO) thanks to better White conversion and markedly better survival with Black in a suite that heavily favours White. The dataset’s 20% draw rate and 100-game size mean the lead is suggestive rather than decisive; a longer run or more neutral suites could narrow the gap. Still, if your workflow prioritises fast incremental conversions and tenacious defence, Revolution is the safer pick today on this configuration.
Graphics
game_length_histogram

opening_family_frequency

results_by_engine

revolution_cumulative_score


Jorge Ruiz
connoisseur of both chess and anthropology, a combination that reflects his deep intellectual curiosity and passion for understanding both the art of strategic chess books
