ReactionOptimisation
Task 34 / 47
mit_case1_mixed
Mixed-variable reaction yield optimization based on the MIT_case1 setting: continuous process variables plus a categorical catalyst choice. The task stresses black-box optimization with discrete decisions and is evaluated through the benchmark's unified hook into SUMMIT-style verification, a setup common in digital-twin reaction tuning.
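To make the mixed-variable structure concrete, here is a minimal sketch of a random-search baseline over continuous process variables plus one categorical catalyst choice. Everything in it is an assumption made for illustration: the variable names, bounds, and catalyst labels are hypothetical, and `toy_yield` is a stand-in objective, not the SUMMIT MIT_case1 model or the benchmark's evaluation hook.

```python
import random

# Hypothetical search space: names, bounds, and catalyst labels are
# illustrative assumptions, not the benchmark's actual MIT_case1 definition.
CONTINUOUS_BOUNDS = {
    "temperature_c": (30.0, 110.0),
    "residence_time_s": (60.0, 600.0),
}
CATALYSTS = ["cat_a", "cat_b", "cat_c"]


def toy_yield(params):
    """Stand-in black-box objective (NOT the SUMMIT model).

    Each catalyst has its own preferred temperature and yield ceiling,
    so the best continuous settings depend on the categorical choice.
    """
    best_temp = {"cat_a": 60.0, "cat_b": 80.0, "cat_c": 100.0}[params["catalyst"]]
    ceiling = {"cat_a": 0.6, "cat_b": 0.9, "cat_c": 0.75}[params["catalyst"]]
    temp_term = 1.0 - ((params["temperature_c"] - best_temp) / 80.0) ** 2
    time_term = params["residence_time_s"] / 600.0
    return max(0.0, ceiling * temp_term * time_term)


def sample(rng):
    """Draw one candidate: uniform continuous values plus a random catalyst."""
    point = {name: rng.uniform(lo, hi) for name, (lo, hi) in CONTINUOUS_BOUNDS.items()}
    point["catalyst"] = rng.choice(CATALYSTS)
    return point


def random_search(objective, budget, seed=0):
    """Evaluate `budget` random candidates and keep the highest-scoring one."""
    rng = random.Random(seed)
    best_point, best_value = None, float("-inf")
    for _ in range(budget):
        candidate = sample(rng)
        value = objective(candidate)
        if value > best_value:
            best_point, best_value = candidate, value
    return best_point, best_value


if __name__ == "__main__":
    point, value = random_search(toy_yield, budget=200)
    print(point, round(value, 3))
```

Random search is used here only because it handles categorical and continuous variables uniformly with no extra machinery; it is a baseline, not the method any leaderboard participant is known to have used.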
Model leaderboard
| # | Participant | Score |
|---|---|---|
| 1 | GPT-5.4 | 100.0 |
| 2 | Claude Opus 4.6 | 100.0 |
| 3 | DeepSeek V3.2 | 99.5 |
| 4 | Gemini 3.1 Pro Preview | 81.3 |
| 5 | GLM-5 | 75.9 |
| 6 | SEED 2.0 Pro | 71.5 |
| 7 | Qwen3 Coder Next | 71.0 |
| 8 | Grok 4.20 | 0.0 |
Framework leaderboard
| # | Participant | Score |
|---|---|---|
| 1 | GPT-OSS + ShinkaiEvolve | 100.0 |
| 2 | GPT-OSS + ABMCTS | 81.7 |
| 3 | Claude Opus 4.6 + OpenEvolve | 0.0 |
| 4 | Claude Opus 4.6 + ShinkaiEvolve | 0.0 |
| 5 | Claude Opus 4.6 + ABMCTS | 0.0 |
| 6 | GPT-OSS + OpenEvolve | 0.0 |
Score is the normalized score for this task (0–100, higher is better).