SustainableDataCenterControl
Task 46 / 47
hand_written_control
A SustainDC joint-control benchmark that coordinates load shifting, cooling, and battery dispatch to reduce energy cost or carbon emissions while meeting SLA-style constraints. The large, coupled state and the multi-objective trade-offs reflect real sustainable datacenter operations. Submissions are evaluated through the unified pipeline.
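To make the three coupled decisions concrete, here is a minimal sketch of what a hand-written rule-based controller for this kind of task might look like. It is not the SustainDC API or the benchmark's reference policy: the `Observation` and `Action` types, the field names, the action labels, and all thresholds are hypothetical, and every input is assumed to be normalized to the range 0 to 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """Hypothetical observation; all fields assumed normalized to [0, 1]."""
    carbon_intensity: float  # grid carbon intensity, 0 = clean, 1 = dirty
    ambient_temp: float      # outdoor temperature, 0 = cool, 1 = hot
    queue_fill: float        # fraction of the deferrable-task queue in use
    battery_soc: float       # battery state of charge


@dataclass(frozen=True)
class Action:
    """Hypothetical joint action, one label per subsystem."""
    load_shift: str  # "defer", "hold", or "run_deferred"
    cooling: str     # "raise_setpoint", "hold", or "lower_setpoint"
    battery: str     # "charge", "idle", or "discharge"


def rule_based_action(obs, high_ci=0.7, low_ci=0.3, queue_limit=0.8):
    """Pick a joint action from simple thresholds (all values are assumptions)."""
    # Load shifting: the SLA guard comes first, so a nearly full queue is
    # drained even when the grid is carbon-intensive.
    if obs.queue_fill >= queue_limit:
        load_shift = "run_deferred"
    elif obs.carbon_intensity >= high_ci:
        load_shift = "defer"
    elif obs.carbon_intensity <= low_ci and obs.queue_fill > 0:
        load_shift = "run_deferred"
    else:
        load_shift = "hold"

    # Cooling: protect the hardware in hot weather; otherwise relax the
    # setpoint when the grid is dirty and the weather is mild.
    if obs.ambient_temp >= 0.8:
        cooling = "lower_setpoint"
    elif obs.carbon_intensity >= high_ci and obs.ambient_temp < 0.5:
        cooling = "raise_setpoint"
    else:
        cooling = "hold"

    # Battery: discharge on a dirty grid, recharge on a clean one, and keep
    # the state of charge inside a hypothetical 0.2 to 0.9 band.
    if obs.carbon_intensity >= high_ci and obs.battery_soc > 0.2:
        battery = "discharge"
    elif obs.carbon_intensity <= low_ci and obs.battery_soc < 0.9:
        battery = "charge"
    else:
        battery = "idle"

    return Action(load_shift, cooling, battery)
```

In this sketch the SLA guard on the queue is checked before any carbon rule, which is one simple way to express "reduce carbon while meeting constraints"; a real submission would tune or replace these rules against the actual environment.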
Model leaderboard
| # | Participant | Score |
|---|---|---|
| 1 | Qwen3 Coder Next | 100.0 |
| 2 | SEED 2.0 Pro | 95.8 |
| 3 | Claude Opus 4.6 | 60.1 |
| 4 | GLM-5 | 51.0 |
| 5 | DeepSeek V3.2 | 31.0 |
| 6 | Grok 4.20 | 26.2 |
| 7 | Gemini 3.1 Pro Preview | 20.0 |
| 8 | GPT-5.4 | 0.0 |
Framework leaderboard
| # | Participant | Score |
|---|---|---|
| 1 | GPT-OSS + ShinkaiEvolve | 100.0 |
| 2 | GPT-OSS + ABMCTS | 95.3 |
| 3 | Claude Opus 4.6 + ShinkaiEvolve | 93.1 |
| 4 | Claude Opus 4.6 + ABMCTS | 87.7 |
| 5 | Claude Opus 4.6 + OpenEvolve | 59.4 |
| 6 | GPT-OSS + OpenEvolve | 0.0 |
Score is the normalized score for this task (0–100, higher is better).
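The page does not say how scores are normalized. Because the best entry in each table is 100.0 and the worst is 0.0, one scheme consistent with the tables is min-max scaling within each leaderboard. The sketch below illustrates that assumption only; the function name `min_max_normalize` and the raw values in the usage note are hypothetical and are not benchmark data.

```python
def min_max_normalize(raw_scores, higher_is_better=True):
    """Scale raw scores to 0-100 so the best entry gets 100 and the worst 0.

    Assumed scheme, not confirmed by the source. `raw_scores` maps a
    participant name to a raw metric value.
    """
    lo = min(raw_scores.values())
    hi = max(raw_scores.values())
    if hi == lo:
        # All participants tie; giving everyone 100.0 is an arbitrary choice.
        return {name: 100.0 for name in raw_scores}
    span = hi - lo
    normalized = {}
    for name, value in raw_scores.items():
        # For cost-like metrics (lower is better) the scale is flipped.
        fraction = (value - lo) / span if higher_is_better else (hi - value) / span
        normalized[name] = round(100.0 * fraction, 1)
    return normalized
```

For example, with hypothetical raw values `{"a": 2.0, "b": 5.0, "c": 4.0}`, the call `min_max_normalize(...)` returns `{"a": 0.0, "b": 100.0, "c": 66.7}`. Under this scheme a score of 0.0 means only "lowest in this table", not that the participant achieved nothing.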