Kernel Engineering
Task 18 / 47
FlashAttention
Optimize a FlashAttention-style forward kernel for causal scaled dot-product attention on a GPU while numerically matching a reference implementation. The problem stresses tiled online softmax and memory locality. Scoring measures both speed and correctness on fixed shapes; the task is representative of the bandwidth-bound attention kernel work found in production ML stacks.
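To make "tiled online softmax" concrete, here is a minimal CPU sketch in NumPy. It is not the task's reference implementation or a GPU kernel. The function names, the `block` parameter, and the single-head `(n, d)` layout are assumptions chosen for illustration. The sketch shows the core idea only: process keys and values one tile at a time, keep a running row maximum and row sum, and rescale the partial output whenever the maximum changes, so the full `(n, n)` score matrix is never materialized.

```python
import numpy as np

def causal_attention_reference(q, k, v):
    """Naive causal attention that materializes the full (n, n) score matrix."""
    n, d = q.shape
    scores = (q @ k.T) / np.sqrt(d)
    future = np.triu(np.ones((n, n), dtype=bool), k=1)  # True above the diagonal
    scores = np.where(future, -np.inf, scores)
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def causal_attention_tiled(q, k, v, block=16):
    """Tiled causal attention with an online softmax (illustrative sketch)."""
    n, d = q.shape
    scale = 1.0 / np.sqrt(d)
    out = np.empty((n, v.shape[1]))
    for qs in range(0, n, block):
        qe = min(qs + block, n)
        q_blk = q[qs:qe]
        rows = np.arange(qs, qe)[:, None]
        m = np.full(qe - qs, -np.inf)          # running row maximum
        l = np.zeros(qe - qs)                  # running softmax denominator
        acc = np.zeros((qe - qs, v.shape[1]))  # running unnormalized output
        # Causality: key tiles that start at or after qe are fully masked, so skip them.
        for ks in range(0, qe, block):
            ke = min(ks + block, n)
            cols = np.arange(ks, ke)[None, :]
            s = (q_blk @ k[ks:ke].T) * scale
            s = np.where(cols > rows, -np.inf, s)  # mask future positions in this tile
            m_new = np.maximum(m, s.max(axis=1))
            alpha = np.exp(m - m_new)              # rescales earlier partial results
            p = np.exp(s - m_new[:, None])
            l = alpha * l + p.sum(axis=1)
            acc = alpha[:, None] * acc + p @ v[ks:ke]
            m = m_new
        out[qs:qe] = acc / l[:, None]
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q, k, v = (rng.standard_normal((50, 8)) for _ in range(3))
    diff = np.abs(causal_attention_tiled(q, k, v) - causal_attention_reference(q, k, v))
    print("max abs difference:", diff.max())
```

On a GPU, each query tile would typically map to a thread block with the key and value tiles staged through shared memory; that mapping is where the memory-locality gains come from, and it is deliberately left out of this sketch.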
Model leaderboard
| # | Participant | Score |
|---|---|---|
| 1 | GPT-5.4 | 100.0 |
| 2 | SEED 2.0 Pro | 0.5 |
| 3 | Gemini 3.1 Pro Preview | 0.4 |
| 4 | DeepSeek V3.2 | 0.4 |
| 5 | Claude Opus 4.6 | 0.4 |
| 6 | Qwen3 Coder Next | 0.1 |
| 7 | GLM-5 | 0.0 |
| 8 | Grok 4.20 | 0.0 |
Framework leaderboard
| # | Participant | Score |
|---|---|---|
| 1 | Claude Opus 4.6 + OpenEvolve | 100.0 |
| 2 | GPT-OSS + ShinkaiEvolve | 98.7 |
| 3 | Claude Opus 4.6 + ShinkaiEvolve | 98.7 |
| 4 | GPT-OSS + OpenEvolve | 56.9 |
| 5 | Claude Opus 4.6 + ABMCTS | 41.1 |
| 6 | GPT-OSS + ABMCTS | 0.0 |
Score is this task's normalized score on a 0–100 scale; higher is better.