Kernel Engineering Task 18 / 47

FlashAttention

Optimize the forward pass of a causal scaled dot-product attention kernel (FlashAttention-style) for GPU execution while numerically matching a reference implementation. The task stresses tiled online softmax and memory locality. Scoring measures speed and correctness on fixed input shapes; the task is representative of bandwidth-bound attention kernel work in production ML stacks.
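
To make "tiled online softmax" concrete, here is a minimal CPU sketch in plain Python. It is not the task's reference or any participant's kernel. The function names, the `block` parameter, and the list-of-lists layout are all hypothetical, chosen for illustration. It shows the running max, running sum, and rescaled accumulator that let a kernel process keys one tile at a time without materializing the full score matrix, and it checks the result against a naive causal attention.

```python
import math

def reference_causal_attention(q, k, v):
    """Naive causal attention: query row i attends to keys 0..i."""
    d, dv = len(q[0]), len(v[0])
    scale = 1.0 / math.sqrt(d)
    out = []
    for i in range(len(q)):
        scores = [scale * sum(a * b for a, b in zip(q[i], k[j]))
                  for j in range(i + 1)]
        m = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append([sum(w * v[j][c] for j, w in enumerate(weights)) / total
                    for c in range(dv)])
    return out

def tiled_causal_attention(q, k, v, block=2):
    """Same result, visiting keys in tiles of size `block` (hypothetical
    parameter) and keeping only O(1) softmax state per query row."""
    d, dv = len(q[0]), len(v[0])
    scale = 1.0 / math.sqrt(d)
    out = []
    for i in range(len(q)):
        m = float("-inf")  # running max of scores seen so far
        l = 0.0            # running sum of exp(score - m)
        acc = [0.0] * dv   # running unnormalized output
        # Tiles that start past the diagonal are skipped entirely.
        for start in range(0, i + 1, block):
            end = min(start + block, i + 1)  # causal mask inside the tile
            scores = [scale * sum(a * b for a, b in zip(q[i], k[j]))
                      for j in range(start, end)]
            m_new = max(m, max(scores))
            # Rescale earlier partial results to the new max.
            # On the first tile m is -inf, so the correction is 0.0.
            correction = math.exp(m - m_new)
            l *= correction
            acc = [a * correction for a in acc]
            for s, j in zip(scores, range(start, end)):
                p = math.exp(s - m_new)
                l += p
                for c in range(dv):
                    acc[c] += p * v[j][c]
            m = m_new
        out.append([a / l for a in acc])
    return out

if __name__ == "__main__":
    # Small deterministic inputs: 6 positions, head dimension 4.
    q = [[math.sin(i + j) for j in range(4)] for i in range(6)]
    k = [[math.cos(i * j + 1) for j in range(4)] for i in range(6)]
    v = [[float(i - j) for j in range(4)] for i in range(6)]
    ref = reference_causal_attention(q, k, v)
    tiled = tiled_causal_attention(q, k, v, block=2)
    err = max(abs(x - y) for r, t in zip(ref, tiled) for x, y in zip(r, t))
    print("max abs difference:", err)
```

On a GPU, the same recurrence would typically run per tile of queries rather than per row, with the key and value tiles staged in on-chip memory so each element is read from global memory once. That is where the memory-locality aspect of the task comes in.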

Model leaderboard

#	Participant	Score
1	GPT-5.4	100.0
2	SEED 2.0 Pro	0.5
3	Gemini 3.1 Pro Preview	0.4
4	DeepSeek V3.2	0.4
5	Claude Opus 4.6	0.4
6	Qwen3 Coder Next	0.1
7	GLM-5	0.0
8	Grok 4.20	0.0

Framework leaderboard

#	Participant	Score
1	Claude Opus 4.6 + OpenEvolve	100.0
2	GPT-OSS + ShinkaiEvolve	98.7
3	Claude Opus 4.6 + ShinkaiEvolve	98.7
4	GPT-OSS + OpenEvolve	56.9
5	Claude Opus 4.6 + ABMCTS	41.1
6	GPT-OSS + ABMCTS	0.0

Scores are normalized for this task to a 0–100 scale; higher is better.