Navers lab
Kernel Engineering Task 20 / 47

TriMul

This benchmark asks for a high-performance TriMul-style GPU kernel that meets strict correctness requirements. Solutions must trade off tiling, memory layout, and occupancy, and the workload is often VRAM-bound on consumer GPUs. Evaluation runs representative workloads and scores both accuracy and speed against the benchmark's reference, which puts the emphasis on specialized, GEMM-like kernel engineering.
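To illustrate why a TriMul-style operator is described as GEMM-like, here is a minimal NumPy sketch of the core "outgoing" triangle contraction, out[i, j, c] = sum over k of a[i, k, c] * b[j, k, c], as used in AlphaFold-style triangle multiplicative updates. This is an assumption about what "TriMul-style" refers to: the benchmark's actual operator (any projections, gating, normalization, shapes, and tolerances) is not given on this page, and the function names, sizes, and tolerances below are hypothetical.

```python
import numpy as np


def trimul_core(a, b):
    """Reference contraction: out[i, j, c] = sum_k a[i, k, c] * b[j, k, c]."""
    return np.einsum("ikc,jkc->ijc", a, b)


def trimul_core_batched_gemm(a, b):
    """Same contraction written as one matrix multiply per channel."""
    # Move the channel axis to the front: (N, N, C) -> (C, N, N).
    a_c = np.transpose(a, (2, 0, 1))
    b_c = np.transpose(b, (2, 0, 1))
    # For each channel c, compute A_c @ B_c^T, which is a plain GEMM.
    out = a_c @ np.transpose(b_c, (0, 2, 1))
    # Restore the original layout: (C, N, N) -> (N, N, C).
    return np.transpose(out, (1, 2, 0))


# Hypothetical small problem size, chosen only for illustration.
rng = np.random.default_rng(0)
n, c = 16, 4
a = rng.standard_normal((n, n, c)).astype(np.float32)
b = rng.standard_normal((n, n, c)).astype(np.float32)

ref = trimul_core(a, b)
fast = trimul_core_batched_gemm(a, b)

# Tolerance-based comparison against the reference; the tolerances here
# are illustrative, not the benchmark's.
matches = np.allclose(ref, fast, rtol=1e-4, atol=1e-4)
```

Written this way, each channel is an independent N x N matrix product, so the tiling, layout, and occupancy choices mentioned above resemble those of a batched GEMM kernel.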

Model leaderboard

Rank  Participant             Score
1     Claude Opus 4.6         100.0
2     Grok 4.20                37.9
3     GLM-5                    20.4
4     DeepSeek V3.2            12.2
5     SEED 2.0 Pro             12.0
6     Gemini 3.1 Pro Preview    2.2
7     Qwen3 Coder Next          0.4
8     GPT-5.4                   0.0

Framework leaderboard

Rank  Participant                      Score
1     Claude Opus 4.6 + ShinkaiEvolve  100.0
2     Claude Opus 4.6 + ABMCTS          10.0
3     GPT-OSS + ABMCTS                   7.1
4     GPT-OSS + OpenEvolve               4.3
5     Claude Opus 4.6 + OpenEvolve       2.3
6     GPT-OSS + ShinkaiEvolve            0.0

Score is normalized for this task to a 0–100 scale; higher is better.