WirelessChannelSimulation Task 47 / 47

HighReliableSimulation

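Estimate the very low bit error rate (BER) of a Hamming(127,120) code over an AWGN channel, a regime where naive Monte Carlo is inefficient because error events are too rare to observe in a feasible number of trials. The task is to design importance-sampling or other variance-reduction samplers that target deep-error events. Submissions are scored under fixed evaluator settings for statistical efficiency and correctness. The task sits in wireless link reliability engineering, using rare-event simulation.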
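As a rough illustration of why importance sampling helps here, the sketch below estimates a scalar Gaussian tail probability P(N(0,1) > threshold) with a mean-shifted sampler and compares it with the closed-form Q-function. This is a toy stand-in, not the task's evaluator or a Hamming(127,120) decoder: the function names `q_function` and `is_tail_estimate`, the choice of shifting the sampling mean to the threshold, and the sample count are all assumptions made for the example.

```python
import math
import random


def q_function(x):
    """Exact Gaussian tail probability Q(x) = P(N(0,1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def is_tail_estimate(threshold, num_samples, seed=0):
    """Mean-shift importance sampling estimate of P(N(0,1) > threshold).

    Samples are drawn from N(threshold, 1) instead of N(0, 1), so about
    half of them land in the rare region. Each hit is reweighted by the
    likelihood ratio p(x) / q(x) = exp(-threshold * x + threshold**2 / 2).
    Returns (estimate, estimated standard error).
    """
    rng = random.Random(seed)  # hypothetical fixed seed for repeatability
    total = 0.0
    total_sq = 0.0
    for _ in range(num_samples):
        x = rng.gauss(threshold, 1.0)  # draw from the shifted density
        if x > threshold:  # rare event under the original density
            w = math.exp(-threshold * x + 0.5 * threshold * threshold)
            total += w
            total_sq += w * w
    mean = total / num_samples
    var = max(total_sq / num_samples - mean * mean, 0.0)
    return mean, math.sqrt(var / num_samples)


if __name__ == "__main__":
    threshold = 6.0  # noise standard deviations; assumed for illustration
    estimate, std_err = is_tail_estimate(threshold, 20000)
    print("importance sampling:", estimate, "+/-", std_err)
    print("exact Q-function:   ", q_function(threshold))
```

At a threshold of 6 standard deviations the exact probability is roughly 1e-9, so naive Monte Carlo would need on the order of a billion trials to observe a single event, whereas the shifted sampler sees one in about half of its draws. A sampler for the coded case would have to bias the full 127-dimensional noise vector toward decoder error regions, which this scalar sketch does not attempt.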

Model leaderboard

#	Participant	Score
1	SEED 2.0 Pro	100.0
2	Claude Opus 4.6	83.9
3	DeepSeek V3.2	83.4
4	Qwen3 Coder Next	39.5
5	GLM-5	23.1
6	Grok 4.20	19.9
7	Gemini 3.1 Pro Preview	2.3
8	GPT-5.4	0.0

Framework leaderboard

#	Participant	Score
1	GPT-OSS + ABMCTS	100.0
2	Claude Opus 4.6 + OpenEvolve	33.4
3	GPT-OSS + OpenEvolve	25.4
4	Claude Opus 4.6 + ABMCTS	15.2
5	Claude Opus 4.6 + ShinkaiEvolve	14.6
6	GPT-OSS + ShinkaiEvolve	0.0

Score is the normalized score for this task (0–100, higher is better).