BrowserBC
Scalable Behaviour Cloning for Browser Use via Skill Distillation.
BrowserBC distills successful human browser trajectories into reusable skills, helping web agents solve tasks with higher success rates and fewer interactions.
Core Idea
Human browsing trajectories contain more than clicks. They encode efficient paths, site-specific logic, and practical decisions that are hard for agents to infer from the current page alone.
BrowserBC turns these trajectories into reusable skills. Those skills give agents prior knowledge for acting under incomplete information, moving them from simply operating websites to operating them efficiently.
Method
BrowserBC extracts task evidence from human runs, summarizes it into skills, and retrieves relevant skills during new browser tasks.
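The three stages above (extract, summarize, retrieve) can be sketched roughly as follows. This is a minimal illustration, not BrowserBC's implementation: the `Skill` record, the `distill` and `retrieve` helpers, the trajectory dictionary layout, and the example runs are all hypothetical. Summarization is replaced by simple de-duplication of repeated actions, and retrieval by word overlap between the task and each skill's goal.

```python
import re
from dataclasses import dataclass


@dataclass
class Skill:
    # Hypothetical skill record; the real schema is not given in the source.
    site: str
    goal: str
    steps: list[str]


def tokens(text: str) -> set[str]:
    """Lowercased alphanumeric words, used for a crude overlap score."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def distill(trajectory: dict) -> Skill:
    """Stand-in for summarization: collapse consecutive repeated actions."""
    steps: list[str] = []
    for action in trajectory["actions"]:
        if not steps or steps[-1] != action:
            steps.append(action)
    return Skill(trajectory["site"], trajectory["goal"], steps)


def retrieve(task: str, skills: list[Skill], k: int = 1) -> list[Skill]:
    """Return up to k skills whose goal shares the most words with the task."""
    query = tokens(task)
    scored = [(len(query & tokens(s.goal)), s) for s in skills]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in scored[:k]]


# Made-up human runs; only successful ones are distilled into skills.
human_runs = [
    {"site": "gitlab", "goal": "create a merge request", "success": True,
     "actions": ["open project", "open project",
                 "click Merge requests", "click New merge request"]},
    {"site": "shopping", "goal": "reorder a past purchase", "success": True,
     "actions": ["open My Orders", "click Reorder"]},
    {"site": "shopping", "goal": "find cheapest item", "success": False,
     "actions": ["search"]},
]

skills = [distill(run) for run in human_runs if run["success"]]
hits = retrieve("Create a merge request for the fix branch", skills)

# The retrieved steps would be handed to the agent as prior knowledge.
prompt_hint = "\n".join(hits[0].steps) if hits else ""
print(prompt_hint)
```

A real system would likely use a language model for the summarization step and a stronger retriever, but the data flow from trajectories to skills to a task-time hint stays the same.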
Results
BrowserBC improves both success rate and interaction efficiency across WebArena-Hard and ClawBench.
| Benchmark | Group | Skill-off | BrowserBC | Gain |
|---|---|---|---|---|
| WebArena-Hard (258 tasks) | Overall | 60.5 | 81.4 | +20.9 |
| | GitLab | 64.9 | 86.0 | +21.1 |
| | Shopping | 60.7 | 89.3 | +28.6 |
| | Shopping admin | 56.4 | 70.9 | +14.5 |
| | | 78.6 | 85.7 | +7.1 |
| | Multi-site | 43.8 | 75.0 | +31.2 |
| ClawBench (152 tasks) | Overall | 32.9 | 68.4 | +35.5 |
| | Daily | 24.6 | 64.9 | +40.3 |
| | Finance | 50.0 | 100.0 | +50.0 |
| | Work | 47.1 | 76.5 | +29.4 |
| | Dev | 33.3 | 66.7 | +33.4 |
| | Academic | 50.0 | 78.6 | +28.6 |
| | Travel | 38.5 | 76.9 | +38.4 |
| | Social | 25.0 | 56.2 | +31.2 |
| | Pets | 27.3 | 54.5 | +27.2 |
Efficiency
On WebArena-Hard, mean tool calls drop from 31.2 to 22.7 and median calls drop from 24 to 16. Skills also transfer across models: Sonnet-distilled skills lift Qwen from 53% to 77%.
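To put the efficiency figures above in relative terms, the short calculation below converts them into percentage reductions. It is only arithmetic on the reported numbers; the helper name `relative_drop` is introduced here for illustration.

```python
def relative_drop(before: float, after: float) -> float:
    """Percentage reduction going from `before` to `after`."""
    return 100 * (before - after) / before


# Reported WebArena-Hard tool-call counts (skill-off vs. BrowserBC).
mean_drop = relative_drop(31.2, 22.7)
median_drop = relative_drop(24, 16)

# Reported Qwen result with Sonnet-distilled skills, as an absolute gain.
qwen_gain_points = 77 - 53

print(f"mean tool calls:   -{mean_drop:.1f}%")
print(f"median tool calls: -{median_drop:.1f}%")
print(f"Qwen gain:         +{qwen_gain_points} points")
```

In other words, the mean number of tool calls falls by roughly 27% and the median by roughly a third, while Qwen gains 24 percentage points.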
Live Case Demos
Watch skill-guided agents complete real browser tasks.