Claude Sonnet 4.5 High (September 2025)

Performance overview across all HAL benchmarks

Benchmarks: 6
Agents: 7
Pareto Optimal Benchmarks: 3

Token Pricing

Input Tokens: $3 per 1M tokens
Output Tokens: $15 per 1M tokens
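
As a rough illustration of how these per-token prices translate into a run cost, the sketch below applies the two listed rates to a pair of token counts. The function name and the example token counts are hypothetical, and the calculation assumes flat pricing only; any caching discounts or tiered rates that may apply in practice are not modeled.

```python
# Listed prices for this model, in USD per 1M tokens.
INPUT_USD_PER_MILLION = 3.00
OUTPUT_USD_PER_MILLION = 15.00


def token_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate cost in USD from token counts, assuming flat per-token pricing."""
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


# Hypothetical run: 2M input tokens and 0.5M output tokens.
example_cost = token_cost_usd(2_000_000, 500_000)
print(f"${example_cost:.2f}")
```

Because output tokens cost five times as much as input tokens, agents that generate long outputs can be noticeably more expensive than their input volume alone would suggest.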

Benchmark Performance

The "On the Pareto Frontier?" column indicates whether this model, paired with the listed agent, achieved a Pareto-optimal trade-off between accuracy and cost on that benchmark. Models on the Pareto frontier represent the current state-of-the-art efficiency for their performance level.
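
A minimal sketch of how such a frontier can be computed is given here, assuming the standard dominance definition: a run is dropped if some other run is at least as accurate and at least as cheap, and strictly better on one of the two. The leaderboard's exact rule is not stated on this page, so this is an assumption, and the `Run` type, the agent names, and the numbers are all hypothetical illustration data rather than leaderboard results.

```python
from typing import List, NamedTuple


class Run(NamedTuple):
    agent: str
    accuracy: float  # percent, higher is better
    cost: float      # USD, lower is better


def pareto_frontier(runs: List[Run]) -> List[Run]:
    """Return the runs that no other run dominates on (accuracy, cost)."""
    frontier = []
    for r in runs:
        dominated = any(
            o.accuracy >= r.accuracy
            and o.cost <= r.cost
            and (o.accuracy > r.accuracy or o.cost < r.cost)
            for o in runs
        )
        if not dominated:
            frontier.append(r)
    return frontier


# Hypothetical runs on a single benchmark (not real leaderboard data).
runs = [
    Run("agent-a", 40.0, 10.0),
    Run("agent-b", 55.0, 50.0),
    Run("agent-c", 50.0, 80.0),   # dominated by agent-b: less accurate, costs more
    Run("agent-d", 70.0, 200.0),
]
print([r.agent for r in pareto_frontier(runs)])
```

Note that the frontier is computed per benchmark across every evaluated model and agent, so a "No" in the table below means some other run on that benchmark was cheaper without being less accurate, or more accurate without costing more.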

Benchmark               Agent                        Accuracy  Cost     On the Pareto Frontier?
Assistantbench          Browser-Use                  11.80%    $99.23   No
Corebench Hard          CORE-Agent                   44.44%    $92.34   Yes
Corebench Hard          HAL Generalist Agent         28.89%    $87.77   No
Gaia                    HAL Generalist Agent         70.91%    $179.86  No
Gaia                    HF Open Deep Research        30.91%    $535.00  No
Scicode                 Scicode Tool Calling Agent   1.54%     $118.14  No
Scienceagentbench       SAB Self-Debug               30.39%    $7.47    Yes
Swebench Verified Mini  SWE-Agent                    72.00%    $463.90  Yes
Swebench Verified Mini  HAL Generalist Agent         40.00%    $95.97   No