Claude Sonnet 4.5 High (September 2025)
Performance overview across all HAL benchmarks
4
Benchmarks
5
Agents
1
Pareto Optimal Benchmarks
Token Pricing
Input Tokens: $3 per 1M tokens
Output Tokens: $15 per 1M tokens
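As a rough illustration of how the per-token rates above translate into a dollar figure, the following sketch multiplies token counts by the listed prices. The function name `estimate_cost` and the token counts in the usage line are hypothetical; only the $3 and $15 per 1M token rates come from this page, and the sketch ignores anything a real bill might include (caching discounts, retries, tool overhead).

```python
# Rates taken from the Token Pricing section of this page (USD per 1M tokens).
INPUT_USD_PER_MILLION = 3.0
OUTPUT_USD_PER_MILLION = 15.0


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return an estimated cost in USD for the given token counts.

    Hypothetical helper for illustration; not part of any HAL tooling.
    """
    input_cost = input_tokens * INPUT_USD_PER_MILLION / 1_000_000
    output_cost = output_tokens * OUTPUT_USD_PER_MILLION / 1_000_000
    return input_cost + output_cost


# Hypothetical token counts, chosen only to show the arithmetic.
example_cost = estimate_cost(input_tokens=2_000_000, output_tokens=500_000)
print(f"${example_cost:.2f}")
```

Because output tokens are priced at five times the input rate, runs that generate long outputs can cost noticeably more than their input volume alone would suggest.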
Benchmark Performance
The "On the Pareto Frontier?" column indicates whether this model achieved a Pareto-optimal trade-off between accuracy and cost on that benchmark, that is, whether no other entry on that benchmark is both cheaper and more accurate. Models on the Pareto frontier represent the current state-of-the-art efficiency for their performance level.
Benchmark | Agent | Accuracy | Cost | On the Pareto Frontier? |
---|---|---|---|---|
Assistantbench | Browser-Use | 11.80% | $99.23 | No |
Corebench Hard | CORE-Agent | 44.44% | $92.34 | Yes |
Corebench Hard | HAL Generalist Agent | 28.89% | $87.77 | No |
Gaia | HAL Generalist Agent | 70.91% | $179.86 | No |
Gaia | HF Open Deep Research | 30.91% | $535.00 | No |
Scicode | Scicode Tool Calling Agent | 1.54% | $118.14 | No |