Claude Haiku 4.5 (October 2025)
Performance overview across all HAL benchmarks
Benchmarks: 5
Agents: 4
Pareto Optimal Benchmarks: 0
Token Pricing
Input Tokens: $1 per 1M tokens
Output Tokens: $5 per 1M tokens
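As a rough illustration of how the listed prices ($1 per 1M input tokens, $5 per 1M output tokens) translate into the cost of a run, the sketch below computes the cost from token counts. The function name `run_cost_usd` and the token counts in the usage line are hypothetical and chosen only for the arithmetic; they are not taken from any HAL run.

```python
# Listed prices for this model, in USD per 1M tokens.
INPUT_PRICE_PER_M = 1.0
OUTPUT_PRICE_PER_M = 5.0


def run_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the token cost of a run in USD (hypothetical helper)."""
    input_cost = input_tokens * INPUT_PRICE_PER_M / 1_000_000
    output_cost = output_tokens * OUTPUT_PRICE_PER_M / 1_000_000
    return input_cost + output_cost


# Hypothetical token counts: 2M input tokens and 300k output tokens.
example_cost = run_cost_usd(2_000_000, 300_000)
print(f"${example_cost:.2f}")
```

Under these assumed counts the input side contributes $2.00 and the output side $1.50, so output tokens can account for a large share of the cost even when they are far fewer in number.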
Benchmark Performance
The "On the Pareto Frontier?" column indicates whether this model achieved a Pareto-optimal trade-off between accuracy and cost on that benchmark. A model is on the Pareto frontier when no other evaluated model is both at least as accurate and no more expensive, so frontier models represent the best accuracy currently available at their cost.
| Benchmark | Agent | Accuracy | Cost | On the Pareto Frontier? |
|---|---|---|---|---|
| Corebench Hard | CORE-Agent | 11.11% | $43.93 | No |
| Gaia | HAL Generalist Agent | 56.36% | $130.81 | No |
| Scicode | Scicode Tool Calling Agent | 0.00% | $232.36 | No |
| Scienceagentbench | SAB Self-Debug | 18.63% | $2.66 | No |
| Swebench Verified Mini | HAL Generalist Agent | 24.00% | $208.56 | No |
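To make the Pareto frontier column concrete, here is a minimal sketch of how a frontier over accuracy and cost could be computed for a single benchmark. It uses the standard dominance definition (a run is dropped if another run is at least as accurate and no more expensive); the page does not state the exact procedure HAL uses, so treat this as an assumption. Only the Claude Haiku 4.5 Gaia entry comes from the table above; the three `hypothetical-model-*` entries are invented purely for illustration.

```python
from typing import NamedTuple


class Run(NamedTuple):
    name: str
    accuracy: float  # percent
    cost: float      # USD


def pareto_frontier(runs: list[Run]) -> list[Run]:
    """Return the runs not dominated by any cheaper-or-equal, more accurate run."""
    frontier = []
    best_accuracy = float("-inf")
    # Sweep from cheapest to most expensive; on equal cost, higher accuracy first.
    for run in sorted(runs, key=lambda r: (r.cost, -r.accuracy)):
        # A run survives only if it beats every cheaper run on accuracy.
        if run.accuracy > best_accuracy:
            frontier.append(run)
            best_accuracy = run.accuracy
    return frontier


runs = [
    Run("Claude Haiku 4.5", 56.36, 130.81),     # Gaia row from the table
    Run("hypothetical-model-a", 60.00, 90.00),   # invented for illustration
    Run("hypothetical-model-b", 40.00, 20.00),   # invented for illustration
    Run("hypothetical-model-c", 70.00, 300.00),  # invented for illustration
]

on_frontier = {run.name for run in pareto_frontier(runs)}
for run in runs:
    print(run.name, "Yes" if run.name in on_frontier else "No")
```

In this invented comparison, Claude Haiku 4.5 is off the frontier because `hypothetical-model-a` is both cheaper and more accurate, which is the kind of situation a "No" in the table describes.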