Claude Sonnet 4 High (May 2025)
Performance overview across all HAL benchmarks
Benchmarks: 2
Agents: 3
Pareto Optimal Benchmarks: 0
Token Pricing
Input Tokens: $3 per 1M tokens
Output Tokens: $15 per 1M tokens
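As a rough illustration of how the per-1M-token prices above translate into a dollar figure, the sketch below estimates the cost of a run from its token counts. The function name `estimate_cost_usd` and the token counts are hypothetical, and the sketch assumes plain input/output pricing with no caching or other discounts, so it will not necessarily reproduce the costs reported in the benchmark table.

```python
# Listed prices for this model, in USD per 1M tokens (from the Token Pricing section).
INPUT_PRICE_PER_M = 3.0
OUTPUT_PRICE_PER_M = 15.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate run cost in USD from token counts (hypothetical helper).

    Assumes flat per-token pricing with no caching or batch discounts.
    """
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


# Hypothetical token counts, chosen only for illustration.
example_cost = estimate_cost_usd(input_tokens=2_000_000, output_tokens=500_000)
print(f"${example_cost:.2f}")  # 2 * $3 + 0.5 * $15 = $13.50
```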
Benchmark Performance
The "On the Pareto Frontier?" column indicates whether this model achieved a Pareto-optimal trade-off between accuracy and cost on that benchmark. Models on the Pareto frontier represent the current state-of-the-art efficiency for their performance level.
| Benchmark | Agent | Accuracy | Cost | On the Pareto Frontier? |
|---|---|---|---|---|
| Corebench Hard | CORE-Agent | 33.33% | $100.48 | No |
| Online Mind2Web | Browser-Use | 39.33% | $1609.92 | No |
| Online Mind2Web | SeeAct | 36.67% | $326.41 | No |
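To make the accuracy/cost trade-off concrete, here is a minimal sketch of a Pareto-frontier check. It assumes the common dominance definition (an entry is dominated if another entry is at least as accurate and at least as cheap, and strictly better on one of the two); HAL's exact procedure may differ. The `Run` type, the function names, and the third entry (`hypothetical-agent`) are made up for illustration; only the two Online Mind2Web rows come from the table above.

```python
from typing import NamedTuple


class Run(NamedTuple):
    agent: str
    accuracy: float  # percent
    cost: float      # USD


def dominates(a: Run, b: Run) -> bool:
    """True if `a` is no worse than `b` on both axes and strictly better on one."""
    no_worse = a.accuracy >= b.accuracy and a.cost <= b.cost
    strictly_better = a.accuracy > b.accuracy or a.cost < b.cost
    return no_worse and strictly_better


def pareto_frontier(runs: list[Run]) -> list[Run]:
    """Keep only the runs that no other run dominates."""
    return [r for r in runs if not any(dominates(o, r) for o in runs)]


mind2web = [
    Run("Browser-Use", 39.33, 1609.92),  # from the table above
    Run("SeeAct", 36.67, 326.41),        # from the table above
]
# Hypothetical competitor, NOT a real result: more accurate and cheaper than both.
competitor = Run("hypothetical-agent", 40.00, 300.00)

alone = [r.agent for r in pareto_frontier(mind2web)]
with_competitor = [r.agent for r in pareto_frontier(mind2web + [competitor])]
print(alone)
print(with_competitor)
```

Taken by themselves, the two table rows do not dominate each other (one is more accurate, the other cheaper), so a "No" in the table reflects comparison against other models' entries on the same benchmark, which this page does not list.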