Claude Opus 4 (May 2025)

Performance overview across all HAL benchmarks

Token Pricing

Input Tokens: $15 per 1M tokens

Output Tokens: $75 per 1M tokens
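
As a rough illustration of how the per-token prices above translate into a dollar amount, the sketch below multiplies token counts by the listed rates. The function name `estimate_cost_usd` and the token counts in the usage line are hypothetical and are not taken from any HAL run; the sketch also assumes flat pricing with no caching or batch discounts.

```python
# Listed prices for this model, in USD per 1M tokens (from the page above).
INPUT_PRICE_PER_M = 15.0
OUTPUT_PRICE_PER_M = 75.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in USD for a given number of input and output tokens.

    Hypothetical helper: assumes flat per-token pricing with no discounts.
    """
    input_cost = input_tokens * INPUT_PRICE_PER_M / 1_000_000
    output_cost = output_tokens * OUTPUT_PRICE_PER_M / 1_000_000
    return input_cost + output_cost


# Hypothetical usage: 2M input tokens and 100k output tokens.
example_cost = estimate_cost_usd(2_000_000, 100_000)
print(f"${example_cost:.2f}")  # $37.50
```

Because output tokens cost five times as much as input tokens, an agent that generates long outputs can cost noticeably more than one that mostly reads context.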

Benchmark Performance

The "On the Pareto Frontier?" column indicates whether this model achieved a Pareto-optimal trade-off between accuracy and cost on that benchmark. A result is on the Pareto frontier when no other evaluated result on the same benchmark is both cheaper and more accurate.
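
A minimal sketch of how a Pareto frontier over accuracy and cost could be computed is shown below. The `Run` type, the `pareto_frontier` function, and all data points are hypothetical illustrations, not HAL results, and the exact tie-breaking rule HAL uses is an assumption here.

```python
from typing import List, NamedTuple


class Run(NamedTuple):
    """One agent/model result on a benchmark (hypothetical structure)."""
    name: str
    accuracy: float  # percent, higher is better
    cost: float      # USD, lower is better


def pareto_frontier(runs: List[Run]) -> List[Run]:
    """Return the runs that are not dominated by any other run.

    Assumed rule: run `o` dominates run `r` if `o` is at least as accurate
    and at least as cheap, and strictly better on at least one of the two.
    """
    frontier = []
    for r in runs:
        dominated = any(
            o.accuracy >= r.accuracy
            and o.cost <= r.cost
            and (o.accuracy > r.accuracy or o.cost < r.cost)
            for o in runs
        )
        if not dominated:
            frontier.append(r)
    return frontier


# Hypothetical data, for illustration only.
runs = [
    Run("agent-a", 60.0, 500.0),
    Run("agent-b", 55.0, 900.0),  # dominated by agent-a
    Run("agent-c", 40.0, 100.0),
    Run("agent-d", 40.0, 150.0),  # dominated by agent-c
]
frontier_names = [r.name for r in pareto_frontier(runs)]
print(frontier_names)  # ['agent-a', 'agent-c']
```

In this sketch a run can be expensive and still sit on the frontier, as long as nothing cheaper matches its accuracy; a "No" in the table therefore means some other result on that benchmark was at least as good on both axes.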

Benchmark	Agent	Accuracy	Cost	On the Pareto Frontier?
Gaia	HF Open Deep Research	57.58%	$1686.07	No
Gaia	HAL Generalist Agent	30.30%	$272.76	No
Swebench Verified Mini	SWE-Agent	50.00%	$1330.90	No
Swebench Verified Mini	HAL Generalist Agent	34.00%	$382.39	No
Taubench Airline	HAL Generalist Agent	44.00%	$150.15	No