DeepSeek R1 (May 2025)

Performance overview across all HAL benchmarks

Token Pricing

Input Tokens: $0.55 per 1M tokens

Output Tokens: $2.19 per 1M tokens
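As a rough illustration of how the per-token prices above translate into a run cost, the sketch below multiplies token counts by the listed rates. The function name `run_cost` and the token counts in the usage line are hypothetical, and the leaderboard's own cost accounting is not described here and may differ.

```python
# Prices taken from the Token Pricing section above (USD per 1M tokens).
INPUT_PRICE_PER_M = 0.55
OUTPUT_PRICE_PER_M = 2.19


def run_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a run from its token counts.

    Hypothetical helper: assumes cost is linear in tokens with no
    caching discounts or other adjustments.
    """
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical token counts, for illustration only.
    print(f"${run_cost(2_000_000, 500_000):.2f}")
```

Under these assumptions, output tokens cost roughly four times as much as input tokens, so agents that generate long reasoning traces are priced mainly by their output volume.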

Benchmark Performance

The "On the Pareto Frontier?" column indicates whether this model achieved a Pareto-optimal trade-off between accuracy and cost on that benchmark. Models on the Pareto frontier represent the current state of the art in cost efficiency for their accuracy level.
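The idea of a Pareto-optimal trade-off between accuracy and cost can be sketched in a few lines. This is a minimal illustration using the standard dominance definition (a result is on the frontier if no other result is at least as accurate and at least as cheap, and strictly better in one of the two); the exact method the leaderboard uses is not stated here and may differ. The `Result` type, the function names, and the sample data are all hypothetical.

```python
from typing import List, NamedTuple


class Result(NamedTuple):
    name: str
    accuracy: float  # percent, higher is better
    cost: float      # USD, lower is better


def dominates(a: Result, b: Result) -> bool:
    """True if a is at least as good as b on both axes and strictly better on one."""
    no_worse = a.accuracy >= b.accuracy and a.cost <= b.cost
    strictly_better = a.accuracy > b.accuracy or a.cost < b.cost
    return no_worse and strictly_better


def pareto_frontier(results: List[Result]) -> List[Result]:
    """Return the results that no other result dominates."""
    return [r for r in results if not any(dominates(o, r) for o in results)]


if __name__ == "__main__":
    # Hypothetical results for one benchmark, not leaderboard data.
    sample = [
        Result("A", 40.0, 10.0),
        Result("B", 30.0, 5.0),
        Result("C", 30.0, 20.0),  # dominated by B: same accuracy, higher cost
        Result("D", 10.0, 1.0),
    ]
    print([r.name for r in pareto_frontier(sample)])
```

In this sample, C falls off the frontier because B reaches the same accuracy for less money, while D stays on it despite low accuracy because nothing is cheaper.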

Benchmark	Agent	Accuracy	Cost	On the Pareto Frontier?
Assistantbench	Browser-Use	8.75%	$18.18	No
Corebench Hard	HAL Generalist Agent	8.89%	$7.77	No
Scicode	Scicode Zero Shot Agent	0.00%	$2.19	No
Scicode	Scicode Tool Calling Agent	0.00%	$57.62	No