DeepSeek R1 (May 2025)

Performance overview across all HAL benchmarks

Benchmarks: 3
Agents: 4
Pareto Optimal Benchmarks: 0

Token Pricing

Input Tokens: $0.55 per 1M tokens
Output Tokens: $2.19 per 1M tokens
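
To make the per-token prices concrete, here is a minimal sketch of how a run's cost could be estimated from token counts. The function name and the token counts in the example are hypothetical, and the sketch assumes cost is simply linear in tokens at the listed rates; the leaderboard's own cost accounting may include details not shown on this page.

```python
# Listed prices for this model, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.55
OUTPUT_PRICE_PER_M = 2.19


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate cost in USD, assuming purely linear per-token pricing."""
    input_cost = (input_tokens / 1_000_000) * INPUT_PRICE_PER_M
    output_cost = (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


# Hypothetical run: 2M input tokens and 500K output tokens.
example_cost = estimate_cost(2_000_000, 500_000)
print(f"${example_cost:.3f}")
```

Because output tokens cost roughly four times as much as input tokens at these rates, agents that generate long outputs can dominate the bill even with modest prompts.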

Benchmark Performance

The "On the Pareto Frontier?" column indicates whether this model achieved a Pareto-optimal trade-off between accuracy and cost on that benchmark, that is, whether no other entry on the benchmark is both more accurate and cheaper. Models on the Pareto frontier represent the current state-of-the-art efficiency for their performance level.
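
The idea of a Pareto-optimal trade-off can be sketched in a few lines. The example below uses the standard dominance definition (one run dominates another if it is at least as accurate and at least as cheap, and strictly better on one of the two); the `Run` type, the `pareto_frontier` function, and the sample data are all hypothetical, and the leaderboard's exact procedure is not described on this page.

```python
from typing import List, NamedTuple


class Run(NamedTuple):
    agent: str
    accuracy: float  # percent, higher is better
    cost: float      # USD, lower is better


def dominates(a: Run, b: Run) -> bool:
    """True if a is no worse than b on both axes and strictly better on one."""
    no_worse = a.accuracy >= b.accuracy and a.cost <= b.cost
    strictly_better = a.accuracy > b.accuracy or a.cost < b.cost
    return no_worse and strictly_better


def pareto_frontier(runs: List[Run]) -> List[Run]:
    """Return the runs that no other run dominates, in input order."""
    return [r for r in runs if not any(dominates(o, r) for o in runs)]


# Hypothetical data, not taken from the leaderboard.
runs = [
    Run("agent_a", accuracy=40.0, cost=10.0),
    Run("agent_b", accuracy=35.0, cost=5.0),
    Run("agent_c", accuracy=30.0, cost=8.0),   # dominated by agent_b
    Run("agent_d", accuracy=40.0, cost=12.0),  # dominated by agent_a
]

for run in pareto_frontier(runs):
    print(run.agent, run.accuracy, run.cost)
```

In this sample, agent_a and agent_b remain on the frontier: neither is beaten on both accuracy and cost by any other run. A "No" in the column corresponds to a run like agent_c or agent_d, for which some other entry is at least as good on both axes.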

Benchmark        Agent                        Accuracy   Cost     On the Pareto Frontier?
Assistantbench   Browser-Use                  8.75%      $18.18   No
Corebench Hard   HAL Generalist Agent         8.89%      $7.77    No
Scicode          Scicode Zero Shot Agent      0.00%      $2.19    No
Scicode          Scicode Tool Calling Agent   0.00%      $57.62   No