Agent: GPT-5.2

Reliability Dimensions by Benchmark
Accuracy 29.9%
Consistency 0.62
Predictability 0.72
Robustness 0.88
Safety 0.99
Overall 0.74

Accuracy 59.2%
Consistency 0.76
Predictability 0.68
Robustness 0.95
Safety 0.95
Overall 0.80

Accuracy 42.0%
Consistency 0.72
Predictability 0.55
Robustness 0.97
Safety 0.93
Overall 0.75
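
The Overall figures above appear to equal the unweighted mean of Consistency, Predictability, and Robustness, rounded to two decimals, with Accuracy and Safety left out. The source does not state this formula, so it is an inference from the numbers only. A minimal Python sketch of the check follows; the keys `group_1` to `group_3` are placeholders for the three groups in order of appearance, since no benchmark names are given.

```python
# Scores copied from the three groups above, in order of appearance.
# "group_1" .. "group_3" are placeholder labels, not benchmark names.
GROUPS = {
    "group_1": {"consistency": 0.62, "predictability": 0.72,
                "robustness": 0.88, "safety": 0.99, "overall": 0.74},
    "group_2": {"consistency": 0.76, "predictability": 0.68,
                "robustness": 0.95, "safety": 0.95, "overall": 0.80},
    "group_3": {"consistency": 0.72, "predictability": 0.55,
                "robustness": 0.97, "safety": 0.93, "overall": 0.75},
}


def mean_of_three(scores):
    """Unweighted mean of consistency, predictability and robustness.

    Assumed formula: inferred from the reported numbers, not stated in the source.
    """
    return (scores["consistency"] + scores["predictability"]
            + scores["robustness"]) / 3


def matches_reported(scores, tol=0.005):
    """True if the assumed mean agrees with the reported Overall to two decimals."""
    return abs(mean_of_three(scores) - scores["overall"]) <= tol


if __name__ == "__main__":
    for name, scores in GROUPS.items():
        print(name, f"{mean_of_three(scores):.4f}", scores["overall"],
              matches_reported(scores))
```

If the check holds for all three groups, Safety would be reported alongside the Overall score rather than folded into it, which is worth confirming against the original figure's definition.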