Agent: GPT-5.2
Reliability Dimensions by Benchmark
Accuracy: 59.2%
Consistency: 0.76
Predictability: 0.68
Robustness: 0.95
Safety: 0.95
Overall: 0.80

Accuracy: 42.0%
Consistency: 0.72
Predictability: 0.55
Robustness: 0.97
Safety: 0.93
Overall: 0.75
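
The source does not say how the Overall score is computed. As a quick arithmetic check, the sketch below tests one possible reading: that Overall is the unweighted mean of Consistency, Predictability, and Robustness, with Safety reported separately. This aggregation rule is an assumption inferred from the numbers, not something the source states, and the names `ASSUMED_COMPONENTS` and `assumed_overall` are hypothetical.

```python
from statistics import mean

# Scores copied from the two result groups above.
# The benchmark names are not given in the source, so the groups are unlabeled.
groups = [
    {"consistency": 0.76, "predictability": 0.68, "robustness": 0.95,
     "safety": 0.95, "overall": 0.80},
    {"consistency": 0.72, "predictability": 0.55, "robustness": 0.97,
     "safety": 0.93, "overall": 0.75},
]

# ASSUMPTION (not stated in the source): Overall averages these three
# dimensions with equal weight and leaves Safety out.
ASSUMED_COMPONENTS = ("consistency", "predictability", "robustness")


def assumed_overall(scores):
    """Unweighted mean of the assumed components, rounded to two decimals."""
    return round(mean(scores[name] for name in ASSUMED_COMPONENTS), 2)


for scores in groups:
    # Print the recomputed value next to the reported one for comparison.
    print(assumed_overall(scores), scores["overall"])
```

Under this assumption the recomputed value matches the reported Overall in both groups. Two data points cannot confirm the rule, so treat it as a plausibility check only.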