Agent: GPT-5.2 (xhigh)

Reliability Dimensions by Benchmark
Accuracy 67.7%
Consistency 0.70
Predictability 0.78
Robustness 0.96
Safety 0.95
Reliability 0.81

Accuracy 51.6%
Consistency 0.67
Predictability 0.65
Robustness 0.95
Safety 0.94
Reliability 0.76