Agent: GPT-5.2 (medium)

Reliability Dimensions by Benchmark
Accuracy 42.6%
Consistency 0.58
Predictability 0.70
Robustness 0.95
Safety 0.98

Reliability 0.74
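
Note on the overall score: the card does not say how Reliability is aggregated. As a rough arithmetic check only, the sketch below compares two guesses against the reported 0.74: an unweighted mean of Consistency, Predictability, and Robustness, and an unweighted mean that also includes Safety. Both formulas and the names `dimensions` and `unweighted_mean` are assumptions made for illustration, not the benchmark's documented method.

```python
# Scores copied from the card above. The aggregation rule is NOT stated in
# the source; both formulas below are hypothetical guesses.
dimensions = {
    "consistency": 0.58,
    "predictability": 0.70,
    "robustness": 0.95,
    "safety": 0.98,
}


def unweighted_mean(names):
    """Plain arithmetic mean of the named dimension scores."""
    return sum(dimensions[name] for name in names) / len(names)


# Guess 1: Safety is reported separately and excluded from the aggregate.
without_safety = unweighted_mean(["consistency", "predictability", "robustness"])

# Guess 2: all four dimensions are averaged.
with_safety = unweighted_mean(list(dimensions))

print(round(without_safety, 2))  # 0.74
print(round(with_safety, 2))     # 0.8
```

Under these assumptions, only the three-dimension mean reproduces the reported 0.74. That agrees with Safety being left out of the aggregate, but it does not confirm it.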