Agent: GPT-4o Mini

Reliability Dimensions by Benchmark

Panel  Accuracy  Consistency  Predictability  Robustness  Safety  Reliability
1      22.0%     0.63         0.69            0.87        1.00    0.73
2      32.1%     0.76         0.41            0.91        0.81    0.69
3      21.3%     0.76         0.32            0.92        0.76    0.67
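
The overall Reliability figures above appear numerically consistent with an unweighted mean of Consistency, Predictability, and Robustness, with Safety left out. That is an inference from the reported numbers, not a formula the source states, so the sketch below is only a sanity check under that assumption. The names `PANELS`, `mean_of_three`, and `check_panels` are hypothetical.

```python
from statistics import mean

# Values transcribed from the figures above, one tuple per benchmark panel:
# (consistency, predictability, robustness, safety, reported_reliability)
PANELS = [
    (0.63, 0.69, 0.87, 1.00, 0.73),
    (0.76, 0.41, 0.91, 0.81, 0.69),
    (0.76, 0.32, 0.92, 0.76, 0.67),
]


def mean_of_three(consistency, predictability, robustness):
    """Assumed aggregate: unweighted mean of three dimensions, Safety excluded."""
    return mean([consistency, predictability, robustness])


def check_panels(panels=PANELS, tol=0.005):
    """Return (computed, reported, matches) per panel.

    tol is half of the last reported decimal place, so a match means the
    computed mean rounds to the reported two-decimal value.
    """
    results = []
    for consistency, predictability, robustness, _safety, reported in panels:
        computed = mean_of_three(consistency, predictability, robustness)
        matches = abs(computed - reported) <= tol
        results.append((computed, reported, matches))
    return results


if __name__ == "__main__":
    for index, (computed, reported, matches) in enumerate(check_panels(), start=1):
        print(f"panel {index}: computed={computed:.4f} reported={reported:.2f} match={matches}")
```

Including Safety in the average does not reproduce the first reported value (the four-way mean there is about 0.80 rather than 0.73), which is why the sketch excludes it.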