🤖 Agent Evaluation Runner (OpenAI)
Uses GPT-4o-mini: fast and cheap, with no quota issues.
🔍 Test OpenAI Key First
API Key Status
Sign in with Hugging Face
🚀 Run Evaluation & Submit
Status
Results