---
title: PoCLeaderboard
emoji: 🏆
colorFrom: green
colorTo: pink
sdk: gradio
sdk_version: 5.4.0
app_file: app.py
pinned: false
license: mit
short_description: Example Leaderboard
---
This Space provides an interactive leaderboard for comparing language model performance across various benchmarks and custom tasks.
## Features
- Automated model evaluation using lm-evaluation-harness
- Support for standard and custom benchmarks
- Interactive visualization of results
- Daily automated evaluations
- Easy submission of new models and custom tasks
## Usage
1. Visit the Space to view the current leaderboard
2. Submit new models for evaluation
3. Create custom evaluation tasks
4. Track performance trends over time
## Custom Task Format
```json
{
  "examples": [
    {
      "input": "question or prompt",
      "ideal": "expected answer",
      "metrics": ["accuracy", "f1"]
    }
  ]
}
```
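Before submitting a custom task, it can help to check the file against the format above. The following is a minimal sketch, not the Space's own validation logic: the `validate_task` helper is hypothetical, and it assumes that `input` and `ideal` must be non-empty strings and that only the metric names shown in the example (`accuracy`, `f1`) are accepted.

```python
import json

# Assumption: only the metric names that appear in the example above are accepted.
ALLOWED_METRICS = {"accuracy", "f1"}


def validate_task(task):
    """Return a list of problems found in a custom task; an empty list means it looks valid."""
    if not isinstance(task, dict):
        return ["task must be a JSON object"]
    examples = task.get("examples")
    if not isinstance(examples, list) or not examples:
        return ["'examples' must be a non-empty list"]

    problems = []
    for i, ex in enumerate(examples):
        if not isinstance(ex, dict):
            problems.append(f"examples[{i}] must be an object")
            continue
        # Assumption: both text fields are required, non-empty strings.
        for key in ("input", "ideal"):
            value = ex.get(key)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"examples[{i}].{key} must be a non-empty string")
        metrics = ex.get("metrics")
        if not isinstance(metrics, list) or not metrics:
            problems.append(f"examples[{i}].metrics must be a non-empty list")
        else:
            unknown = [m for m in metrics
                       if not isinstance(m, str) or m not in ALLOWED_METRICS]
            if unknown:
                problems.append(f"examples[{i}].metrics has unknown entries: {unknown}")
    return problems


sample = json.loads("""
{
  "examples": [
    {
      "input": "question or prompt",
      "ideal": "expected answer",
      "metrics": ["accuracy", "f1"]
    }
  ]
}
""")

print(validate_task(sample))  # prints [] because the sample matches the format
```

Returning a list of problems rather than raising on the first one lets a submitter fix every issue in a single pass.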