	---
	title: PoCLeaderboard
	emoji: 🏆
	colorFrom: green
	colorTo: pink
	sdk: gradio
	sdk_version: 5.4.0
	app_file: app.py
	pinned: false
	license: mit
	short_description: Example Leaderboard
	---
	This Space provides an interactive leaderboard for comparing language model performance across various benchmarks and custom tasks.

	## Features
	- Automated model evaluation using lm-evaluation-harness
	- Support for standard and custom benchmarks
	- Interactive visualization of results
	- Daily automated evaluations
	- Easy submission of new models and custom tasks

	## Usage
	1. Visit the Space to view the current leaderboard
	2. Submit new models for evaluation
	3. Create custom evaluation tasks
	4. Track performance trends over time

	## Custom Task Format
	```json
	{
	  "examples": [
	    {
	      "input": "question or prompt",
	      "ideal": "expected answer",
	      "metrics": ["accuracy", "f1"]
	    }
	  ]
	}
	```
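
Before submitting a custom task, it can help to check the file against the format above. The following is a minimal sketch, not the Space's own validation code: the function name `validate_task` and the `KNOWN_METRICS` tuple are hypothetical, and the sketch assumes that `accuracy` and `f1` are the only accepted metric names, which this README does not confirm.

```python
import json

# Assumption: only the two metric names shown in the format example are accepted.
KNOWN_METRICS = ("accuracy", "f1")


def validate_task(task):
    """Return a list of problems found in a custom task; an empty list means valid."""
    if not isinstance(task, dict):
        return ["task must be a JSON object"]
    examples = task.get("examples")
    if not isinstance(examples, list) or not examples:
        return ["'examples' must be a non-empty list"]

    problems = []
    for i, example in enumerate(examples):
        if not isinstance(example, dict):
            problems.append(f"examples[{i}] must be an object")
            continue
        # "input" and "ideal" are both expected to be non-empty strings.
        for key in ("input", "ideal"):
            value = example.get(key)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"examples[{i}].{key} must be a non-empty string")
        metrics = example.get("metrics")
        if not isinstance(metrics, list) or not metrics:
            problems.append(f"examples[{i}].metrics must be a non-empty list")
        else:
            unknown = [m for m in metrics if m not in KNOWN_METRICS]
            if unknown:
                problems.append(f"examples[{i}].metrics has unknown entries: {unknown}")
    return problems


# Usage: parse the task JSON, then inspect the reported problems.
task = json.loads(
    '{"examples": [{"input": "question or prompt", '
    '"ideal": "expected answer", "metrics": ["accuracy", "f1"]}]}'
)
print(validate_task(task) or "task looks valid")
```

Returning a list of problems rather than raising on the first one lets a submitter fix every issue in a task file in a single pass.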