Spaces · The AI App Directory

GSMA Open-Telco LLM Benchmarks

Track, rank and evaluate Open Telecom LLMs and chatbots

LLM Healthcare Benchmarking

Evaluate medical AI models with datasets

LLVM APR Benchmark Leaderboard

Leaderboard for LLVM APR Benchmark

LingOly-TOO benchmark

Reasoning benchmark in linguistics

2D_profile Benchmark

Display competition info, datasets, leaderboards, rules, and submissions

2D_ElastoPlastoDynamics Benchmark

Fetch and display competition info and leaderboards

Fish Speech Benchmark

Unofficial benchmark by Fish Speech

Distributional RL Benchmark V2

Play Atari games with AI agents

Matcha Tts Onnx Benchmarks

Benchmark model load time and TTS time

2D_Multiscale_Hyperelasticity Benchmark

View and manage competition data

WebGPU Embedding Benchmark

Measure BERT model performance using WASM and WebGPU

Goodharts Law On Benchmarks

Compare LLM performance across benchmarks

Llm Calibration Benchmark

llm-calibration-benchmark

Premium Model Palindrome Benchmark

Generate palindromes and evaluate grammar across models

Food Weight Benchmark

Food detection and weight prediction benchmark

Mteb Human Benchmark

Manage and annotate your datasets

Polish Linguistic and Cultural Competency Benchmark

Display evaluation results on a leaderboard

PL-MTEB: Polish Massive Text Embedding Benchmark

Display evaluation results on a leaderboard

Billion Row Challenge _ NYC Taxi Data Processing Benchmark

Perform data preprocessing and benchmark different libraries

LitBench: A Graph-Centric Large Language Model Benchmarking Framework For Literature Tasks

Interact with scientific literature to generate abstracts, titles, and citations