Mandoline AI

company

https://mandoline.ai/

AI & ML interests

Evals

Organization Card

Mandoline

Mandoline helps developers evaluate and improve LLM applications in ways that matter to users.

Create custom metrics that align with your specific use case, evaluate LLM performance in real situations, and track improvements over time.

Documentation

Tutorials

Analysis & Insights

Leaderboards

Refusals

SDKs

Support

Spaces

Refusals Leaderboard

Refusals by GPT-4o, o1-mini, o1-preview, Claude 3.5 Sonnet

Models

None public yet

Datasets

None public yet