AI & ML interests

Evals

Mandoline

Mandoline helps developers evaluate and improve LLM applications in ways that matter to users.

Create custom metrics that align with your specific use case, evaluate LLM performance in real situations, and track improvements over time.
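To make the workflow above concrete, here is a minimal sketch in plain Python of the three steps: defining a custom metric, evaluating responses against it, and comparing scores across versions. It does not use the Mandoline SDK; every name in it (Metric, EvalLog, conciseness_score, the "v1"/"v2" labels) is a hypothetical stand-in, and the word-count scorer is a toy assumption chosen only to keep the example self-contained.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Callable, Dict, List, Tuple

# NOTE: all names here are hypothetical illustrations, not the Mandoline API.


@dataclass
class Metric:
    """A custom metric: a name, a description, and a scoring function."""
    name: str
    description: str
    scorer: Callable[[str, str], float]  # (prompt, response) -> score in [0, 1]


def conciseness_score(prompt: str, response: str) -> float:
    """Toy scorer: 1.0 up to 20 words, falling linearly to 0.0 at 100 words."""
    words = len(response.split())
    if words <= 20:
        return 1.0
    return max(0.0, 1.0 - (words - 20) / 80)


@dataclass
class EvalLog:
    """Stores the average score per application version so runs can be compared."""
    runs: Dict[str, float] = field(default_factory=dict)

    def evaluate(self, version: str, metric: Metric,
                 samples: List[Tuple[str, str]]) -> float:
        # Score every (prompt, response) pair and keep the average for this version.
        avg = mean(metric.scorer(p, r) for p, r in samples)
        self.runs[version] = avg
        return avg

    def improvement(self, before: str, after: str) -> float:
        # Positive values mean the later version scored higher.
        return self.runs[after] - self.runs[before]


conciseness = Metric(
    name="conciseness",
    description="Rewards answers that stay short.",
    scorer=conciseness_score,
)

log = EvalLog()
v1_samples = [("What is an eval?", " ".join(["word"] * 60))]  # 60-word answer
v2_samples = [("What is an eval?", "A repeatable test of model behavior.")]

log.evaluate("v1", conciseness, v1_samples)
log.evaluate("v2", conciseness, v2_samples)
print(log.improvement("v1", "v2"))
```

In a real setup the scorer would more likely be a model-graded or human-graded judgment than a word count, and the samples would come from actual user interactions; the structure of define, evaluate, and compare is the part this sketch is meant to show.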

Documentation

Tutorials

Analysis & Insights

Leaderboards

SDKs

Support
