A library for easily evaluating machine learning models and datasets.

With a single line of code, you get access to dozens of evaluation methods for different domains (NLP, Computer Vision, Reinforcement Learning, and more!). Whether on your local machine or in a distributed training setup, you can evaluate your models in a consistent and reproducible way! Each evaluation method comes with an interactive widget for trying it out directly in the browser and a documentation card that describes its use and limitations (see, for example, BLEU).
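
To make the "single line of code" concrete, here is a minimal sketch. It assumes the library described here is the Hugging Face `evaluate` package, where `evaluate.load("accuracy")` returns a metric object with a `compute(predictions=..., references=...)` method. The helper names `exact_match_accuracy` and `library_accuracy` are hypothetical and exist only for this illustration. The plain-Python function shows what the accuracy metric computes, so the sketch runs even without the package installed.

```python
def exact_match_accuracy(predictions, references):
    """Plain-Python reference: fraction of predictions equal to their reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        raise ValueError("need at least one prediction")
    matches = sum(p == r for p, r in zip(predictions, references))
    return matches / len(predictions)


def library_accuracy(predictions, references):
    """Same computation through the library (assumed: Hugging Face `evaluate`).

    Not called at import time: it needs the `evaluate` and `scikit-learn`
    packages and may need network access to fetch the metric on first use.
    """
    import evaluate  # imported lazily so the rest of the sketch runs without it

    metric = evaluate.load("accuracy")  # the "single line" that loads a metric
    result = metric.compute(predictions=predictions, references=references)
    return result["accuracy"]


# Toy labels: three of the four predictions match their reference.
predictions = [0, 1, 1, 0]
references = [0, 1, 0, 0]
score = exact_match_accuracy(predictions, references)
print(score)
```

With the package installed, `library_accuracy(predictions, references)` should give the same score as the plain-Python function for these toy labels. Other metrics load the same way by name, though each one documents its own expected inputs on its card.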