evaluate-metric (Evaluate Metric)

Organization Card

🤗 Evaluate provides access to a wide range of evaluation tools. It covers modalities such as text, computer vision, and audio, and includes tools for evaluating both models and datasets.

It supports three types of evaluation:

Metric: measures the performance of a model on a given dataset, usually by comparing the model's predictions to some ground truth labels -- these are covered in this space.
Comparison: compares the performance of two or more models on a single test dataset, e.g. by comparing their predictions to ground truth labels and computing their agreement -- covered in the Evaluate Comparison Spaces.
Measurement: gives insight into datasets and model predictions based on their properties and characteristics -- covered in the Evaluate Measurement Spaces.

All three types of evaluation supported by the 🤗 Evaluate library are meant to be mutually complementary, and help our community carry out more mindful and responsible evaluation!
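To make the distinction between the three evaluation types concrete, here is a minimal sketch in plain Python. It does not use the 🤗 Evaluate library: the function names (`accuracy`, `agreement`, `mean_word_count`) and the toy data are assumptions chosen for illustration, standing in for one metric, one comparison, and one measurement respectively.

```python
from typing import Sequence


def accuracy(predictions: Sequence[int], references: Sequence[int]) -> float:
    """Metric: fraction of predictions that match the ground truth labels."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if len(references) == 0:
        raise ValueError("at least one example is required")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


def agreement(predictions_a: Sequence[int], predictions_b: Sequence[int]) -> float:
    """Comparison: fraction of examples on which two models predict the same label."""
    if len(predictions_a) != len(predictions_b):
        raise ValueError("both models must be evaluated on the same examples")
    if len(predictions_a) == 0:
        raise ValueError("at least one example is required")
    same = sum(a == b for a, b in zip(predictions_a, predictions_b))
    return same / len(predictions_a)


def mean_word_count(texts: Sequence[str]) -> float:
    """Measurement: average number of whitespace-separated words per text."""
    if len(texts) == 0:
        raise ValueError("at least one text is required")
    return sum(len(t.split()) for t in texts) / len(texts)


# Hypothetical toy data: four test examples and two models' predictions.
references = [0, 1, 1, 0]
model_a = [0, 1, 0, 0]
model_b = [0, 1, 1, 1]
texts = ["a short example", "another", "two words", "one more short text"]

print("accuracy A:", accuracy(model_a, references))
print("accuracy B:", accuracy(model_b, references))
print("agreement A/B:", agreement(model_a, model_b))
print("mean words per text:", mean_word_count(texts))
```

In this toy case both models reach the same accuracy yet agree on only half of the examples, and the word-count measurement describes the data without involving any model. That is one way to see why the three types are described as complementary. In the library itself, modules of each type are normally obtained through `evaluate.load`; the hand-written functions above are stand-ins, not the library's implementations.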

models 0

None public yet

datasets 0

None public yet

spaces 54

BLEU

SQuAD v2

ROUGE

Frugalscore

sMAPE

SacreBLEU

Team members 5
