Metric: glue
Description
GLUE, the General Language Understanding Evaluation benchmark (https://gluebenchmark.com/), is a collection of resources for training, evaluating, and analyzing natural language understanding systems.
How to load this metric directly with the datasets library:
from datasets import load_metric
metric = load_metric("glue", "mrpc")  # GLUE requires a task configuration name, e.g. "mrpc"
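Once loaded, the metric is scored with its compute method. The sketch below assumes the "mrpc" configuration and toy binary labels purely for illustration; the keys in the returned dictionary (here accuracy and F1) depend on which GLUE task is selected.

from datasets import load_metric

# Load the GLUE metric for a specific task; "mrpc" is an example choice.
metric = load_metric("glue", "mrpc")

# Toy labels for illustration: MRPC is a binary classification task.
predictions = [0, 1, 1, 0]
references = [0, 1, 0, 0]

# Returns a dict of task-specific scores, e.g. {'accuracy': 0.75, 'f1': ...}
results = metric.compute(predictions=predictions, references=references)
print(results)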
Citation
@inproceedings{wang2019glue,
  title={{GLUE}: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding},
  author={Wang, Alex and Singh, Amanpreet and Michael, Julian and Hill, Felix and Levy, Omer and Bowman, Samuel R.},
  note={In the Proceedings of ICLR.},
  year={2019}
}