git+https://github.com/huggingface/evaluate@main
datasets>=2.0.0
torch>=2.0.0
torchmetrics
numpy