license: apache-2.0

Model Card for Model ID

This is a LoRA version of the ToolLLaMA model introduced in ToolBench.

Model Details

Model Description

  • License: apache-2.0
  • Finetuned from model: LLaMA-7b

Uses

Refer to ToolBench.

Training Details

Trained on the new version of the ToolBench data.

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric              | Value |
|---------------------|-------|
| Avg.                | 43.42 |
| ARC (25-shot)       | 52.99 |
| HellaSwag (10-shot) | 78.62 |
| MMLU (5-shot)       | 46.87 |
| TruthfulQA (0-shot) | 38.67 |
| Winogrande (5-shot) | 74.35 |
| GSM8K (5-shot)      | 6.82  |
| DROP (3-shot)       | 5.61  |
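
The Avg. value appears to be the unweighted mean of the seven benchmark scores listed above. A minimal sketch that checks this, using only the reported values (the names `scores` and `average` are illustrative and not part of ToolBench or the leaderboard tooling):

```python
# Benchmark scores copied from the evaluation results above.
scores = {
    "ARC (25-shot)": 52.99,
    "HellaSwag (10-shot)": 78.62,
    "MMLU (5-shot)": 46.87,
    "TruthfulQA (0-shot)": 38.67,
    "Winogrande (5-shot)": 74.35,
    "GSM8K (5-shot)": 6.82,
    "DROP (3-shot)": 5.61,
}

# Assumption: the average is a plain (unweighted) mean, rounded to two decimals.
average = round(sum(scores.values()) / len(scores), 2)
print(f"Avg. {average}")
```

The result matches the reported 43.42, which is consistent with equal weighting across the seven tasks.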