gpt-Youtube
metadata
datasets:
  - breadlicker45/youtube-comments-180k
pipeline_tag: text-generation

This model was trained on 180K YouTube comments.

It was trained for 100k steps.

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 24.86 |
| ARC (25-shot) | 23.29 |
| HellaSwag (10-shot) | 26.34 |
| MMLU (5-shot) | 23.54 |
| TruthfulQA (0-shot) | 48.63 |
| Winogrande (5-shot) | 48.93 |
| GSM8K (5-shot) | 0.0 |
| DROP (3-shot) | 3.32 |
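
The Avg. row appears to be the unweighted mean of the seven benchmark scores. The sketch below checks that arithmetic; the dictionary name and the assumption of equal weighting are illustrative and not stated in the model card.

```python
# Benchmark scores copied from the results listed above.
# Assumption: "Avg." is a plain unweighted mean over all seven benchmarks.
scores = {
    "ARC (25-shot)": 23.29,
    "HellaSwag (10-shot)": 26.34,
    "MMLU (5-shot)": 23.54,
    "TruthfulQA (0-shot)": 48.63,
    "Winogrande (5-shot)": 48.93,
    "GSM8K (5-shot)": 0.0,
    "DROP (3-shot)": 3.32,
}

# Sum the scores and divide by the number of benchmarks.
average = sum(scores.values()) / len(scores)

print(f"{average:.2f}")  # prints 24.86
```

Under that assumption the computed mean matches the reported 24.86, with the near-zero GSM8K and DROP scores pulling the average well below the TruthfulQA and Winogrande results.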