---
language:
  - en
datasets:
  - natural_instructions
  - the_pile
  - cot
  - Muennighoff/P3
tags:
  - gpt
pipeline_tag: text-generation
inference:
  parameters:
    temperature: 0.1
widget:
  - text: >-
      Is this review positive or negative? Review: Best cast iron skillet you
      will ever buy. Answer:
    example_title: Sentiment analysis
  - text: 'Where is Zurich? Ans:'
    example_title: Question Answering
---

# GPT-JT-6B-v0

## Quick Start

```python
from transformers import pipeline

pipe = pipeline(model='togethercomputer/GPT-JT-6B-v0')
pipe("Where is Zurich? Ans:")
```
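The model is instruction-tuned, so prompts follow the instruction-plus-`Answer:` pattern shown in the widget examples. A minimal sketch of building such a prompt; `make_prompt` is a hypothetical helper for illustration, not part of the model's tooling, and the generation parameters in the comment are illustrative choices (temperature 0.1 matches the inference config above):

```python
# Hypothetical helper that formats an instruction-style prompt in the
# "<instruction> <input> Answer:" shape used by the widget examples.
def make_prompt(instruction: str, text: str, answer_prefix: str = "Answer:") -> str:
    return f"{instruction} {text} {answer_prefix}"

prompt = make_prompt(
    "Is this review positive or negative?",
    "Review: Best cast iron skillet you will ever buy.",
)
print(prompt)

# The prompt can then be passed to the pipeline, e.g.:
# pipe(prompt, do_sample=True, temperature=0.1, max_new_tokens=5)
```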

## Open LLM Leaderboard Evaluation Results

Detailed results can be found here.

| Metric              | Value |
|---------------------|------:|
| Avg.                | 38.37 |
| ARC (25-shot)       | 42.06 |
| HellaSwag (10-shot) | 67.96 |
| MMLU (5-shot)       | 49.34 |
| TruthfulQA (0-shot) | 38.89 |
| Winogrande (5-shot) | 64.8  |
| GSM8K (5-shot)      | 1.21  |
| DROP (3-shot)       | 4.31  |