---
language:
- en
pipeline_tag: text-generation
library_name: transformers
tags:
- cerebras
- LLM
inference: false
---
# Instruction-tuned Cerebras-GPT 111M

The smallest of the Cerebras-GPT models, with only 111M parameters, instruction fine-tuned.
## Model Description

An instruction fine-tuned version of [cerebras/Cerebras-GPT-111M](https://huggingface.co/cerebras/Cerebras-GPT-111M).
## Evaluation

The model has been evaluated with Hugging Face's [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard); see the leaderboard for more details. The instruction fine-tuned model improves on the Cerebras base model by about 5.7% relative in average score:
| Model | Average | ARC (25-shot) | HellaSwag (10-shot) | MMLU (5-shot) | TruthfulQA (0-shot) |
|---|---|---|---|---|---|
| SebastianSchramm/Cerebras-GPT-111M-instruction | 31.6 | 24.3 | 26.2 | 26.5 | 49.5 |
| cerebras/Cerebras-GPT-111M | 29.9 | 20.0 | 26.7 | 26.7 | 46.3 |
## Training data

The model was fine-tuned on the following data: alpaca_gpt4_data (instruction-following data generated by GPT-4 from Alpaca prompts for fine-tuning LLMs) and alpaca_data_cleaned.
## Prompt template

Fine-tuning was performed with the prompt template from Stanford Alpaca:
```python
PROMPT_DICT = {
    "prompt_input": (
        "Below is an instruction that describes a task, paired with an input that provides further context. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:"
    ),
    "prompt_no_input": (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n{instruction}\n\n### Response:"
    ),
}
```
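
For example, the `prompt_input` variant can be filled with `str.format` (a minimal sketch; the instruction and input values below are illustrative, not from the training data):

```python
# Illustrative example of filling the Alpaca "prompt_input" template.
PROMPT_INPUT = (
    "Below is an instruction that describes a task, paired with an input that provides further context. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:"
)

prompt = PROMPT_INPUT.format(
    instruction="Summarize the following text.",
    input="Cerebras-GPT-111M is a 111M-parameter language model.",
)
print(prompt)
```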
## Usage

For best results at inference time, it is recommended to format the input according to the prompt template above.
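
A minimal inference sketch with the standard transformers API is shown below; the model id is taken from the evaluation table above, and the generation settings are illustrative rather than tuned values from this card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "SebastianSchramm/Cerebras-GPT-111M-instruction"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Format the input with the "prompt_no_input" template from above.
instruction = "Give three tips for staying healthy."
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    f"### Instruction:\n{instruction}\n\n### Response:"
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```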
## Open LLM Leaderboard Evaluation Results

Detailed results can be found here.
| Metric | Value |
|---|---|
| Avg. | 25.37 |
| ARC (25-shot) | 24.4 |
| HellaSwag (10-shot) | 26.05 |
| MMLU (5-shot) | 25.87 |
| TruthfulQA (0-shot) | 49.46 |
| Winogrande (5-shot) | 51.62 |
| GSM8K (5-shot) | 0.0 |
| DROP (3-shot) | 0.17 |