SebastianSchramm
/

Cerebras-GPT-111M-instruction


	---
	language:
	- en
	pipeline_tag: text-generation
	library_name: transformers
	tags:
	- cerebras
	- LLM
	inference: false
	base_model: cerebras/Cerebras-GPT-111M
	---

	# Instruction-tuned Cerebras GPT 111M

	The smallest of the [Cerebras-GPT models](https://huggingface.co/cerebras), with only 111M parameters, instruction fine-tuned.

	## Model Description

	An instruction fine-tuned version of [Cerebras-GPT-111M](https://huggingface.co/cerebras/Cerebras-GPT-111M).

	## Evaluation

	The model has been evaluated on Hugging Face's Open LLM Leaderboard. See the leaderboard for more details: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
	The instruction fine-tuned model improves on the Cerebras base model by about 5.7% in relative terms on the average score (31.6 vs. 29.9):

	Model | Average | ARC (25-shot) | HellaSwag (10-shot) | MMLU (5-shot) | TruthfulQA (0-shot)
	--- | --- | --- | --- | --- | ---
	SebastianSchramm/Cerebras-GPT-111M-instruction | 31.6 | 24.3 | 26.2 | 26.5 | 49.5
	cerebras/Cerebras-GPT-111M | 29.9 | 20 | 26.7 | 26.7 | 46.3
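The averages and the relative improvement can be re-derived from the per-benchmark scores in the table. The snippet below is a small sketch of that arithmetic; it assumes the "Average" column is the unweighted mean of the four benchmarks, and the variable names are illustrative only.

```python
# Per-benchmark scores copied from the table:
# ARC (25-shot), HellaSwag (10-shot), MMLU (5-shot), TruthfulQA (0-shot).
scores = {
    "SebastianSchramm/Cerebras-GPT-111M-instruction": [24.3, 26.2, 26.5, 49.5],
    "cerebras/Cerebras-GPT-111M": [20.0, 26.7, 26.7, 46.3],
}


def mean(values):
    """Unweighted mean (assumed to be how the Average column is computed)."""
    return sum(values) / len(values)


tuned_avg = mean(scores["SebastianSchramm/Cerebras-GPT-111M-instruction"])
base_avg = mean(scores["cerebras/Cerebras-GPT-111M"])

# Relative improvement of the fine-tuned model over the base model, in percent.
relative_gain = (tuned_avg - base_avg) / base_avg * 100

print(f"fine-tuned: {tuned_avg:.1f}, base: {base_avg:.1f}, "
      f"relative gain: {relative_gain:.1f}%")
```

Note that the gain is relative to the base score; in absolute terms the difference is about 1.7 points.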

	## Training data

	The model was fine-tuned with the following data: [alpaca_gpt4_data](https://github.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM/blob/main/data/alpaca_gpt4_data.json) (data generated by GPT-4 using Alpaca prompts for fine-tuning LLMs) and [alpaca_data_cleaned](https://github.com/tloen/alpaca-lora/blob/a3027fea37c2087b8b0131b21a4cd948bbdcd9e0/alpaca_data_cleaned.json).

	## Prompt template

	Fine-tuning was performed with the prompt template from [Stanford Alpaca](https://github.com/tatsu-lab/stanford_alpaca):

	```python
	PROMPT_DICT = {
	    "prompt_input": (
	        "Below is an instruction that describes a task, paired with an input that provides further context. "
	        "Write a response that appropriately completes the request.\n\n"
	        "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:"
	    ),
	    "prompt_no_input": (
	        "Below is an instruction that describes a task. "
	        "Write a response that appropriately completes the request.\n\n"
	        "### Instruction:\n{instruction}\n\n### Response:"
	    ),
	}
	```

	## Usage

	For best results, format inputs at inference time according to the prompt template above.
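As an illustration, the following is a minimal sketch of how the template might be applied at inference time with the `transformers` library. The helper names `build_prompt` and `generate`, the `max_new_tokens` default, and the greedy decoding settings are assumptions made for this example and are not part of the original training setup; the sketch also assumes the repository ships a tokenizer alongside the model weights.

```python
PROMPT_DICT = {
    "prompt_input": (
        "Below is an instruction that describes a task, paired with an input that provides further context. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:"
    ),
    "prompt_no_input": (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n{instruction}\n\n### Response:"
    ),
}

MODEL_ID = "SebastianSchramm/Cerebras-GPT-111M-instruction"


def build_prompt(instruction, input_text=""):
    """Fill the Alpaca-style template (hypothetical helper for this example)."""
    if input_text:
        return PROMPT_DICT["prompt_input"].format(
            instruction=instruction, input=input_text
        )
    return PROMPT_DICT["prompt_no_input"].format(instruction=instruction)


def generate(instruction, input_text="", max_new_tokens=64):
    """Generate a response for one instruction (hypothetical helper)."""
    # Imported lazily so that build_prompt can be used without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    prompt = build_prompt(instruction, input_text)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,  # assumed value, tune as needed
        pad_token_id=tokenizer.eos_token_id,
    )
    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

Calling `generate("Name three primary colors.")` downloads the model on first use and returns the text produced after `### Response:`. With only 111M parameters, output quality should be expected to be limited.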