Instruction-tuned Cerebras GPT 111M

An instruction fine-tuned version of the smallest Cerebras-GPT model, which has only 111M parameters.

Model Description

This model is an instruction fine-tuned version of cerebras/Cerebras-GPT-111M.

Evaluation

The model has been evaluated on Hugging Face's Open LLM Leaderboard. See the leaderboard for more details: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard

The instruction fine-tuned model improves on the Cerebras base model by about 1.7 points in average score, a relative gain of roughly 5.7%:

| Model | Average | ARC (25-shot) | HellaSwag (10-shot) | MMLU (5-shot) | TruthfulQA (0-shot) |
|---|---|---|---|---|---|
| SebastianSchramm/Cerebras-GPT-111M-instruction | 31.6 | 24.3 | 26.2 | 26.5 | 49.5 |
| cerebras/Cerebras-GPT-111M | 29.9 | 20 | 26.7 | 26.7 | 46.3 |
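
The average scores and the quoted improvement can be re-derived from the per-benchmark numbers. The sketch below assumes the leaderboard average is the unweighted mean of the four benchmarks, which is consistent with the reported values; the variable and function names are illustrative only.

```python
# Per-benchmark scores copied from the evaluation results above, in the order:
# ARC (25-shot), HellaSwag (10-shot), MMLU (5-shot), TruthfulQA (0-shot)
scores = {
    "SebastianSchramm/Cerebras-GPT-111M-instruction": [24.3, 26.2, 26.5, 49.5],
    "cerebras/Cerebras-GPT-111M": [20.0, 26.7, 26.7, 46.3],
}


def average(values):
    """Unweighted mean (assumed to match how the leaderboard averages)."""
    return sum(values) / len(values)


tuned_avg = average(scores["SebastianSchramm/Cerebras-GPT-111M-instruction"])
base_avg = average(scores["cerebras/Cerebras-GPT-111M"])

absolute_gain = tuned_avg - base_avg            # difference in points
relative_gain = 100 * absolute_gain / base_avg  # difference relative to the base model, in percent

print(f"fine-tuned average: {tuned_avg:.1f}")
print(f"base average:       {base_avg:.1f}")
print(f"absolute gain:      {absolute_gain:.1f} points")
print(f"relative gain:      {relative_gain:.1f}%")
```

Note that the gain comes from ARC and TruthfulQA; the HellaSwag and MMLU scores are slightly lower than those of the base model.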

Training data

The model was fine-tuned with the following data: alpaca_gpt4_data (data generated by GPT-4 using Alpaca prompts for fine-tuning LLMs) and alpaca_data_cleaned.

Prompt template

Fine-tuning was performed with the prompt template from Stanford Alpaca:

PROMPT_DICT = {
    "prompt_input": (
        "Below is an instruction that describes a task, paired with an input that provides further context. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:"
    ),
    "prompt_no_input": (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n{instruction}\n\n### Response:"
    ),
}

Usage

It is recommended to format input according to the prompt template mentioned above during inference for best results.
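
As an illustration of this recommendation, the following sketch fills the Alpaca template and, optionally, runs generation with the `transformers` library. The helper names `build_prompt` and `generate` and the generation settings are assumptions made for this example, not part of the original training code; `generate` downloads the model on first use and requires `transformers` and `torch` to be installed.

```python
# Same template as in the "Prompt template" section, repeated so the example is self-contained.
PROMPT_DICT = {
    "prompt_input": (
        "Below is an instruction that describes a task, paired with an input that provides further context. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:"
    ),
    "prompt_no_input": (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n{instruction}\n\n### Response:"
    ),
}


def build_prompt(instruction, input_text=""):
    """Hypothetical helper: pick the matching template and fill it in."""
    if input_text:
        return PROMPT_DICT["prompt_input"].format(instruction=instruction, input=input_text)
    return PROMPT_DICT["prompt_no_input"].format(instruction=instruction)


def generate(instruction, input_text="", max_new_tokens=64):
    """Hypothetical helper: run the model on a templated prompt (needs transformers and torch)."""
    # Imported lazily so that build_prompt works without these packages installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "SebastianSchramm/Cerebras-GPT-111M-instruction"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(build_prompt(instruction, input_text), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens so only the newly generated response is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    # Only builds and prints the prompt; call generate(...) to run the model.
    print(build_prompt("Name three primary colors."))
```

With only 111M parameters, the model's responses are likely to be limited in quality even with the correct template, so `max_new_tokens` and decoding settings may need experimentation.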
