
A Llama Chat Model of 160M Parameters

Recommended Prompt Format

<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{user_message}<|im_end|>
<|im_start|>assistant
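
This is the ChatML-style format. The tokenizer's chat template applies it automatically (as in the usage example below), but as a rough illustration it can also be assembled by hand; the helper name and example messages here are hypothetical:

```python
def build_chatml_prompt(messages):
    """Join role-tagged messages using the <|im_start|>/<|im_end|> delimiters."""
    prompt = ""
    for message in messages:
        prompt += f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
    # End with an open assistant turn so generation continues from there.
    prompt += "<|im_start|>assistant\n"
    return prompt

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
print(prompt)
```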

Recommended Inference Parameters

penalty_alpha: 0.5
top_k: 4
repetition_penalty: 1.01
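
In transformers, setting penalty_alpha together with top_k switches decoding to contrastive search, while repetition_penalty mildly discourages repeated tokens. A minimal sketch collecting these values as reusable generation kwargs (the dict name is an assumption):

```python
# Recommended decoding settings from this card, as a reusable kwargs dict.
generation_kwargs = {
    "penalty_alpha": 0.5,        # degeneration penalty; with top_k, enables contrastive search
    "top_k": 4,                  # number of candidate tokens considered per step
    "repetition_penalty": 1.01,  # slight extra penalty on previously generated tokens
}
```

These can then be unpacked into a pipeline or `model.generate` call, e.g. `generate(prompt, max_new_tokens=1024, **generation_kwargs)`.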

Usage Example

from transformers import pipeline

# Load the model and its tokenizer via the text-generation pipeline.
generate = pipeline("text-generation", "Felladrin/Llama-160M-Chat-v1")

messages = [
    {
        "role": "system",
        "content": "You are a helpful assistant who answers the user's questions with details and curiosity.",
    },
    {
        "role": "user",
        "content": "What are some potential applications for quantum computing?",
    },
]

# Render the messages into the recommended prompt format, leaving an
# open assistant turn for the model to complete.
prompt = generate.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# Generate with the recommended contrastive-search settings.
output = generate(
    prompt,
    max_new_tokens=1024,
    penalty_alpha=0.5,
    top_k=4,
    repetition_penalty=1.01,
)

print(output[0]["generated_text"])

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric                              Value
Avg.                                30.27
AI2 Reasoning Challenge (25-shot)   24.74
HellaSwag (10-shot)                 35.29
MMLU (5-shot)                       26.13
TruthfulQA (0-shot)                 44.16
Winogrande (5-shot)                 51.30
GSM8K (5-shot)                       0.00
Model size: 162M parameters (Safetensors, F32)
