Quantizations of https://huggingface.co/jondurbin/bagel-8b-v1.0

From the original README:

Prompt formatting

This model uses the llama-3-instruct prompt template, which is provided in the tokenizer config. You can use the apply_chat_template method to format prompts accurately, e.g.:

import transformers
tokenizer = transformers.AutoTokenizer.from_pretrained("jondurbin/bagel-8b-v1.0", trust_remote_code=True)
chat = [
  {"role": "system", "content": "You are Bob, a friendly AI assistant."},
  {"role": "user", "content": "Hello, how are you?"},
  {"role": "assistant", "content": "I'm doing great. How can I help you today?"},
  {"role": "user", "content": "I'd like to show off how chat templating works!"},
]
print(tokenizer.apply_chat_template(chat, tokenize=False))
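
To see roughly what the template produces without downloading the tokenizer, here is a minimal sketch that builds the standard Llama 3 instruct layout by hand. The helper name `render_llama3_prompt` is hypothetical, and the layout is assumed to match the stock Llama 3 instruct template; the template stored in the tokenizer config remains the source of truth and may differ in small details.

```python
# Hypothetical helper: approximates the stock Llama 3 instruct layout by hand.
# Assumption: this model's template matches that layout; verify against
# tokenizer.apply_chat_template before relying on it.
def render_llama3_prompt(chat, add_generation_prompt=True):
    parts = ["<|begin_of_text|>"]
    for message in chat:
        # Each turn: role header, blank line, trimmed content, end-of-turn token.
        parts.append(
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content'].strip()}<|eot_id|>"
        )
    if add_generation_prompt:
        # Open an assistant turn so the model continues with its reply.
        parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


chat = [
    {"role": "system", "content": "You are Bob, a friendly AI assistant."},
    {"role": "user", "content": "Hello, how are you?"},
    {"role": "assistant", "content": "I'm doing great. How can I help you today?"},
    {"role": "user", "content": "I'd like to show off how chat templating works!"},
]
prompt = render_llama3_prompt(chat)
print(prompt)
```

When generating a reply with the real tokenizer, passing `add_generation_prompt=True` to `apply_chat_template` appends the opening of the assistant turn in the same way.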

Format: GGUF
Model size: 8.03B params
Architecture: llama
Available quantizations: 1-bit, 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit