QuantFactory/FineLlama-3.1-8B-GGUF

This is a quantized version of mlabonne/FineLlama-3.1-8B created using llama.cpp
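
To run one of the GGUF files in this repo locally, here is a minimal sketch using the llama-cpp-python bindings together with huggingface_hub. Both libraries and the exact quant filename are assumptions (this card does not prescribe them); substitute whichever .gguf file you pick from the repo.

# pip install llama-cpp-python huggingface_hub
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Fetch one quant from this repo (the filename is an assumption --
# use any of the .gguf files listed under "Files and versions")
model_path = hf_hub_download(
    repo_id="QuantFactory/FineLlama-3.1-8B-GGUF",
    filename="FineLlama-3.1-8B.Q4_K_M.gguf",
)

# Load the quantized model and run a chat completion
llm = Llama(model_path=model_path, n_ctx=4096)
output = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is a large language model?"}],
    max_tokens=256,
    temperature=0.7,
)
print(output["choices"][0]["message"]["content"])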

Original Model Card

🍷 FineLlama-3.1-8B

This is a finetune of meta-llama/Meta-Llama-3.1-8B made for my article "Fine-tune Llama 3.1 Ultra-Efficiently with Unsloth".

It was trained on 100k super high-quality samples from the mlabonne/FineTome-100k dataset.

Try the demo: https://huggingface.co/spaces/mlabonne/FineLlama-3.1-8B

πŸ”Ž Applications

This model was made for educational purposes. I recommend using Meta's instruct model for real applications.

⚑ Quantization

πŸ† Evaluation

TBD.

πŸ’» Usage

!pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "mlabonne/FineLlama-3.1-8B"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Render the conversation with the model's chat template
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Build a text-generation pipeline in half precision, spread across available devices
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Sample up to 256 new tokens
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])

This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.
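
For reference, below is a minimal sketch of that recipe. It assumes Unsloth's 4-bit loader and the older TRL SFTTrainer keyword API; the LoRA settings and training hyperparameters are placeholder assumptions rather than the article's actual values, and the naive conversation flattening stands in for the proper Llama 3.1 chat template used in the article.

# pip install unsloth trl datasets
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

# Load the base model in 4-bit with Unsloth's patched loader
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="meta-llama/Meta-Llama-3.1-8B",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters (rank and target modules are assumptions)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# FineTome-100k stores ShareGPT-style "conversations"; flatten each one into
# a single training string (a crude stand-in for the article's chat template)
def to_text(row):
    return {"text": "\n".join(f"{m['from']}: {m['value']}"
                              for m in row["conversations"])}

dataset = load_dataset("mlabonne/FineTome-100k", split="train").map(to_text)

# Supervised fine-tuning on the flattened text column
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        learning_rate=2e-4,
        num_train_epochs=1,
        output_dir="outputs",
    ),
)
trainer.train()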

Model size: 8.03B params (llama architecture). GGUF quantizations are provided at 2-, 3-, 4-, 5-, 6-, and 8-bit precision.