🍷 Llama-3.2-Nemotron-3B-Instruct

This is a finetune of meta-llama/Llama-3.2-3B-Instruct (specifically, unsloth/Llama-3.2-3B-Instruct-bnb-4bit).

It was trained on the nvidia/HelpSteer2 dataset, similar to nvidia/Llama-3.1-Nemotron-70B-Instruct-HF, using Unsloth.

💻 Usage

!pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "itsnebulalol/Llama-3.2-Nemotron-3B-Instruct"
messages = [{"role": "user", "content": "How many r in strawberry?"}]

# Render the chat messages into a single prompt string using the model's chat template.
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Build a text-generation pipeline; device_map="auto" places the weights on the available device(s).
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Sample up to 256 new tokens and print the result.
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
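By default, the text-generation pipeline returns the prompt followed by the completion, so the printed text includes the rendered chat template. The sketch below shows one way to keep only the model's reply. The helper name `extract_reply` and the stand-in strings are hypothetical and not part of this model card; the helper assumes the generated text begins with the exact prompt string.

```python
def extract_reply(generated_text: str, prompt: str) -> str:
    """Return only the newly generated text.

    Assumes the pipeline output starts with the exact prompt string;
    if it does not, the text is returned unchanged (apart from stripping).
    """
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].strip()
    return generated_text.strip()


# Stand-in strings for illustration only; in practice, pass the real
# `prompt` and `outputs[0]["generated_text"]` from the usage example.
example_prompt = "[rendered chat template] How many r in strawberry?"
example_output = example_prompt + " There are three."
reply = extract_reply(example_output, example_prompt)
print(reply)
```

Alternatively, passing `return_full_text=False` to the pipeline call should have the same effect without post-processing.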

This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.

Safetensors · Model size: 3.21B params · Tensor type: BF16