Uploaded model

  • Developed by: RushabhShah122000
  • License: apache-2.0
  • Finetuned from model: unsloth/llama-3.2-3b-bnb-4bit

This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.

Code to run

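from llama_cpp import Llama

# Download the quantized GGUF file from the Hugging Face Hub and load it.
# from_pretrained requires the huggingface-hub package to be installed.
llm = Llama.from_pretrained(
    repo_id="RushabhShah122000/model",
    filename="unsloth.Q8_0.gguf",
)

# Generate up to 512 tokens; echo=True includes the prompt in the returned text.
output = llm(
    "Bedtime story",
    max_tokens=512,
    echo=True,
)
story_text = output['choices'][0]['text']
print(story_text)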
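
Because the call above sets echo=True, the returned text starts with the prompt itself. The following is a minimal sketch of how the generated continuation could be separated from the echoed prompt, and how to tell whether generation stopped at the max_tokens limit. The helper name split_echoed_completion and the sample response dict are hypothetical and exist only for illustration; the sketch assumes the response has the same choices/text layout used above, plus a finish_reason field on each choice.

```python
def split_echoed_completion(output: dict, prompt: str) -> tuple[str, bool]:
    """Return (generated_text, truncated) from a completion-style response dict.

    Hypothetical helper, not part of llama_cpp.
    """
    choice = output["choices"][0]
    text = choice["text"]
    # With echo=True the prompt is prepended to the returned text; drop it.
    if text.startswith(prompt):
        text = text[len(prompt):]
    # finish_reason == "length" means generation stopped at max_tokens.
    truncated = choice.get("finish_reason") == "length"
    return text.lstrip(), truncated


# Hand-written stand-in for a real response, for illustration only.
sample_output = {
    "choices": [
        {"text": "Bedtime story about a sleepy fox.", "finish_reason": "length"}
    ]
}
story, truncated = split_echoed_completion(sample_output, "Bedtime story")
print(story)
print(truncated)
```

With a real model, pass the output returned by llm(...) and the original prompt string instead of the sample dict. If truncated is true, raising max_tokens may let the story finish.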

Model details

  • Format: GGUF
  • Model size: 3.21B params
  • Architecture: llama
  • Quantization: 8-bit
