
Uploaded model

  • Developed by: suriya7
  • License: apache-2.0
  • Finetuned from model: unsloth/gemma-2b

This Gemma model was trained 2x faster with Unsloth and Hugging Face's TRL library.
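
The training script itself is not reproduced on this card. As a rough reference, the usual Unsloth + TRL supervised fine-tuning loop looks like the sketch below; the dataset, LoRA settings, and hyperparameters are illustrative assumptions, not the actual configuration used for this model.

# Illustrative sketch only: assumes the standard Unsloth + TRL SFT pattern
# (older trl API; newer trl versions pass these options via SFTConfig).
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

# Load the base model in 4-bit with Unsloth's patched loader
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gemma-2b",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters (rank and target modules are assumptions)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_alpha=16,
)

# Any instruction dataset works; alpaca-cleaned is a placeholder choice
dataset = load_dataset("yahma/alpaca-cleaned", split="train")

def formatting_prompts_func(examples):
    # Fill the Alpaca template and append EOS so the model learns to stop
    texts = [
        alpaca_prompt.format(ins, inp, out) + tokenizer.eos_token
        for ins, inp, out in zip(
            examples["instruction"], examples["input"], examples["output"]
        )
    ]
    return {"text": texts}

dataset = dataset.map(formatting_prompts_func, batched=True)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        num_train_epochs=1,
        learning_rate=2e-4,
        output_dir="outputs",
    ),
)
trainer.train()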

Requirements

pip install torch
pip install transformers

Inference in a Notebook

import torch

alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
from transformers import TextStreamer

tokenizer = AutoTokenizer.from_pretrained("suriya7/Gemma-2b-SFT")
model = AutoModelForCausalLM.from_pretrained("suriya7/Gemma-2b-SFT")
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
inputs = tokenizer(
    [
        alpaca_prompt.format(
            "You are an AI assistant. Please ensure that the answers conclude with an end-of-sequence (EOS) token.",  # instruction
            "how to cook pizza?",  # input goes here
            "",  # output - leave this blank for generation!
        )
    ],
    return_tensors="pt",
).to(device)

text_streamer = TextStreamer(tokenizer)
_ = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=250,
    do_sample=True,
    temperature=0.7,
    top_k=2,
    repetition_penalty=1.5,  # penalize repeated tokens
    eos_token_id=model.config.eos_token_id,
)
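
TextStreamer prints tokens to stdout as they are generated. If you would rather capture the response as a string, decode the generated ids and slice off the prompt tokens (standard transformers usage, nothing model-specific):

output_ids = model.generate(
    **inputs,
    max_new_tokens=250,
    do_sample=True,
    temperature=0.7,
    top_k=2,
    repetition_penalty=1.5,
    eos_token_id=model.config.eos_token_id,
)
# Keep only the tokens generated after the prompt
response = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1]:],
    skip_special_tokens=True,
)
print(response)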

Recommended Prompt Template

alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""