Indic-gemma-7b-finetuned-sft-Navarasa

This model is based on google/gemma-7b and has been LoRA-finetuned on instruction datasets in 9 Indian languages and English:

  1. Hindi - ravithejads/samvaad-hi-filtered, HydraIndicLM/hindi_alpaca_dolly_67k (sampled)

  2. Telugu - Telugu-LLM-Labs/yahma_alpaca_cleaned_telugu_filtered_and_romanized, Telugu-LLM-Labs/teknium_GPTeacher_general_instruct_telugu_filtered_and_romanized

  3. Tamil - abhinand/tamil-alpaca

  4. Kannada - Tensoic/airoboros-3.2_kn, Tensoic/gpt-teacher_kn

  5. Malayalam - VishnuPJ/Alpaca_Instruct_Malayalam

  6. Gujarati - Tensoic/Alpaca-Gujarati

  7. Punjabi - HydraIndicLM/punjabi_alpaca_52K

  8. Bengali - HydraIndicLM/bengali_alpaca_dolly_67k (alpaca filtered)

  9. Odia - OdiaGenAI/Odia_Alpaca_instructions_52k, OdiaGenAI/gpt-teacher-roleplay-odia-3k

  10. English - yahma/alpaca-cleaned

The model was finetuned with the unsloth library, and we provide inference code that uses it for faster inference. Alternatively, you can use the HuggingFace libraries (transformers and peft) for inference.

Training Details:

The model is trained on approx 500K instruction samples.

  1. GPU: 1 A100, 80GB
  2. Time: 36.5 Hours
  3. Platform: E2E Networks

Installation

!pip install "unsloth[colab-ampere] @git+https://github.com/unslothai/unsloth.git"

Input Text Format

### Instruction: {instruction}

### Input: {input}

### Response: {response}
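The format above can be wrapped in a small helper so that prompts are built the same way every time. This is a minimal sketch: the names `PROMPT_TEMPLATE` and `build_prompt` are hypothetical and not part of the model's codebase, and the template uses three hashes on the Response header, matching the inference snippets in this card.

```python
# Hypothetical helper (not from the original repository) that fills in the
# instruction/input/response layout used by the inference snippets.
PROMPT_TEMPLATE = """
### Instruction:
{}

### Input:
{}

### Response:
{}"""


def build_prompt(instruction: str, input_text: str = "") -> str:
    """Return a prompt with the response slot left empty for generation."""
    return PROMPT_TEMPLATE.format(instruction, input_text, "")


prompt = build_prompt(
    "Translate the following sentence to Hindi.",
    "This model is developed by Telugu LLM Labs",
)
print(prompt)
```

Leaving the response slot empty means the prompt ends right after the `### Response:` header, which is where the model is expected to continue.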

Inference With Unsloth

from unsloth import FastLanguageModel
import torch
max_seq_length = 2048
dtype = None # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+
load_in_4bit = False 
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "Telugu-LLM-Labs/Indic-gemma-7b-finetuned-sft-Navarasa",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    device_map="auto"
)
FastLanguageModel.for_inference(model) # Enable native 2x faster inference

input_prompt = """
### Instruction:
{}

### Input:
{}

### Response:
{}"""

input_text = input_prompt.format(
        "Translate the following sentence to Hindi.", # instruction
        "This model is developed by Telugu LLM Labs", # input
        "", # output - leave this blank for generation!
    )

inputs = tokenizer([input_text], return_tensors = "pt").to("cuda")

outputs = model.generate(**inputs, max_new_tokens = 300, use_cache = True)
response = tokenizer.batch_decode(outputs)[0] # batch_decode returns a list; take the single decoded string

Inference with HuggingFace

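import os

from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Hugging Face access token. Reading it from the HF_TOKEN environment
# variable is an assumption; supply your token however you prefer.
hf_token = os.environ.get("HF_TOKEN")

model = AutoPeftModelForCausalLM.from_pretrained(
    "Telugu-LLM-Labs/Indic-gemma-7b-finetuned-sft-Navarasa",
    load_in_4bit = False,
    token = hf_token
)
model = model.to("cuda") # the inputs below are moved to "cuda", so the model must be on the GPU too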
tokenizer = AutoTokenizer.from_pretrained("Telugu-LLM-Labs/Indic-gemma-7b-finetuned-sft-Navarasa")

input_prompt = """
### Instruction:
{}

### Input:
{}

### Response:
{}"""

input_text = input_prompt.format(
        "Translate the following sentence to Hindi.", # instruction
        "This model is developed by Telugu LLM Labs", # input
        "", # output - leave this blank for generation!
    )

inputs = tokenizer([input_text], return_tensors = "pt").to("cuda")

outputs = model.generate(**inputs, max_new_tokens = 300, use_cache = True)
response = tokenizer.batch_decode(outputs)[0]
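The decoded string contains the full prompt followed by the generated text, so the answer usually has to be cut out of it. The sketch below shows one way to do that. `extract_response` is a hypothetical helper, and the `<eos>` default is an assumption about how the Gemma tokenizer renders its end-of-sequence token when special tokens are not skipped during decoding.

```python
def extract_response(decoded: str,
                     marker: str = "### Response:",
                     eos_token: str = "<eos>") -> str:
    """Return the text after the last response marker, without a trailing EOS token."""
    # rpartition splits on the LAST occurrence of the marker; if the marker
    # is absent, `found` is an empty string.
    _, found, tail = decoded.rpartition(marker)
    if not found:
        # No marker in the text: return the stripped input unchanged.
        return decoded.strip()
    tail = tail.strip()
    if eos_token and tail.endswith(eos_token):
        tail = tail[: -len(eos_token)].rstrip()
    return tail


# Stand-in for a decoded model output (not real model output).
sample = "<bos>\n### Instruction:\nSay hello.\n\n### Input:\n\n\n### Response:\nhello<eos>"
answer = extract_response(sample)
print(answer)
```

With the snippets above, this would be called as `extract_response(response)`.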

Refer to the blog post for examples.

Please check our Code Repository for training and inference scripts.

Developers:

The model is a collaborative effort by Ravi Theja and Ramsri Goutham. Feel free to DM either of us if you have any questions.
