metadata

license: llama3
library_name: peft
base_model: unsloth/llama-3-8b-bnb-4bit
model-index:
  - name: Llama3_8B_Odia_Unsloth
    results: []

Llama3_8B_Odia_Unsloth

Llama3_8B_Odia_Unsloth is an 8-billion-parameter large language model for Odia, fine-tuned from Llama3. It was trained on a 171k-sample Odia instruction set that covers domain-specific knowledge and cultural nuances.

The model was fine-tuned with Unsloth, which speeds up training.

For more details about the model, data, training procedure, and evaluations, see the blog post.

Model Description

Model type: An 8B fine-tuned model
Primary Language(s): Odia and English
License: Llama3

Inference

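Sample inference script for a notebook environment such as Google Colab. It installs Unsloth, loads the model, and generates a response to an Odia instruction.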

%%capture
# Install Unsloth (%%capture is a cell magic and must be the first line of the cell)
import torch
major_version, minor_version = torch.cuda.get_device_capability()
# Must install separately since Colab has torch 2.2.1, which breaks packages
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
if major_version >= 8:
    # Newer GPUs: Ampere, Ada, Hopper (RTX 30xx, RTX 40xx, A100, H100, L40)
    !pip install --no-deps packaging ninja einops flash-attn xformers trl peft accelerate bitsandbytes
else:
    # Older GPUs (V100, Tesla T4, RTX 20xx)
    !pip install --no-deps xformers trl peft accelerate bitsandbytes


from unsloth import FastLanguageModel
import torch
max_seq_length = 2048
dtype = None  # None auto-detects; use torch.float16 on Tesla T4/V100, torch.bfloat16 on Ampere or newer
load_in_4bit = True  # 4-bit quantization reduces memory usage; can be set to False

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "OdiaGenAI-LLM/Llama3_8B_Odia_Unsloth",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)

alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

FastLanguageModel.for_inference(model)  # Switch the model to Unsloth's inference mode
inputs = tokenizer(
    [
        alpaca_prompt.format(
            "କୋଭିଡ୍ 19 ର ଲକ୍ଷଣଗୁଡ଼ିକ କ’ଣ?",  # instruction ("What are the symptoms of COVID-19?")
            "",  # input
            "",  # output: leave this blank for generation
        )
    ],
    return_tensors = "pt",
).to("cuda")

outputs = model.generate(**inputs, max_new_tokens = 512, use_cache = True)
tokenizer.batch_decode(outputs)
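
The decoded string returned by the script above contains the full Alpaca prompt followed by the model's answer, usually wrapped in special tokens. The sketch below shows one way to build the prompt and recover only the answer. The helper names `build_prompt` and `extract_response` are hypothetical (they are not part of Unsloth or this repository), and the code assumes that special tokens look like `<|...|>`, as Llama 3's `<|begin_of_text|>` and `<|end_of_text|>` do.

```python
import re

# Same template as in the inference script above.
ALPACA_PROMPT = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

RESPONSE_MARKER = "### Response:"
# Assumption: special tokens follow the <|name|> pattern used by Llama 3.
SPECIAL_TOKEN = re.compile(r"<\|[^|>]+\|>")


def build_prompt(instruction, input_text=""):
    """Fill the template, leaving the response slot empty for generation."""
    return ALPACA_PROMPT.format(instruction, input_text, "")


def extract_response(decoded):
    """Return the text after the first response marker, without special tokens."""
    _, found, tail = decoded.partition(RESPONSE_MARKER)
    if not found:
        # No marker present: fall back to the whole string.
        tail = decoded
    return SPECIAL_TOKEN.sub("", tail).strip()
```

With the script above, this would be used as `extract_response(tokenizer.batch_decode(outputs)[0])`. Passing `skip_special_tokens = True` to `tokenizer.batch_decode` is another way to drop the special tokens before extracting the answer.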

Citation Information

If you find this model useful, please consider giving it a 👏 and citing it:

@misc{Llama3_8B_Odia_Unsloth,
  author = {Shantipriya Parida and Sambit Sekhar and Debasish Dhal and Shakshi Panwar},
  title = {OdiaGenAI Releases Llama3 Fine-tuned Model for the Odia Language},
  year = {2024},
  publisher = {Hugging Face},
  journal = {Hugging Face repository},
  howpublished = {\url{https://huggingface.co/OdiaGenAI}},
}

Contributions

Dr. Shantipriya Parida
Sambit Sekhar
Debasish Dhal
Shakshi Panwar