Meta-Llama-3.1-8B-BNB-4BIT Fine-Tuned Model
This is a fine-tuned version of the unsloth/meta-llama-3.1-8b-bnb-4bit model, optimized for Banglish-to-Bangla transliteration. The model was trained using Unsloth and Hugging Face's TRL library; 4-bit quantization keeps memory usage low, and Unsloth's optimized kernels significantly speed up fine-tuning.
Model Summary
- Base Model: unsloth/meta-llama-3.1-8b-bnb-4bit
- Developed by: Tamim18
- License: Apache-2.0
- Languages: Bengali (bn), Banglish (Bengali written in Latin script)
- Tags:
  - Text Generation
  - Transliteration
  - Low-resource NLP
- Optimized with: Unsloth
Features
- 4-bit Quantization: The model uses 4-bit quantization to cut memory usage with minimal accuracy loss, making it suitable for resource-constrained environments (see the loading sketch after this list).
- Fast Training: Fine-tuned with Unsloth's RoPE scaling for extended sequence lengths and efficient GPU utilization.
- High Accuracy: Achieves strong performance on the Banglish-to-Bangla transliteration task.
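For context, here is a minimal sketch of how a bitsandbytes 4-bit model is typically loaded with transformers. The base checkpoint ships pre-quantized, so an explicit config is not strictly required; the nf4/bfloat16 settings shown are assumed common defaults, not confirmed settings of this model:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Assumed-typical 4-bit settings; the checkpoint's baked-in config may differ.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "unsloth/meta-llama-3.1-8b-bnb-4bit",
    quantization_config=bnb_config,
    device_map="auto",  # places quantized weights on the available GPU
)
```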
Model Details
Training Dataset
- Dataset: SKNahin/bengali-transliteration-data
- Task: Transliteration of Banglish (Bengali in Latin script) to Bengali.
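As an illustration, the dataset can be pulled from the Hub and formatted into the same prompt template used for inference below. The column names in this sketch are assumptions; check the dataset card for the actual schema:

```python
from datasets import load_dataset

# Pull the Banglish/Bengali transliteration pairs from the Hugging Face Hub.
dataset = load_dataset("SKNahin/bengali-transliteration-data", split="train")

# Column names 'banglish' and 'bengali' are assumed for illustration only.
def to_prompt(example):
    return {
        "text": (
            f"### Banglish Text:\n{example['banglish']}\n\n"
            f"### Bengali Translation:\n{example['bengali']}"
        )
    }

dataset = dataset.map(to_prompt)
```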
Training Pipeline
The model was fine-tuned using Hugging Face's Trainer and Unsloth's LoRA-based optimization:
- Sequence Length: 2048 tokens (supports long-context tasks).
- Batch Size: Adjusted with gradient accumulation for memory efficiency.
- Learning Rate: 2e-4 with a cosine decay scheduler.
- LoRA: Applied LoRA fine-tuning on key attention layers (q_proj, k_proj, etc.); see the training sketch after this list.
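The exact training script is not published; the following is a minimal sketch under the hyperparameters listed above, using Unsloth's FastLanguageModel with TRL's SFTTrainer (older-style TRL arguments; newer TRL versions move these into SFTConfig). The LoRA rank, batch size, accumulation steps, and epoch count are assumed values:

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments

# Load the 4-bit base model with Unsloth (sequence length from the list above).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/meta-llama-3.1-8b-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters to the attention projections; r=16 is an assumed rank,
# and v_proj/o_proj are assumed targets beyond the q_proj/k_proj named above.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_alpha=16,
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,          # formatted as in the dataset sketch above
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,   # assumed; effective batch grows via accumulation
        gradient_accumulation_steps=4,   # assumed
        learning_rate=2e-4,              # from the list above
        lr_scheduler_type="cosine",      # from the list above
        num_train_epochs=1,              # assumed
        output_dir="outputs",
    ),
)
trainer.train()
```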
How to Use
Load the Model
You can load this fine-tuned model directly using the transformers library:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Tamim18/meta-llama-3.1-8b-bnb-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# device_map="auto" places the 4-bit weights on the GPU; calling .to("cuda")
# on a bitsandbytes-quantized model raises an error.
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

# Example input
banglish_text = "apnar ki khobor?"
input_prompt = f"### Banglish Text:\n{banglish_text}\n\n### Bengali Translation:"
inputs = tokenizer(input_prompt, return_tensors="pt").to(model.device)

# Generate the Bengali transliteration with beam search
outputs = model.generate(inputs["input_ids"], max_new_tokens=50, num_beams=5)
translation = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(translation.split("### Bengali Translation:")[-1].strip())
```
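Beam search (num_beams=5) trades generation speed for more stable transliterations; for faster or more varied output, you can drop num_beams or enable sampling with do_sample=True.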