Meta-Llama-3.1-8B-BNB-4BIT Fine-Tuned Model
This is a fine-tuned version of the unsloth/meta-llama-3.1-8b-bnb-4bit model, optimized for Banglish-to-Bangla transliteration. The model was trained using Unsloth and Hugging Face's TRL library; 4-bit quantization keeps memory usage low, and Unsloth's optimized kernels significantly speed up fine-tuning.
Model Summary
- Base Model: unsloth/meta-llama-3.1-8b-bnb-4bit
- Developed by: Tamim18
- License: Apache-2.0
- Languages: Bengali (bn), Banglish (Bengali written in Latin script)
- Tags:
  - Text Generation
  - Transliteration
  - Low-resource NLP
- Optimized with: Unsloth
Features
- 4-bit Quantization: The model uses 4-bit quantization to cut memory usage with minimal accuracy loss, making it suitable for resource-constrained environments (see the loading sketch after this list).
- Fast Training: Fine-tuned with Unsloth's RoPE scaling for extended sequence lengths and efficient GPU utilization.
- High Accuracy: Achieves strong performance on the Banglish-to-Bangla transliteration task.
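For context, here is a minimal sketch of how a bitsandbytes 4-bit model is typically loaded with transformers. The base checkpoint ships pre-quantized, so an explicit config is not strictly required; the nf4/bfloat16 settings shown are assumed common defaults, not confirmed settings of this model:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Assumed-typical 4-bit settings; the checkpoint's baked-in config may differ.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "unsloth/meta-llama-3.1-8b-bnb-4bit",
    quantization_config=bnb_config,
    device_map="auto",  # places quantized weights on the available GPU
)
```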
Model Details
Training Dataset
- Dataset: SKNahin/bengali-transliteration-data
- Task: Transliteration of Banglish (Bengali in Latin script) to Bengali.
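As an illustration, the dataset can be pulled from the Hub and formatted into the same prompt template used for inference below. The column names in this sketch are assumptions; check the dataset card for the actual schema:

```python
from datasets import load_dataset

# Pull the Banglish/Bengali transliteration pairs from the Hugging Face Hub.
dataset = load_dataset("SKNahin/bengali-transliteration-data", split="train")

# Column names 'banglish' and 'bengali' are assumed for illustration only.
def to_prompt(example):
    return {
        "text": (
            f"### Banglish Text:\n{example['banglish']}\n\n"
            f"### Bengali Translation:\n{example['bengali']}"
        )
    }

dataset = dataset.map(to_prompt)
```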
Training Pipeline
The model was fine-tuned using Hugging Face's Trainer and Unsloth's LoRA-based optimization:
- Sequence Length: 2048 tokens (supports long-context tasks).
- Batch Size: Adjusted with gradient accumulation for memory efficiency.
- Learning Rate: 2e-4 with a cosine decay scheduler.
- LoRA: Applied LoRA fine-tuning on key attention layers (q_proj, k_proj, etc.); see the training sketch after this list.
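The exact training script is not published; the following is a minimal sketch under the hyperparameters listed above, using Unsloth's FastLanguageModel with TRL's SFTTrainer (older-style TRL arguments; newer TRL versions move these into SFTConfig). The LoRA rank, batch size, accumulation steps, and epoch count are assumed values:

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments

# Load the 4-bit base model with Unsloth (sequence length from the list above).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/meta-llama-3.1-8b-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters to the attention projections; r=16 is an assumed rank,
# and v_proj/o_proj are assumed targets beyond the q_proj/k_proj named above.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_alpha=16,
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,          # formatted as in the dataset sketch above
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,   # assumed; effective batch grows via accumulation
        gradient_accumulation_steps=4,   # assumed
        learning_rate=2e-4,              # from the list above
        lr_scheduler_type="cosine",      # from the list above
        num_train_epochs=1,              # assumed
        output_dir="outputs",
    ),
)
trainer.train()
```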
How to Use
Load the Model
You can load this fine-tuned model directly using the transformers library:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Tamim18/meta-llama-3.1-8b-bnb-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# device_map="auto" places the 4-bit weights on the GPU; calling .to("cuda")
# on a bitsandbytes-quantized model raises an error.
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

# Example input
banglish_text = "apnar ki khobor?"
input_prompt = f"### Banglish Text:\n{banglish_text}\n\n### Bengali Translation:"
inputs = tokenizer(input_prompt, return_tensors="pt").to(model.device)

# Generate the Bengali transliteration with beam search
outputs = model.generate(inputs["input_ids"], max_new_tokens=50, num_beams=5)
translation = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(translation.split("### Bengali Translation:")[-1].strip())
```
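Beam search (num_beams=5) trades generation speed for more stable transliterations; for faster or more varied output, you can drop num_beams or enable sampling with do_sample=True.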