Edit model card

You need to agree to share your contact information to access this model

This repository is publicly accessible, but you have to accept the conditions to access its files and content.

Log in or Sign Up to review the conditions and access this model content.

Translator_Eng_Tel_instruct

This model is a fine-tuned version of mistralai/Mistral-7B-Instruct-v0.2 on the generator dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2120

Model description

This model is a fine-tuned version of the Mistral 7B Instruct model aimed at translating English text to Telugu. It has been fine-tuned using the QLoRA 4-bit technique for instruction fine-tuning.

Intended uses & limitations

This model is intended for translating English text to Telugu. It is recommended to use this model in environments that require high-quality translations between these two languages.

Usage:

from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM

config = PeftConfig.from_pretrained("MRR24/Translator_Eng_Tel_instruct")
base_model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")
model = PeftModel.from_pretrained(base_model, "MRR24/Translator_Eng_Tel_instruct")

Training and evaluation data

The training dataset consists of 140k data points, while the testing dataset contains 16k data points. These datasets were meticulously curated to ensure the high-quality translation capability of the model.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0005
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 2

Training results

Training Loss Epoch Step Validation Loss
0.2549 0.9992 956 0.2554
0.2138 1.9984 1912 0.2120

Framework versions

  • PEFT 0.10.0
  • Transformers 4.40.1
  • Pytorch 2.1.1+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
Downloads last month
0
Unable to determine this model’s pipeline type. Check the docs .

Adapter for