# PL-BERT Fine-Tuned on Hindi Wikipedia Dataset
This model is a fine-tuned version of **PL-BERT**, trained on the Hindi subset of the Wiki40b dataset. It has been optimized to handle Hindi text well, making it suitable for a range of Hindi-language NLP tasks.
For more information about this model, check out the [GitHub](https://github.com/Ionio-io/PL-BERT-Fine-Tuned-hi-) repository.
## Model Overview
- **Model Name:** PL-BERT (Fine-tuned on Hindi)
- **Base Model:** PL-BERT (phoneme-level BERT)
- **Dataset:** Hindi subset from Wiki40b (51,000 cleaned Wikipedia articles)
- **Precision:** Mixed precision (FP16)
The fine-tuning process focused on improving the model's ability to handle Hindi text more effectively by leveraging a large, cleaned corpus of Wikipedia articles in Hindi.
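PL-BERT-style pretraining masks part of the input and trains the model to recover what was hidden, which is why the metrics below report separate token and vocabulary losses. As a rough illustration of the masking step (a minimal pure-Python sketch; the masking rate, mask symbol, and toy phoneme sequence are illustrative assumptions, not the exact PL-BERT recipe):

```python
import random

MASK = "[M]"  # placeholder mask symbol, not the model's real mask token

def mask_phonemes(phonemes, mask_prob=0.15, seed=0):
    """Randomly mask a fraction of tokens, BERT-style.

    Returns (masked_sequence, target_positions); the model's training
    objective is to recover phonemes[i] at every position in targets.
    mask_prob=0.15 is the standard BERT value, assumed here.
    """
    rng = random.Random(seed)
    masked, targets = [], []
    for i, p in enumerate(phonemes):
        if rng.random() < mask_prob:
            masked.append(MASK)
            targets.append(i)
        else:
            masked.append(p)
    return masked, targets

# Toy phonemized Hindi word ("namaste" -> n ə m ə s t eː)
seq = ["n", "ə", "m", "ə", "s", "t", "eː"]
masked, targets = mask_phonemes(seq, mask_prob=0.3, seed=1)
```

The masked sequence is fed to the encoder, and the loss is computed only at the masked positions.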
## Training Details
- **Model:** PL-BERT
- **Dataset:** Hindi subset from Wiki40b
- **Batch Size:** 64
- **Mixed Precision:** FP16
- **Optimizer:** AdamW
- **Training Steps:** 15,000
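AdamW, used above, decouples weight decay from the gradient-based update: the decay term scales the parameter directly instead of being added to the gradient. A minimal single-parameter sketch (the hyperparameter values here are illustrative defaults, not the values used for this run):

```python
import math

def adamw_step(theta, grad, m, v, t, lr=1e-4, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=0.01):
    """One AdamW update for a scalar parameter theta at step t (1-indexed)."""
    m = beta1 * m + (1 - beta1) * grad        # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)              # bias correction
    v_hat = v / (1 - beta2 ** t)
    # Decoupled weight decay: weight_decay * theta is applied outside
    # the adaptive gradient term (the difference from Adam + L2).
    theta = theta - lr * (m_hat / (math.sqrt(v_hat) + eps) + weight_decay * theta)
    return theta, m, v

theta, m, v = 1.0, 0.0, 0.0
for t in range(1, 4):  # three toy steps with a constant positive gradient
    theta, m, v = adamw_step(theta, grad=0.5, m=m, v=v, t=t)
```

With a constant positive gradient the parameter moves steadily downward, and the decay term additionally shrinks it toward zero each step.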
### Training Progress
- **Final Loss:** 1.879
- **Vocabulary Loss:** 0.49
- **Token Loss:** 1.465
### Validation Results
During training, we monitored performance with validation metrics:
- **Validation Loss:** 1.879
- **Vocabulary Accuracy:** 78.54%
- **Token Accuracy:** 82.30%
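Both accuracy figures are simply the fraction of evaluated positions whose predicted id matches the reference id. A minimal sketch of such a metric (the card does not show the actual evaluation code; this is an illustration):

```python
def accuracy(predicted_ids, target_ids):
    """Fraction of positions where the predicted id equals the target id."""
    if not target_ids:
        return 0.0
    correct = sum(p == t for p, t in zip(predicted_ids, target_ids))
    return correct / len(target_ids)

# Toy example: 4 of 5 positions predicted correctly -> 0.8
acc = accuracy([12, 7, 99, 4, 31], [12, 7, 50, 4, 31])
```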
## License

This model is released under the Apache-2.0 license.