# PL-BERT Fine-Tuned on Hindi Wikipedia Dataset

This model is a fine-tuned version of **PL-BERT**, trained on the Hindi subset of the Wiki40b dataset. It has been optimized to understand and generate high-quality Hindi text, making it suitable for a variety of Hindi-language NLP tasks.

For more information about this model, see the [GitHub](https://github.com/Ionio-io/PL-BERT-Fine-Tuned-hi-) repository.

## Model Overview

- **Model Name:** PL-BERT (fine-tuned on Hindi)
- **Base Model:** PL-BERT (phoneme-level BERT)
- **Dataset:** Hindi subset of Wiki40b (51,000 cleaned Wikipedia articles)
- **Precision:** Mixed precision (FP16)

The fine-tuning process focused on improving the model's handling of Hindi text by leveraging a large, cleaned corpus of Hindi Wikipedia articles.
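Wiki40b distributes article text with inline structure markers (`_START_ARTICLE_`, `_START_SECTION_`, `_START_PARAGRAPH_`, `_NEWLINE_`). The repository's actual cleaning pipeline is not documented here, so the helper below is only an illustrative sketch of the kind of step "cleaned Wikipedia articles" implies:

```python
import re

# Wiki40b text embeds structure markers such as _START_ARTICLE_,
# _START_SECTION_, _START_PARAGRAPH_ and _NEWLINE_; this helper strips
# them out to recover plain paragraphs. Illustrative only -- not the
# model's actual preprocessing code.
def clean_wiki40b(text: str) -> str:
    text = text.replace("_NEWLINE_", "\n")
    text = re.sub(r"_START_(ARTICLE|SECTION|PARAGRAPH)_\n?", "", text)
    # Collapse stray spaces/tabs left behind by the removed markers.
    return re.sub(r"[ \t]+", " ", text).strip()

sample = "_START_PARAGRAPH_\nयह एक उदाहरण है।_NEWLINE_दूसरी पंक्ति।"
print(clean_wiki40b(sample))
```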

## Training Details

- **Model:** PL-BERT
- **Dataset:** Hindi subset of Wiki40b
- **Batch Size:** 64
- **Mixed Precision:** FP16
- **Optimizer:** AdamW
- **Training Steps:** 15,000
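The settings above (batch size 64, AdamW, FP16) correspond to a standard PyTorch mixed-precision training step. The sketch below is illustrative, not the repository's training script: the stand-in model, loss, data, and learning rate are assumptions.

```python
import torch
from torch import nn

# Stand-in model: the real PL-BERT has separate vocabulary and token
# prediction heads; a single linear layer keeps this sketch self-contained.
model = nn.Linear(32, 100)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
# GradScaler guards FP16 gradients against underflow; it is a no-op on CPU.
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))

inputs = torch.randn(64, 32, device=device)    # batch size 64, as above
labels = torch.randint(0, 100, (64,), device=device)

optimizer.zero_grad()
# Autocast runs the forward pass in FP16 on GPU; disabled on CPU.
with torch.autocast(device_type=device, enabled=(device == "cuda")):
    logits = model(inputs)
    loss = nn.functional.cross_entropy(logits, labels)
scaler.scale(loss).backward()
scaler.step(optimizer)
scaler.update()
```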

### Training Progress

- **Final Loss:** 1.879
- **Vocabulary Loss:** 0.49
- **Token Loss:** 1.465

### Validation Results

During training, we monitored performance with the following validation metrics:

- **Validation Loss:** 1.879
- **Vocabulary Accuracy:** 78.54%
- **Token Accuracy:** 82.30%
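The vocabulary and token accuracies above are top-1 accuracies computed over masked positions. A minimal, framework-free sketch of that metric (the function name and input shapes are illustrative, not the repository's evaluation code):

```python
# Top-1 accuracy over masked positions only: `predictions` and `labels`
# are parallel lists of token ids; `mask` flags the positions that were
# masked during training and therefore count toward the metric.
def masked_accuracy(predictions, labels, mask):
    hits = total = 0
    for pred, gold, is_masked in zip(predictions, labels, mask):
        if is_masked:
            total += 1
            hits += (pred == gold)
    return hits / total if total else 0.0

# Toy example: 4 masked positions, 3 predicted correctly.
preds  = [12, 7, 99, 40, 3]
golds  = [12, 7, 21, 40, 8]
masked = [True, True, True, True, False]
print(masked_accuracy(preds, golds, masked))  # -> 0.75
```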

## License

This model is released under the Apache License 2.0 (`apache-2.0`).