Commit e60f2d6 (parent: 34f3064): Update README.md

README.md (updated):
# text_generation_bangla_model

BanglaCLM dataset:

- OSCAR: 12.84 GB
- Wikipedia dump: 6.24 GB
- ProthomAlo: 3.92 GB
- Kalerkantho: 3.24 GB
## Model description

- context size: 128
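A minimal generation sketch for a GPT-2-style causal LM like this one. The Hub repo id below is a placeholder for wherever the checkpoint is hosted, and the sampling settings are illustrative rather than taken from the card:

```python
# Hedged sketch: the repo id and sampling settings are placeholders.
from transformers import AutoTokenizer, TFAutoModelForCausalLM

repo_id = "your-username/text_generation_bangla_model"  # placeholder Hub id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = TFAutoModelForCausalLM.from_pretrained(repo_id)

# The card lists a context size of 128, so keep prompt plus generated
# tokens within that window.
inputs = tokenizer("বাংলাদেশ একটি", return_tensors="tf")  # Bangla prompt: "Bangladesh is a"
output_ids = model.generate(
    inputs["input_ids"],
    max_length=128,
    do_sample=True,
    top_k=50,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```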
17 |
|
|
|
18 |
|
19 |
## Training and evaluation data
|
20 |
The BanglaCLM data set is divided into a training set (90%)and a validation set (10%).
|
|
|
25 |
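A sketch of how such a 90/10 split can be produced with the `datasets` library; the corpus file name and seed are placeholders, not details from the card:

```python
from datasets import load_dataset

# Placeholder path: the assembled BanglaCLM text corpus.
raw = load_dataset("text", data_files={"train": "banglaclm.txt"})["train"]

# 90% train / 10% validation, matching the split described above.
splits = raw.train_test_split(test_size=0.1, seed=42)
train_ds, val_ds = splits["train"], splits["test"]
print(len(train_ds), len(val_ds))
```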
### Training hyperparameters

The following hyperparameters were used during training (an optimizer sketch follows the list):

- Batch size: 32
- Initial learning rate: 5e-5
- Number of warmup steps: 10000
- Weight decay rate: 0.01
- Tokenization algorithm: BPE
- Vocabulary size of tokenizer: 50256
- Total trainable params: 124,439,808
- Epochs: 40
- Number of training steps: 40,772,228
- training_precision: float32
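The learning-rate, warmup, and weight-decay values above map directly onto `transformers.create_optimizer` for TensorFlow. A sketch follows, with the caveat that the schedule shape (linear decay after warmup, `create_optimizer`'s default) is an assumption, since the card does not state it:

```python
from transformers import create_optimizer

# Values from the list above; the linear-decay-after-warmup shape is
# create_optimizer's default, assumed here rather than stated in the card.
optimizer, lr_schedule = create_optimizer(
    init_lr=5e-5,                # initial learning rate
    num_train_steps=40_772_228,  # number of training steps
    num_warmup_steps=10_000,     # warmup steps
    weight_decay_rate=0.01,      # weight decay rate
)
# With a TF model: model.compile(optimizer=optimizer), then model.fit(...)
```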
### Training results

Perplexity score: 2.86.
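Perplexity is the exponential of the mean per-token cross-entropy, so the reported score corresponds to a validation loss of roughly 1.05 nats per token. A quick check, where the loss value is back-derived from the score rather than taken from the card:

```python
import math

val_loss = 1.0508           # illustrative: back-derived from the reported score
perplexity = math.exp(val_loss)
print(f"{perplexity:.2f}")  # 2.86
```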
### Framework versions

- TensorFlow 2.11.0
- Datasets 2.10.0
- Tokenizers 0.13.2
### Citation

If you find this model helpful, please cite:

```
@INPROCEEDINGS{10303383,
  author={Salim, Md. Shahidul and Murad, Hasan and Das, Dola and Ahmed, Faisal},
  booktitle={2023 International Conference on Information and Communication Technology for Sustainable Development (ICICT4SD)},
  title={BanglaGPT: A Generative Pretrained Transformer-Based Model for Bangla Language},
  year={2023},
  volume={},
  number={},
  pages={56-59},
  doi={10.1109/ICICT4SD59951.2023.10303383}}
```