shahidul034 committed on
Commit e60f2d6
1 Parent(s): 34f3064

Update README.md

Files changed (1)
  1. README.md +45 -17
README.md CHANGED

# text_generation_bangla_model

BanglaCLM dataset:

- OSCAR: 12.84 GB
- Wikipedia dump: 6.24 GB
- ProthomAlo: 3.92 GB
- Kalerkantho: 3.24 GB

## Model description

- Context size: 128
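The card does not include a usage snippet, so here is a minimal generation sketch. The repository id below is an assumption inferred from this repo's name (it is not stated in the card), and the decoding settings are only illustrative:

```python
from transformers import AutoTokenizer, TFAutoModelForCausalLM

repo_id = "shahidul034/text_generation_bangla_model"  # assumed hub id, inferred from this repo's name

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = TFAutoModelForCausalLM.from_pretrained(repo_id)

# The card lists a context size of 128 tokens, so keep the prompt plus the
# generated continuation within that window.
inputs = tokenizer("বাংলাদেশের রাজধানী", return_tensors="tf")
outputs = model.generate(**inputs, max_length=128, do_sample=True, top_k=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```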
## Training and evaluation data

The BanglaCLM dataset is divided into a training set (90%) and a validation set (10%).
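As a rough illustration of such a 90/10 split with the `datasets` library (the corpus file name below is a placeholder, since BanglaCLM itself is not distributed with this card):

```python
from datasets import load_dataset

# Placeholder path: substitute the actual BanglaCLM text file(s).
corpus = load_dataset("text", data_files={"train": "banglaclm.txt"})["train"]

# 90% training / 10% validation, as described above.
split = corpus.train_test_split(test_size=0.1, seed=42)
train_ds, val_ds = split["train"], split["test"]
print(len(train_ds), len(val_ds))
```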
 
### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

- Batch size: 32
- Initial learning rate: 5e-5
- Number of warmup steps: 10000
- Weight decay rate: 0.01
- Tokenization algorithm: BPE
- Vocabulary size of tokenizer: 50256
- Total trainable params: 124,439,808
- Epochs: 40
- Number of training steps: 40772228
- Training precision: float32
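As an illustration of how these values fit together, here is a hedged TensorFlow sketch using the optimizer utilities in `transformers`. The GPT-2-small style configuration is inferred from the ~124M trainable-parameter count; this is not the authors' training script:

```python
from transformers import GPT2Config, TFGPT2LMHeadModel, create_optimizer

# Hypothetical GPT-2-small style configuration assembled from the values above
# (architecture inferred from the ~124M parameter count, not from the paper's code).
config = GPT2Config(vocab_size=50256, n_positions=128)
model = TFGPT2LMHeadModel(config)

# create_optimizer builds an AdamW-style optimizer with linear warmup and decay.
optimizer, lr_schedule = create_optimizer(
    init_lr=5e-5,              # initial learning rate
    num_train_steps=40772228,  # total number of training steps
    num_warmup_steps=10000,    # warmup steps
    weight_decay_rate=0.01,    # weight decay rate
)

# Without an explicit loss, transformers TF models fall back to their built-in
# language-modelling loss when fit() is called.
model.compile(optimizer=optimizer)
# model.fit(train_tf_dataset, validation_data=val_tf_dataset, epochs=40)  # batched at 32
```

Using `create_optimizer` keeps the warmup, linear decay, and weight-decay settings from the list above in one place.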
 
### Training results

Perplexity score: 2.86
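For context, perplexity is the exponential of the mean token-level cross-entropy on the validation split, so the reported score corresponds to a loss of roughly 1.05 nats per token:

```python
import math

# Perplexity is exp(mean token-level cross-entropy, in nats).
def perplexity(mean_cross_entropy: float) -> float:
    return math.exp(mean_cross_entropy)

print(round(perplexity(1.051), 2))  # ~2.86, matching the reported score
```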
 
### Framework versions

- TensorFlow 2.11.0
- Datasets 2.10.0
- Tokenizers 0.13.2
### Citation

If you find this model helpful, please cite.

```
@INPROCEEDINGS{10303383,
  author={Salim, Md. Shahidul and Murad, Hasan and Das, Dola and Ahmed, Faisal},
  booktitle={2023 International Conference on Information and Communication Technology for Sustainable Development (ICICT4SD)},
  title={BanglaGPT: A Generative Pretrained Transformer-Based Model for Bangla Language},
  year={2023},
  pages={56-59},
  doi={10.1109/ICICT4SD59951.2023.10303383}}
```