eunyounglee committed
Commit fccfcee
Parent(s): a788b47
Update README.md

README.md CHANGED
@@ -8,36 +8,30 @@ Config file: 2.7B
 ---
 # Model Card for Model ID
 
-This model is pretrained and fine-tuned with Vietnamese
-
+This model is pretrained and fine-tuned with the Vietnamese language, based on GPT-NeoX, a large language model developed by EleutherAI.
+
 
 ## Model Details
 
 ### Training Data
 - **Pre-train:**
-Vietnamese
+CulturaX Vietnamese Dataset (450GB) + AI-Hub Vietnamese Dataset (1.3GB) + Crawled Vietnamese Wikipedia Dataset (630MB) + viwik18 Dataset (1.27GB)
 - **Fine-tuning:**
 12MB Vietnamese Question & Answer dataset
 Vietnamese Alpaca (16412 rows) + Vietnamese QA Dataset based on viwik18 (14293 rows)
 
 ### Training Hardware
-
-- **Model type:** GPT-NeoX
-- **Language(s) (NLP):** Vietnamese
+Trained on an A100 40GB GPU and a 48-core CPU. Training took 18 hours to reach 10 epochs.
 
 <figure style="width:30em">
 
 | Hyperparameter | Value |
 | ---------------------- | ----------- |
-| n<sub>vocab</sub> | 60000 |
-| Sequence Length | 2048 |
-| Learning Rate | 0.00016 |
-| Positional Encoding | [Rotary Position Embedding (RoPE)](https://arxiv.org/abs/2104.09864) |
+| num_train_epochs | 2670182400 |
+| train_batch_size | 2 |
+| learning_rate | 0.0001 |
+| warmup_steps | 1000 |
+| weight_decay | 0 |
 </figure>
 
 ### How to use
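The hyperparameter rows added in this commit read like Hugging Face Trainer settings, although the card does not say which training framework was used, and the `num_train_epochs` value of 2670182400 looks inconsistent with the 10 epochs reported under Training Hardware. As a hedged illustration only, with a placeholder output directory and 10 epochs taken from the hardware note:

```python
from transformers import TrainingArguments

# Hypothetical mapping of the table's fine-tuning hyperparameters onto
# Hugging Face TrainingArguments; the output_dir is a placeholder, and the
# epoch count follows the Training Hardware note rather than the table.
training_args = TrainingArguments(
    output_dir="gpt-neox-vietnamese-finetune",  # placeholder path
    num_train_epochs=10,
    per_device_train_batch_size=2,   # train_batch_size
    learning_rate=1e-4,              # learning_rate
    warmup_steps=1000,               # warmup_steps
    weight_decay=0.0,                # weight_decay
)
```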
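The hunk ends at the "### How to use" heading, so the usage instructions themselves are not shown here. A minimal loading-and-generation sketch, assuming the checkpoint is published on the Hub in the standard GPT-NeoX/Transformers format; the repository id below is a placeholder, not the real model id:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repository id; replace with the actual Hub id of this model.
model_id = "eunyounglee/<vietnamese-gpt-neox-model>"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Generate a short continuation for a Vietnamese prompt ("Hello, how are you?").
prompt = "Xin chào, bạn có khỏe không?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```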