eunyounglee committed
Commit 6b5c971
1 Parent(s): c158d24

Update README.md

Files changed (1): README.md +33 -13
README.md CHANGED
@@ -3,24 +3,44 @@ language:
- vie
pipeline_tag: text-generation

- Trained: Pretrain
+ Trained: Pre-train
Config file: 2.7B
- Data: Vietnamese Dataset 450GB(CulturaX) + Project(1.3B) + Crawled Vietnamese Wikipedia(630MB) + viwik18(1.27GB)
---
# Model Card for Model ID

- <!-- Provide a quick summary of what the model is/does. -->

- Pretrained GPT-NeoX model with 450GB+ Vietnamese dataset. Took about 17 hours to reach 80,000 iterations. Trained on A100 40GB GPU and 48 core CPU.
+ This model is pretrained on Vietnamese-language data and is based on GPT-NeoX, a large language model developed by EleutherAI.

## Model Details

- ### Model Description
-
- <!-- Provide a longer summary of what this model is. -->
-
-
-
- - **Developed by:** Eunyoung Lee
- - **Model type:** GPT-NeoX
- - **Language(s) (NLP):** Vietnamese
+ ### Training Data
+ - **Pre-train:**
+ CulturaX Vietnamese dataset (450GB) + AI-Hub Vietnamese dataset (1.3GB) + crawled Vietnamese Wikipedia dataset (630MB) + viwik18 dataset (1.27GB)
+
+ ### Training Hardware
+ Trained on an A100 40GB GPU with a 48-core CPU; training took about 17 hours to reach 80,000 steps.
+
+ ### Hyperparameters
+ <figure style="width:30em">
+
+ | Hyperparameter | Value |
+ | ---------------------- | ----------- |
+ | n<sub>parameters</sub> | 2670182400 |
+ | n<sub>layers</sub> | 32 |
+ | d<sub>model</sub> | 2560 |
+ | n<sub>heads</sub> | 32 |
+ | d<sub>head</sub> | 128 |
+ | n<sub>vocab</sub> | 60000 |
+ | Sequence Length | 2048 |
+ | Learning Rate | 0.00016 |
+ | Positional Encoding | [Rotary Position Embedding (RoPE)](https://arxiv.org/abs/2104.09864) |
+
+ </figure>
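+
+ As a cross-check, the table above can be sketched as a Hugging Face `GPTNeoXConfig`. This is a hypothetical mapping, not the original GPT-NeoX training YAML; note that the listed d<sub>head</sub> (128) does not equal d<sub>model</sub>/n<sub>heads</sub> (2560/32 = 80), so the exact head layout of the trained model may differ:
+
+ ```python
+ # Hypothetical mapping of the hyperparameter table onto transformers' GPTNeoXConfig.
+ # Anything not listed in the table (rotary_pct, intermediate size, ...) is left at
+ # library defaults; RoPE is GPT-NeoX's built-in positional encoding.
+ from transformers import GPTNeoXConfig
+
+ config = GPTNeoXConfig(
+     vocab_size=60000,              # n_vocab
+     hidden_size=2560,              # d_model
+     num_hidden_layers=32,          # n_layers
+     num_attention_heads=32,        # n_heads
+     max_position_embeddings=2048,  # sequence length
+ )
+ print(config)
+ ```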
+
+ ### How to use
+ The model can be loaded using the `AutoModelForCausalLM` functionality:
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ tokenizer = AutoTokenizer.from_pretrained("eunyounglee/GPT-NeoX-2.7B-Vietnamese-pretrained")
+ model = AutoModelForCausalLM.from_pretrained("eunyounglee/GPT-NeoX-2.7B-Vietnamese-pretrained")
+ ```
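+
+ Once loaded, text can be generated through the standard `generate` API; the prompt and sampling settings below are illustrative only, not values from this card:
+
+ ```python
+ # Illustrative generation sketch, continuing from the loading snippet above.
+ prompt = "Việt Nam là"  # hypothetical example prompt
+ inputs = tokenizer(prompt, return_tensors="pt")
+ outputs = model.generate(
+     **inputs,
+     max_new_tokens=50,
+     do_sample=True,
+     top_p=0.9,
+     temperature=0.8,
+ )
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```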