qnguyen3 committed on
Commit
0dd55ce
1 Parent(s): e001fe2

Update README.md

  1. README.md +31 -1
README.md CHANGED
@@ -15,4 +15,34 @@ Hello world!<|im_end|>
 
  ## Model Description
 
- More information needed
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 2e-06
+ - train_batch_size: 2
+ - eval_batch_size: 2
+ - seed: 42
+ - distributed_type: multi-GPU
+ - num_devices: 4
+ - gradient_accumulation_steps: 4
+ - total_train_batch_size: 32
+ - total_eval_batch_size: 8
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: cosine
+ - lr_scheduler_warmup_steps: 100
+ - num_epochs: 1
+
+ ### Training results
+
+
+
+ ### Framework versions
+
+ - Transformers 4.34.1
+ - Pytorch 2.0.1+cu118
+ - Datasets 2.14.6
+ - Tokenizers 0.14.1
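
The hyperparameter list added in this diff contains two derived values (`total_train_batch_size` and `total_eval_batch_size`) and a cosine schedule with 100 warmup steps. The sketch below shows how those totals follow from the per-device values, and what a cosine schedule with linear warmup typically looks like. It is an illustration only, not the training code behind this commit: the function names are hypothetical, the schedule formula is the common "linear warmup, then half-cosine decay to zero" shape and is assumed rather than confirmed by the README, and `total_steps` is a made-up value because the dataset size is not stated.

```python
import math

# Values copied from the hyperparameter list in the diff.
TRAIN_BATCH_SIZE = 2     # per-device train batch size
EVAL_BATCH_SIZE = 2      # per-device eval batch size
NUM_DEVICES = 4
GRAD_ACCUM_STEPS = 4
LEARNING_RATE = 2e-06
WARMUP_STEPS = 100


def total_train_batch_size(per_device, num_devices, grad_accum_steps):
    """Number of samples that contribute to one optimizer step."""
    return per_device * num_devices * grad_accum_steps


def total_eval_batch_size(per_device, num_devices):
    """Number of samples evaluated per step across all devices."""
    return per_device * num_devices


def cosine_lr(step, total_steps, base_lr=LEARNING_RATE, warmup_steps=WARMUP_STEPS):
    """Linear warmup to base_lr, then half-cosine decay to zero.

    Assumption: this mirrors the usual cosine-with-warmup shape; the exact
    schedule used for this model is not shown in the README.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print(total_train_batch_size(TRAIN_BATCH_SIZE, NUM_DEVICES, GRAD_ACCUM_STEPS))  # 32
    print(total_eval_batch_size(EVAL_BATCH_SIZE, NUM_DEVICES))  # 8
    # total_steps=1000 is hypothetical; the real step count is not in the README.
    for step in (0, 50, 100, 550, 1000):
        print(step, cosine_lr(step, total_steps=1000))
```

The totals match the README: 2 samples per device on 4 devices with 4 accumulation steps gives 32 per optimizer step, and evaluation, which does not accumulate gradients, gives 8.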