mrm8488 committed on
Commit 9af29c5
1 Parent(s): c4b1a7a

Update README.md

Files changed (1)
  1. README.md +26 -7
README.md CHANGED
@@ -39,17 +39,36 @@ Meta developed and publicly released the Llama 2 family of large language models
  ### Training hyperparameters ⚙

- TBA
+ ```py
+ optim="paged_adamw_32bit",
+ num_train_epochs = 2,
+ eval_steps=50,
+ save_steps=50,
+ evaluation_strategy="steps",
+ save_strategy="steps",
+ save_total_limit=2,
+ seed=66,
+ load_best_model_at_end=True,
+ logging_steps=1,
+ learning_rate=2e-4,
+ fp16=True,
+ bf16=False,
+ max_grad_norm=0.3,
+ warmup_ratio=0.03,
+ group_by_length=True,
+ lr_scheduler_type="constant"
+ ```

  ### Training results 🗒️

+
  | Step | Training Loss | Validation Loss |
- |------|---------------|-----------------|
- | 100 | 0.798500 | 0.767996 |
- | 200 | 0.725900 | 0.749880 |
- | 300 | 0.669100 | 0.748029 |
- | 400 | 0.687300 | 0.742342 |
- | 500 | 0.579900 | 0.736735 |
+ |------|----------|----------|
+ | 50 | 0.624400 | 0.600070 |
+ | 100 | 0.634100 | 0.592757 |
+ | 150 | 0.545800 | 0.586652 |
+ | 200 | 0.572500 | 0.577525 |
+ | 250 | 0.528000 | 0.590118 |
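
The added hyperparameters read like keyword arguments to `transformers.TrainingArguments`, though the commit does not say so; that mapping is an assumption. The sketch below collects them in a plain dict and then, using only the rows of the new results table, picks the step with the lowest validation loss. That is the checkpoint `load_best_model_at_end=True` would restore when the best-model metric is left at its default of evaluation loss. The names `hparams`, `results`, `best_step`, and `best_val_loss` are illustrative and do not come from the README.

```python
# Hyperparameters copied from the updated README.
# Assumption: these are keyword arguments for transformers.TrainingArguments.
hparams = {
    "optim": "paged_adamw_32bit",
    "num_train_epochs": 2,
    "eval_steps": 50,
    "save_steps": 50,
    "evaluation_strategy": "steps",
    "save_strategy": "steps",
    "save_total_limit": 2,
    "seed": 66,
    "load_best_model_at_end": True,
    "logging_steps": 1,
    "learning_rate": 2e-4,
    "fp16": True,
    "bf16": False,
    "max_grad_norm": 0.3,
    "warmup_ratio": 0.03,
    "group_by_length": True,
    "lr_scheduler_type": "constant",
}

# (step, training loss, validation loss) rows from the updated results table.
results = [
    (50, 0.624400, 0.600070),
    (100, 0.634100, 0.592757),
    (150, 0.545800, 0.586652),
    (200, 0.572500, 0.577525),
    (250, 0.528000, 0.590118),
]

# Sanity check: every logged evaluation step is a multiple of eval_steps.
assert all(step % hparams["eval_steps"] == 0 for step, _, _ in results)

# With the default best-model metric (evaluation loss, lower is better),
# the best checkpoint is the row with the smallest validation loss.
best_step, _, best_val_loss = min(results, key=lambda row: row[2])
print(best_step, best_val_loss)
```

Among the five logged rows this selects step 200 (validation loss 0.577525); the table does not show whether training continued past step 250. Under the assumption above, the dict could be passed as `TrainingArguments(output_dir="out", **hparams)`, where `output_dir="out"` is a placeholder. Recent `transformers` releases rename `evaluation_strategy` to `eval_strategy`.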