Peaky8linders committed
Commit e8172de
1 Parent(s): 2817a60

End of training

Files changed (1)
  1. README.md +16 -9
README.md CHANGED
@@ -32,7 +32,7 @@ data_seed: 49
  seed: 49
 
  datasets:
-   - path: _synth_data/alpaca_synth_queries_healed.jsonl
+   - path: _synth_data/alpaca_synth_queries_healed_sample.jsonl
      type: sharegpt
      conversation: alpaca
  dataset_prepared_path: last_run_prepared
@@ -66,11 +66,11 @@ lora_target_modules:
 
  gradient_accumulation_steps: 4
  micro_batch_size: 16
- eval_batch_size: 16
- num_epochs: 1
+ eval_batch_size: 1
+ num_epochs: 2
  optimizer: adamw_bnb_8bit
  lr_scheduler: cosine
- learning_rate: 0.002
+ learning_rate: 0.0002
  max_grad_norm: 1.0
  adam_beta2: 0.95
  adam_epsilon: 0.00001
@@ -116,7 +116,7 @@ save_safetensors: true
 
  This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 1.1640
+ - Loss: 0.0318
 
  ## Model description
 
@@ -135,22 +135,29 @@ More information needed
  ### Training hyperparameters
 
  The following hyperparameters were used during training:
- - learning_rate: 0.002
+ - learning_rate: 0.0002
  - train_batch_size: 16
- - eval_batch_size: 16
+ - eval_batch_size: 1
  - seed: 49
  - gradient_accumulation_steps: 4
  - total_train_batch_size: 64
  - optimizer: Adam with betas=(0.9,0.95) and epsilon=1e-05
  - lr_scheduler_type: cosine
  - lr_scheduler_warmup_steps: 20
- - num_epochs: 1
+ - num_epochs: 2
 
  ### Training results
 
  | Training Loss | Epoch  | Step | Validation Loss |
  |:-------------:|:------:|:----:|:---------------:|
- | 1.1341        | 0.0006 | 1    | 1.1640          |
+ | 1.1214        | 0.0011 | 1    | 1.1843          |
+ | 0.0856        | 0.2501 | 225  | 0.0910          |
+ | 0.0599        | 0.5001 | 450  | 0.0561          |
+ | 0.0326        | 0.7502 | 675  | 0.0447          |
+ | 0.0393        | 1.0003 | 900  | 0.0372          |
+ | 0.0255        | 1.2503 | 1125 | 0.0341          |
+ | 0.0261        | 1.5004 | 1350 | 0.0324          |
+ | 0.0392        | 1.7505 | 1575 | 0.0318          |
 
 
  ### Framework versions
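
Note: the config keys in the diff (`datasets` with `type: sharegpt`, `adamw_bnb_8bit`, `lora_target_modules`, `dataset_prepared_path`) look like an Axolotl-style training YAML, although the commit itself does not name the trainer. The reported `total_train_batch_size: 64` follows from the other hyperparameters under a single-device assumption; a minimal sketch of that arithmetic (the `world_size = 1` value is our assumption, not stated in the commit):

```python
# Sketch of the effective-batch-size arithmetic implied by the README hyperparameters.
# micro_batch_size and gradient_accumulation_steps are copied from the diff above;
# world_size (number of devices) is an assumption, not stated in the commit.
micro_batch_size = 16            # per-device batch size (train_batch_size in the README)
gradient_accumulation_steps = 4  # micro-batches accumulated per optimizer step
world_size = 1                   # assumed single GPU

total_train_batch_size = micro_batch_size * gradient_accumulation_steps * world_size
assert total_train_batch_size == 64  # matches the total_train_batch_size reported above
print(total_train_batch_size)
```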