strickvl committed on
Commit
d594dc7
1 Parent(s): 15dd052

End of training

Files changed (2)
  1. README.md +21 -21
  2. adapter_model.bin +1 -1
README.md CHANGED
@@ -22,8 +22,8 @@ base_model: mistralai/Mistral-7B-v0.1
 model_type: MistralForCausalLM
 tokenizer_type: LlamaTokenizer
 
-load_in_8bit: false
-load_in_4bit: true
+load_in_8bit: true
+load_in_4bit: false
 strict: false
 
 data_seed: 42
@@ -38,7 +38,7 @@ output_dir: ./outputs/mistral/lora-out-templatefree
 hub_model_id: strickvl/isafpr-mistral-lora-templatefree
 
 
-sequence_len: 4096
+sequence_len: 2048
 sample_packing: true
 pad_to_sequence_len: true
 
@@ -110,7 +110,7 @@ special_tokens:
 
 This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.0297
+- Loss: 0.0288
 
 ## Model description
 
@@ -147,23 +147,23 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| 1.4053 | 0.0276 | 1 | 1.4080 |
-| 0.1866 | 0.2483 | 9 | 0.1346 |
-| 0.0544 | 0.4966 | 18 | 0.0551 |
-| 0.0516 | 0.7448 | 27 | 0.0442 |
-| 0.0387 | 0.9931 | 36 | 0.0400 |
-| 0.0354 | 1.2138 | 45 | 0.0367 |
-| 0.0396 | 1.4621 | 54 | 0.0352 |
-| 0.0282 | 1.7103 | 63 | 0.0341 |
-| 0.0335 | 1.9586 | 72 | 0.0333 |
-| 0.0257 | 2.1793 | 81 | 0.0317 |
-| 0.0206 | 2.4276 | 90 | 0.0313 |
-| 0.0259 | 2.6759 | 99 | 0.0312 |
-| 0.024 | 2.9241 | 108 | 0.0301 |
-| 0.0219 | 3.1517 | 117 | 0.0300 |
-| 0.0221 | 3.4 | 126 | 0.0298 |
-| 0.0225 | 3.6483 | 135 | 0.0297 |
-| 0.0208 | 3.8966 | 144 | 0.0297 |
+| 1.5339 | 0.0131 | 1 | 1.5408 |
+| 0.0671 | 0.2492 | 19 | 0.0549 |
+| 0.037 | 0.4984 | 38 | 0.0406 |
+| 0.0424 | 0.7475 | 57 | 0.0361 |
+| 0.035 | 0.9967 | 76 | 0.0351 |
+| 0.0322 | 1.2295 | 95 | 0.0336 |
+| 0.0247 | 1.4787 | 114 | 0.0314 |
+| 0.0229 | 1.7279 | 133 | 0.0313 |
+| 0.0241 | 1.9770 | 152 | 0.0299 |
+| 0.0222 | 2.2098 | 171 | 0.0307 |
+| 0.0183 | 2.4590 | 190 | 0.0296 |
+| 0.0205 | 2.7082 | 209 | 0.0291 |
+| 0.0153 | 2.9574 | 228 | 0.0281 |
+| 0.0162 | 3.1902 | 247 | 0.0286 |
+| 0.0126 | 3.4393 | 266 | 0.0290 |
+| 0.0147 | 3.6885 | 285 | 0.0287 |
+| 0.0157 | 3.9377 | 304 | 0.0288 |
 
 
 ### Framework versions
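One effect visible in the two training tables: sequence_len was halved from 4096 to 2048 while sample_packing stayed on, and the run went from roughly 36 optimizer steps per epoch (step 36 at epoch 0.9931) to roughly 76 (step 76 at epoch 0.9967). A quick arithmetic check, under the assumption that with sample packing the steps per epoch scale roughly inversely with sequence_len:

```python
# Steps-per-epoch figures read off the two training-log tables above.
old_steps_per_epoch = 36   # old run: step 36 at epoch 0.9931 (sequence_len 4096)
new_steps_per_epoch = 76   # new run: step 76 at epoch 0.9967 (sequence_len 2048)

# Assumption: with sample_packing enabled, each step packs a fixed number
# of sequences, so halving sequence_len should roughly double the steps
# needed to cover the same dataset once.
ratio = new_steps_per_epoch / old_steps_per_epoch
print(round(ratio, 2))  # 2.11, close to the expected factor of 2
```

The small excess over 2.0 is plausible packing overhead: shorter windows pack documents slightly less efficiently, so a few more steps are needed per epoch.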
adapter_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:21129307a99244d1cb1aee7d135cf43959d8c94b86b4e62ff51ec7720e672542
+oid sha256:1a9f2c96b8754c87ccd86910df8f4514f74e07548f3f863e6c1c99422fe65200
 size 335706186