End of training
- README.md +21 -21
- adapter_model.bin +1 -1
README.md
CHANGED
@@ -22,8 +22,8 @@ base_model: mistralai/Mistral-7B-v0.1
 model_type: MistralForCausalLM
 tokenizer_type: LlamaTokenizer

-load_in_8bit:
-load_in_4bit:
+load_in_8bit: true
+load_in_4bit: false
 strict: false

 data_seed: 42
@@ -38,7 +38,7 @@ output_dir: ./outputs/mistral/lora-out-templatefree
 hub_model_id: strickvl/isafpr-mistral-lora-templatefree


-sequence_len:
+sequence_len: 2048
 sample_packing: true
 pad_to_sequence_len: true

@@ -110,7 +110,7 @@ special_tokens:

 This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.
+- Loss: 0.0288

 ## Model description

@@ -147,23 +147,23 @@ The following hyperparameters were used during training:

 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| 1.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
+| 1.5339 | 0.0131 | 1 | 1.5408 |
+| 0.0671 | 0.2492 | 19 | 0.0549 |
+| 0.037 | 0.4984 | 38 | 0.0406 |
+| 0.0424 | 0.7475 | 57 | 0.0361 |
+| 0.035 | 0.9967 | 76 | 0.0351 |
+| 0.0322 | 1.2295 | 95 | 0.0336 |
+| 0.0247 | 1.4787 | 114 | 0.0314 |
+| 0.0229 | 1.7279 | 133 | 0.0313 |
+| 0.0241 | 1.9770 | 152 | 0.0299 |
+| 0.0222 | 2.2098 | 171 | 0.0307 |
+| 0.0183 | 2.4590 | 190 | 0.0296 |
+| 0.0205 | 2.7082 | 209 | 0.0291 |
+| 0.0153 | 2.9574 | 228 | 0.0281 |
+| 0.0162 | 3.1902 | 247 | 0.0286 |
+| 0.0126 | 3.4393 | 266 | 0.0290 |
+| 0.0147 | 3.6885 | 285 | 0.0287 |
+| 0.0157 | 3.9377 | 304 | 0.0288 |


 ### Framework versions
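The added training-results rows can be checked against the reported evaluation loss with a short script. The sketch below is illustrative only: `RESULTS` and `parse_rows` are hypothetical names, and the rows are copied by hand from the `+` side of the table above. It parses the rows, finds the checkpoint with the lowest validation loss, and compares it with the final row.

```python
# Rows copied from the "+" side of the training-results table in this diff.
RESULTS = """
| 1.5339 | 0.0131 | 1 | 1.5408 |
| 0.0671 | 0.2492 | 19 | 0.0549 |
| 0.037 | 0.4984 | 38 | 0.0406 |
| 0.0424 | 0.7475 | 57 | 0.0361 |
| 0.035 | 0.9967 | 76 | 0.0351 |
| 0.0322 | 1.2295 | 95 | 0.0336 |
| 0.0247 | 1.4787 | 114 | 0.0314 |
| 0.0229 | 1.7279 | 133 | 0.0313 |
| 0.0241 | 1.9770 | 152 | 0.0299 |
| 0.0222 | 2.2098 | 171 | 0.0307 |
| 0.0183 | 2.4590 | 190 | 0.0296 |
| 0.0205 | 2.7082 | 209 | 0.0291 |
| 0.0153 | 2.9574 | 228 | 0.0281 |
| 0.0162 | 3.1902 | 247 | 0.0286 |
| 0.0126 | 3.4393 | 266 | 0.0290 |
| 0.0147 | 3.6885 | 285 | 0.0287 |
| 0.0157 | 3.9377 | 304 | 0.0288 |
"""


def parse_rows(table):
    """Turn markdown table rows into (train_loss, epoch, step, val_loss) tuples."""
    rows = []
    for line in table.strip().splitlines():
        # Drop the outer pipes, then split the four cells.
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        train_loss, epoch, step, val_loss = cells
        rows.append((float(train_loss), float(epoch), int(step), float(val_loss)))
    return rows


rows = parse_rows(RESULTS)
best = min(rows, key=lambda r: r[3])  # lowest validation loss
final = rows[-1]                      # last logged evaluation

print(f"best validation loss {best[3]} at step {best[2]} (epoch {best[1]})")
print(f"final validation loss {final[3]} at step {final[2]} (epoch {final[1]})")
```

The final row matches the `Loss: 0.0288` line in the model card, while the lowest validation loss in the log occurs at an earlier step, so the pushed adapter is not necessarily the best checkpoint by this metric.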
adapter_model.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:1a9f2c96b8754c87ccd86910df8f4514f74e07548f3f863e6c1c99422fe65200
 size 335706186
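The `adapter_model.bin` change touches only a Git LFS pointer: the `oid` changes while `size` stays at 335706186 bytes. As a minimal sketch of how such a pointer could be read and checked against a downloaded file, assuming the pointer text shown in this diff, the helpers below (`parse_lfs_pointer` and `matches_pointer` are hypothetical names, not part of any library) parse the three fields and compare a local file's size and SHA-256 digest.

```python
import hashlib

# Pointer text as it appears on the "+" side of this diff.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:1a9f2c96b8754c87ccd86910df8f4514f74e07548f3f863e6c1c99422fe65200
size 335706186
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and SHA-256 the pointer records."""
    hasher = hashlib.sha256()
    size = 0
    with open(path, "rb") as f:
        # Read in chunks so a file of several hundred MB is not loaded at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```

With the adapter downloaded locally, `matches_pointer("adapter_model.bin", pointer)` would indicate whether the file corresponds to this commit; the path is an assumption about where the file was saved.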