End of training
- README.md +24 -5
- adapter_model.bin +2 -2
README.md
CHANGED
@@ -6,7 +6,7 @@ tags:
 - generated_from_trainer
 base_model: mistralai/Mistral-7B-v0.1
 model-index:
-- name:
+- name: mistral_axonotll
 results: []
 ---

@@ -27,9 +27,9 @@ load_in_8bit: false
 load_in_4bit: true
 strict: false

-hub_model_id: nitsw/
+hub_model_id: nitsw/mistral_axonotll
 datasets:
-- path:
+- path: nitsw/alpaca_cleaned
 type: alpaca
 dataset_prepared_path: last_run_prepared
 val_set_size: 0.1
@@ -56,7 +56,7 @@ lora_target_modules:
 - k_proj
 - o_proj

-wandb_project:
+wandb_project: swapnil_axolotl
 wandb_entity:
 wandb_watch:
 wandb_name:
@@ -107,9 +107,11 @@ special_tokens:

 </details><br>

-#
+# mistral_axonotll

 This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the None dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.8484

 ## Model description

@@ -141,6 +143,23 @@ The following hyperparameters were used during training:

 ### Training results

+| Training Loss | Epoch | Step | Validation Loss |
+|:-------------:|:-----:|:----:|:---------------:|
+| 0.8523 | 0.06 | 10 | 0.8987 |
+| 0.8882 | 0.13 | 20 | 0.8766 |
+| 0.8374 | 0.19 | 30 | 0.8683 |
+| 0.8223 | 0.25 | 40 | 0.8636 |
+| 0.85 | 0.32 | 50 | 0.8604 |
+| 0.8425 | 0.38 | 60 | 0.8577 |
+| 0.8572 | 0.44 | 70 | 0.8560 |
+| 0.8427 | 0.51 | 80 | 0.8539 |
+| 0.8627 | 0.57 | 90 | 0.8526 |
+| 0.8242 | 0.63 | 100 | 0.8512 |
+| 0.8555 | 0.7 | 110 | 0.8501 |
+| 0.8348 | 0.76 | 120 | 0.8495 |
+| 0.8593 | 0.83 | 130 | 0.8488 |
+| 0.8403 | 0.89 | 140 | 0.8485 |
+| 0.8628 | 0.95 | 150 | 0.8484 |


 ### Framework versions
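The validation losses logged in the training results table above can be sanity-checked with a short script. This is a minimal sketch and not part of the commit: the values are copied from the table, and the names `VAL_LOSS` and `summarize` are illustrative only.

```python
# Validation loss per logged step, copied from the "Training results" table.
VAL_LOSS = {
    10: 0.8987, 20: 0.8766, 30: 0.8683, 40: 0.8636, 50: 0.8604,
    60: 0.8577, 70: 0.8560, 80: 0.8539, 90: 0.8526, 100: 0.8512,
    110: 0.8501, 120: 0.8495, 130: 0.8488, 140: 0.8485, 150: 0.8484,
}


def summarize(val_loss):
    """Return (is_strictly_decreasing, total_drop, last_interval_drop)."""
    steps = sorted(val_loss)
    losses = [val_loss[s] for s in steps]
    # True only if every logged evaluation improved on the previous one.
    decreasing = all(b < a for a, b in zip(losses, losses[1:]))
    total_drop = round(losses[0] - losses[-1], 4)
    last_drop = round(losses[-2] - losses[-1], 4)
    return decreasing, total_drop, last_drop


if __name__ == "__main__":
    decreasing, total_drop, last_drop = summarize(VAL_LOSS)
    print(f"strictly decreasing: {decreasing}")
    print(f"total drop: {total_drop}, drop over the last interval: {last_drop}")
```

The last table entry (0.8484 at step 150) is the same value the model card reports as the evaluation loss. The curve has nearly flattened by then, improving by only about 0.0001 over the final 10 steps.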
adapter_model.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:30f3a3dae4afd97fd3a67fd4f5594a28dc463f575cb0a49a4e42ed6b0a0e64f3
+size 335705741
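The `adapter_model.bin` entry above is a Git LFS pointer rather than the weights themselves: it records only the spec version, a SHA-256 object ID, and the byte size. As a rough sketch of how a downloaded file could be checked against that pointer, assuming a local copy exists (the path `adapter_model.bin` and the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, not something the commit defines):

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer):
    """Compare a local file's size and SHA-256 digest with a parsed pointer."""
    if pointer["algo"] != "sha256":
        raise ValueError(f"unsupported hash algorithm: {pointer['algo']}")
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False
    h = hashlib.sha256()
    with path.open("rb") as f:
        # Hash in 1 MiB chunks so large files are never fully loaded in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest() == pointer["digest"]


# Pointer contents as they appear on the "+" side of the diff.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:30f3a3dae4afd97fd3a67fd4f5594a28dc463f575cb0a49a4e42ed6b0a0e64f3
size 335705741
"""

if __name__ == "__main__":
    info = parse_lfs_pointer(POINTER)
    print(info["algo"], info["size"])
    local = Path("adapter_model.bin")  # hypothetical download location
    if local.exists():
        print("matches pointer:", matches_pointer(local, info))
```

The size is compared first because it is cheap and rules out truncated downloads before the roughly 336 MB file is hashed.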