Model save
README.md
CHANGED
@@ -19,7 +19,7 @@ should probably proofread and complete it, then remove this comment. -->

 This model is a fine-tuned version of [imone/Mistral_7B_with_EOT_token](https://huggingface.co/imone/Mistral_7B_with_EOT_token) on the generator dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.
+- Loss: 0.3755

 ## Model description

@@ -38,7 +38,7 @@ More information needed
 ### Training hyperparameters

 The following hyperparameters were used during training:
-- learning_rate:
+- learning_rate: 5e-06
 - train_batch_size: 16
 - eval_batch_size: 8
 - seed: 42
@@ -55,16 +55,16 @@ The following hyperparameters were used during training:

 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
+| 0.6331 | 1.0 | 578 | 0.6226 |
+| 0.51 | 2.0 | 1156 | 0.5023 |
+| 0.4058 | 3.0 | 1734 | 0.4172 |
+| 0.3003 | 4.0 | 2312 | 0.3773 |
+| 0.2508 | 5.0 | 2890 | 0.3755 |


 ### Framework versions

 - Transformers 4.40.0
 - Pytorch 2.1.2+cu118
-- Datasets 2.
+- Datasets 2.19.0
 - Tokenizers 0.19.1
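The updated results table in README.md can be sanity-checked with a little arithmetic. The following is a minimal Python sketch and not part of the commit: the rows are copied from the `+` side of the README.md diff, and the helper names (`steps_per_epoch`, `validation_deltas`, `best_row`) are hypothetical.

```python
# Rows copied from the "+" side of the README.md training-results table:
# (training loss, epoch, step, validation loss)
ROWS = [
    (0.6331, 1.0, 578, 0.6226),
    (0.51, 2.0, 1156, 0.5023),
    (0.4058, 3.0, 1734, 0.4172),
    (0.3003, 4.0, 2312, 0.3773),
    (0.2508, 5.0, 2890, 0.3755),
]


def steps_per_epoch(rows):
    """Return the step increment between consecutive rows if it is constant, else None."""
    steps = [0] + [row[2] for row in rows]
    increments = {later - earlier for earlier, later in zip(steps, steps[1:])}
    return increments.pop() if len(increments) == 1 else None


def validation_deltas(rows):
    """Return the change in validation loss from each epoch to the next (4 decimals)."""
    losses = [row[3] for row in rows]
    return [round(later - earlier, 4) for earlier, later in zip(losses, losses[1:])]


def best_row(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[3])


if __name__ == "__main__":
    print("steps per epoch:", steps_per_epoch(ROWS))
    print("validation loss deltas:", validation_deltas(ROWS))
    print("best row:", best_row(ROWS))
```

With the values as listed, the step count grows by 578 per epoch, and the lowest validation loss (0.3755, at epoch 5.0) matches the `Loss` value reported at the top of the model card. Validation loss falls at every epoch, but the gain from epoch 4.0 to 5.0 is only 0.0018.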
config.json
CHANGED
@@ -20,7 +20,7 @@
   "sliding_window": 4096,
   "tie_word_embeddings": false,
   "torch_dtype": "bfloat16",
-  "transformers_version": "4.
+  "transformers_version": "4.40.0",
   "use_cache": false,
   "vocab_size": 32002
 }
model-00001-of-00003.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:8a62c7f0a48449d090c54d4bb46049f0dba2babc8fc1e00e98fdd334183666f5
 size 4943178720
model-00002-of-00003.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:02ba273e64ba948c6535ea2b1089d40576e4d52c037dfd8264be5021a989cacd
 size 4999819336
model-00003-of-00003.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:d67f94a9e1ee902ae2a0bc9a88ec88b9c931c92c8a2e88ce5c4fad33b59f8305
 size 4540532728
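Each `.safetensors` entry in this commit is a Git LFS pointer (a `version` line, an `oid sha256:` line, and a `size` line) rather than the weights themselves. As a rough sketch of how a downloaded shard could be checked against its pointer, the Python below parses a pointer and compares size and SHA-256 digest. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local file path in the usage note is an assumption; only the pointer text and shard sizes come from the diff.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo!r}")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and SHA-256 digest in the pointer."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so multi-gigabyte shards are never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer text for model-00001-of-00003.safetensors, copied from the "+" side of the diff.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:8a62c7f0a48449d090c54d4bb46049f0dba2babc8fc1e00e98fdd334183666f5
size 4943178720
"""

# Sizes of the three shards as listed in the diff, in bytes.
SHARD_SIZES = [4943178720, 4999819336, 4540532728]
```

Assuming the first shard has been downloaded to the working directory, `matches_pointer("model-00001-of-00003.safetensors", parse_lfs_pointer(POINTER))` would report whether it matches the committed pointer. The three shard sizes listed in the diff sum to 14,483,530,784 bytes.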
runs/Apr21_16-57-33_n136-128-070/events.out.tfevents.1713690007.n136-128-070.635588.0
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:515bb24c66f59ef3875088ebd7aba3d82b86039f33d9b2c6ddc0fbc2a29831c0
+size 128644
tokenizer.json
CHANGED
@@ -152,6 +152,7 @@
     "end_of_word_suffix": null,
     "fuse_unk": true,
     "byte_fallback": true,
+    "ignore_merges": false,
     "vocab": {
       "<unk>": 0,
       "<s>": 1,
training_args.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:aa332ee855a131766188374f885382875a0b64d355e2f53480c7a0d2aede4e5f
+size 6200