param-bharat committed on
Commit b557a7b
Parent: 85c4648

Model save

Files changed (3)
  1. README.md +9 -7
  2. generation_config.json +1 -0
  3. model.safetensors +1 -1
README.md CHANGED
@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->

  This model is a fine-tuned version of [HuggingFaceTB/SmolLM2-135M-Instruct](https://huggingface.co/HuggingFaceTB/SmolLM2-135M-Instruct) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.7844
+ - Loss: 0.7045

  ## Model description

@@ -47,17 +47,19 @@ The following hyperparameters were used during training:
  - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  - lr_scheduler_type: cosine
  - lr_scheduler_warmup_ratio: 0.1
- - num_epochs: 2
+ - num_epochs: 3

  ### Training results

  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:------:|:----:|:---------------:|
- | 0.9921 | 0.3853 | 500 | 0.8058 |
- | 0.9685 | 0.7706 | 1000 | 0.7897 |
- | 1.0532 | 1.1558 | 1500 | 0.7858 |
- | 1.0206 | 1.5411 | 2000 | 0.7847 |
- | 1.0418 | 1.9264 | 2500 | 0.7844 |
+ | 0.916 | 0.3854 | 500 | 0.7372 |
+ | 0.8854 | 0.7707 | 1000 | 0.7177 |
+ | 0.9783 | 1.1562 | 1500 | 0.7117 |
+ | 0.9635 | 1.5415 | 2000 | 0.7066 |
+ | 0.9591 | 1.9269 | 2500 | 0.7046 |
+ | 0.8954 | 2.3123 | 3000 | 0.7044 |
+ | 0.8896 | 2.6977 | 3500 | 0.7045 |


  ### Framework versions
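
The hyperparameters in this diff name a cosine scheduler with a warmup ratio of 0.1. As a rough sketch, assuming linear warmup followed by a half-cosine decay to zero, the multiplier applied to the base learning rate could be computed as below. The steps-per-epoch figure is an estimate derived from the results table (step 500 at epoch 0.3854), the base learning rate is not visible in this diff, and `cosine_with_warmup` is a hypothetical helper rather than the library's own function.

```python
import math


def cosine_with_warmup(step: int, total_steps: int, warmup_ratio: float = 0.1) -> float:
    """Return a learning-rate multiplier in [0, 1] for the given step.

    Assumption: linear warmup, then half-cosine decay to zero.
    """
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to 1 during warmup.
        return step / max(1, warmup_steps)
    # Fraction of the post-warmup phase that has elapsed.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


# Estimated from the table: step 500 corresponds to epoch 0.3854.
steps_per_epoch = round(500 / 0.3854)
# num_epochs: 3 in the updated README; the exact step count may differ slightly.
total_steps = steps_per_epoch * 3

for step in (0, 500, 2000, 3500):
    print(step, round(cosine_with_warmup(step, total_steps), 4))
```

Under these assumptions the multiplier is already small by step 3500, which would be consistent with the validation loss flattening out near 0.7045 in the last rows of the table.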
generation_config.json CHANGED
@@ -2,6 +2,7 @@
    "_from_model_config": true,
    "bos_token_id": 1,
    "eos_token_id": 2,
+   "max_length": 8192,
    "pad_token_id": 2,
    "transformers_version": "4.46.3"
  }
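
In `transformers`, `max_length` bounds the total sequence length during generation, that is, prompt tokens plus newly generated tokens. The sketch below parses the updated config and works out how many new tokens would remain for a given prompt. The opening brace is reconstructed (the hunk starts at line 2 of the file), and both `new_token_budget` and the prompt length of 1000 are hypothetical, chosen only for illustration.

```python
import json

# Updated generation_config.json as shown in the hunk above.
# Assumption: line 1 of the file is the opening brace, which the hunk omits.
NEW_CONFIG = """{
  "_from_model_config": true,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "max_length": 8192,
  "pad_token_id": 2,
  "transformers_version": "4.46.3"
}"""


def new_token_budget(config: dict, prompt_tokens: int) -> int:
    """Tokens left for generation once the prompt is counted against max_length."""
    return max(0, config["max_length"] - prompt_tokens)


config = json.loads(NEW_CONFIG)
budget = new_token_budget(config, prompt_tokens=1000)
print(budget)
```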
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:349e712af0a335471bfd3e3f0386bcb331ffd267d6856ba56e6301dd1bb560ea
+ oid sha256:cd3afc8f8d7e10842b01c07fde2c75269021f0a4e0c2b09842a8edd0c71623a8
  size 269060552
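
The `model.safetensors` entry is a Git LFS pointer: the size is unchanged and only the SHA-256 object ID differs, which is what one would expect when the weights change but the tensor shapes do not. A minimal sketch of checking a downloaded file against such a pointer follows. `parse_lfs_pointer` and `verify_file` are hypothetical helper names, and the local path passed to `verify_file` is up to the caller.

```python
import hashlib
import os

# New pointer contents as shown in the hunk above.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:cd3afc8f8d7e10842b01c07fde2c75269021f0a4e0c2b09842a8edd0c71623a8
size 269060552
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer into {'version', 'algo', 'oid', 'size'}."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "oid": digest,
        "size": int(fields["size"]),
    }


def verify_file(path: str, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's size and SHA-256 digest match the pointer."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        # Hash in chunks so large weight files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["oid"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])
```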