tcapelle committed on
Commit
1ff18b8
1 Parent(s): 9f8c1bf

Model save

Files changed (2)
  1. README.md +10 -21
  2. generation_config.json +3 -2
README.md CHANGED
@@ -4,11 +4,6 @@ license: apache-2.0
 base_model: HuggingFaceTB/SmolLM2-135M-Instruct
 tags:
 - generated_from_trainer
-metrics:
-- f1
-- accuracy
-- precision
-- recall
 model-index:
 - name: toxicity-scorer-smollm2-135m-it-freeze
   results: []
@@ -21,11 +16,7 @@ should probably proofread and complete it, then remove this comment. -->

 This model is a fine-tuned version of [HuggingFaceTB/SmolLM2-135M-Instruct](https://huggingface.co/HuggingFaceTB/SmolLM2-135M-Instruct) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.3147
-- F1: 0.8264
-- Accuracy: 0.8745
-- Precision: 0.8384
-- Recall: 0.8745
+- Loss: 3.7842

 ## Model description
@@ -45,13 +36,9 @@ More information needed

 The following hyperparameters were used during training:
 - learning_rate: 3e-05
-- train_batch_size: 44
-- eval_batch_size: 44
+- train_batch_size: 16
+- eval_batch_size: 16
 - seed: 42
-- distributed_type: multi-GPU
-- num_devices: 8
-- total_train_batch_size: 352
-- total_eval_batch_size: 352
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.1
@@ -59,15 +46,17 @@ The following hyperparameters were used during training:

 ### Training results

-| Training Loss | Epoch  | Step | Validation Loss | F1     | Accuracy | Precision | Recall |
-|:-------------:|:------:|:----:|:---------------:|:------:|:--------:|:---------:|:------:|
-| No log        | 0      | 0    | 0.9227          | 0.6386 | 0.5685   | 0.7480    | 0.5685 |
-| 0.3196        | 1.5596 | 5000 | 0.3147          | 0.8264 | 0.8745   | 0.8384    | 0.8745 |
+| Training Loss | Epoch | Step | Validation Loss |
+|:-------------:|:-----:|:----:|:---------------:|
+| No log        | 0     | 0    | 3.8389          |
+| 3.8172        | 1.0   | 63   | 3.8037          |
+| 3.7957        | 2.0   | 126  | 3.7857          |
+| 3.7137        | 3.0   | 189  | 3.7842          |

 ### Framework versions

 - Transformers 4.46.3
-- Pytorch 2.5.1
+- Pytorch 2.5.1+cu124
 - Datasets 3.1.0
 - Tokenizers 0.20.3
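The removed distributed-training fields and the new results table are consistent with simple arithmetic: the effective batch size is the per-device batch size times the device count, and the logged step counts grow by 63 per epoch. A minimal sketch (the helper name is mine, not from the training code):

```python
def total_batch_size(per_device: int, num_devices: int) -> int:
    """Effective global batch size, assuming no gradient accumulation."""
    return per_device * num_devices

# Old run: 44 per device across 8 GPUs, matching the removed
# total_train_batch_size of 352.
assert total_batch_size(44, 8) == 352

# New run: 16 per device with the distributed fields removed,
# i.e. a single device.
assert total_batch_size(16, 1) == 16

# The new results table logs 63 steps per epoch over 3 epochs.
assert 63 * 3 == 189
```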
generation_config.json CHANGED
@@ -1,6 +1,7 @@
 {
   "_from_model_config": true,
-  "bos_token_id": 0,
-  "eos_token_id": 0,
+  "bos_token_id": 1,
+  "eos_token_id": 2,
+  "pad_token_id": 2,
   "transformers_version": "4.46.3"
 }
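This change replaces the placeholder token ids (both 0) with distinct `bos`/`eos` ids, and sets `pad_token_id` equal to `eos_token_id`, a common choice when a model has no dedicated pad token. A quick round-trip sketch of the updated file, using only the values shown in the diff:

```python
import json

# The updated generation_config.json as it appears in this commit.
generation_config = {
    "_from_model_config": True,
    "bos_token_id": 1,
    "eos_token_id": 2,
    "pad_token_id": 2,
    "transformers_version": "4.46.3",
}

# Serialize and reload to confirm the content is valid JSON.
restored = json.loads(json.dumps(generation_config, indent=2))
assert restored["bos_token_id"] != restored["eos_token_id"]
assert restored["pad_token_id"] == restored["eos_token_id"]
```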