tcapelle/toxicity-scorer-smollm2-135m-it-freeze

@@ -1,20 +1,31 @@
 ---
 library_name: transformers
 license: apache-2.0
-base_model: HuggingFaceTB/SmolLM2-135M-Instruct
 tags:
 - generated_from_trainer
 model-index:
-- name: toxicity-scorer-smollm2-135m-it-freeze
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
-# toxicity-scorer-smollm2-135m-it-freeze
-This model is a fine-tuned version of [HuggingFaceTB/SmolLM2-135M-Instruct](https://huggingface.co/HuggingFaceTB/SmolLM2-135M-Instruct) on an unknown dataset.
 ## Model description
@@ -34,9 +45,13 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 3e-05
-- train_batch_size: 36
-- eval_batch_size: 36
 - seed: 42
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.1
@@ -44,14 +59,15 @@ The following hyperparameters were used during training:
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss | F1     | Accuracy | Precision | Recall |
-|:-------------:|:-----:|:----:|:---------------:|:------:|:--------:|:---------:|:------:|
-| No log        | 0     | 0    | 1.4368          | 0.2774 | 0.2376   | 0.7152    | 0.2376 |
 ### Framework versions
 - Transformers 4.46.3
-- Pytorch 2.5.1+cu124
 - Datasets 3.1.0
 - Tokenizers 0.20.3

 ---
 library_name: transformers
 license: apache-2.0
+base_model: HuggingFaceTB/SmolLM2-135M
 tags:
 - generated_from_trainer
+metrics:
+- f1
+- accuracy
+- precision
+- recall
 model-index:
+- name: toxicity-scorer-smollm2-135m-freeze
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
+# toxicity-scorer-smollm2-135m-freeze
+This model is a fine-tuned version of [HuggingFaceTB/SmolLM2-135M](https://huggingface.co/HuggingFaceTB/SmolLM2-135M) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.2939
+- F1: 0.8645
+- Accuracy: 0.8847
+- Precision: 0.8636
+- Recall: 0.8847
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 3e-05
+- train_batch_size: 44
+- eval_batch_size: 44
 - seed: 42
+- distributed_type: multi-GPU
+- num_devices: 8
+- total_train_batch_size: 352
+- total_eval_batch_size: 352
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.1
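
The total batch sizes in the list above are consistent with the per-device size multiplied by the device count. A minimal sketch of that arithmetic follows, assuming a gradient accumulation of 1 (the card lists no accumulation setting, so this value is an assumption); the variable names are illustrative and are not taken from the training script.

```python
# Values copied from the hyperparameter list in the card.
per_device_train_batch_size = 44
per_device_eval_batch_size = 44
num_devices = 8

# Assumption: no gradient accumulation, since the card does not list any.
gradient_accumulation_steps = 1

# Effective number of examples consumed per optimizer step across all GPUs.
total_train_batch_size = (
    per_device_train_batch_size * num_devices * gradient_accumulation_steps
)
# Evaluation has no accumulation; it is simply per-device size times devices.
total_eval_batch_size = per_device_eval_batch_size * num_devices

print(total_train_batch_size, total_eval_batch_size)
```

Both values come out to 352, matching `total_train_batch_size` and `total_eval_batch_size` in the card.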
 ### Training results
+| Training Loss | Epoch  | Step | Validation Loss | F1     | Accuracy | Precision | Recall |
+|:-------------:|:------:|:----:|:---------------:|:------:|:--------:|:---------:|:------:|
+| No log        | 0      | 0    | 1.5154          | 0.4459 | 0.3797   | 0.8101    | 0.3797 |
+| 0.3001        | 1.5596 | 5000 | 0.2939          | 0.8645 | 0.8847   | 0.8636    | 0.8847 |
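
In every row of the results table, Accuracy and Recall are identical (for example 0.8847 and 0.8847). That is what one would expect if recall were averaged across classes weighted by class support, because support-weighted recall reduces algebraically to accuracy. The card does not state which averaging mode was used, so this is an assumption. The sketch below shows the identity on hypothetical toy labels that are unrelated to the model's evaluation data.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_recall(y_true, y_pred):
    """Per-class recall, averaged with weights proportional to class support."""
    n = len(y_true)
    support = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    # weight (support/n) times recall (correct/support), summed over classes
    return sum((support[c] / n) * (correct[c] / support[c]) for c in support)


# Hypothetical labels for illustration only.
y_true = [0, 0, 0, 0, 1, 1, 2, 2, 2, 3]
y_pred = [0, 0, 1, 0, 1, 2, 2, 2, 0, 3]

acc = accuracy(y_true, y_pred)
rec = weighted_recall(y_true, y_pred)
print(abs(acc - rec) < 1e-12)
```

The support terms cancel inside the sum, leaving the total number of correct predictions divided by the number of examples, which is the definition of accuracy.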
 ### Framework versions
 - Transformers 4.46.3
+- Pytorch 2.5.1
 - Datasets 3.1.0
 - Tokenizers 0.20.3

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:630525e069d814a8df2ba87f1418cc122bbacb7f3a36396e73da9a9c46500829
 size 269066392

 version https://git-lfs.github.com/spec/v1
+oid sha256:71d68a8cfdb746cda33b766aadfcc27f5bdff573c21f2b9f5d42222b845d5f79
 size 269066392
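
The `model.safetensors` entries above are Git LFS pointer files: the `oid` is the SHA-256 of the real file's contents and `size` is its length in bytes. Here the size is unchanged while the hash differs, which fits retrained weights with the same architecture. As a rough sketch, a downloaded file could be checked against a pointer as follows; the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and do not belong to any library.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "oid": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and SHA-256."""
    digest = hashlib.sha256()
    size = 0
    with open(path, "rb") as handle:
        # Read in chunks so large weight files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and digest.hexdigest() == pointer["oid"]


# Pointer text copied from the new side of the model.safetensors diff.
pointer = parse_lfs_pointer(
    """
    version https://git-lfs.github.com/spec/v1
    oid sha256:71d68a8cfdb746cda33b766aadfcc27f5bdff573c21f2b9f5d42222b845d5f79
    size 269066392
    """
)
print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer("model.safetensors", pointer)` on a local copy (the path is an assumption) would then report whether the download corresponds to this commit.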

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:10c7ee103676265d63de55ab8be1069bb28b32cda292e32ce6599a1b5923b8d7
 size 5304

 version https://git-lfs.github.com/spec/v1
+oid sha256:95a1a594d33b749b53bb61c2db111cf02d29482bcff90dfbebd6f9e2c919a80e
 size 5304