+---
+license: apache-2.0
+tags:
+- generated_from_trainer
+datasets:
+- glue
+metrics:
+- spearmanr
+model-index:
+- name: mobilebert_sa_GLUE_Experiment_logit_kd_pretrain_stsb
+  results:
+  - task:
+      name: Text Classification
+      type: text-classification
+    dataset:
+      name: glue
+      type: glue
+      config: stsb
+      split: validation
+      args: stsb
+    metrics:
+    - name: Spearmanr
+      type: spearmanr
+      value: 0.8624005783710303
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# mobilebert_sa_GLUE_Experiment_logit_kd_pretrain_stsb
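+This model is a fine-tuned version of [gokuls/mobilebert_sa_pre-training-complete](https://huggingface.co/gokuls/mobilebert_sa_pre-training-complete) on the STS-B task of the GLUE dataset.
+It achieves the following results on the evaluation set (the STS-B validation split); the values match the last logged row of the training results, epoch 20: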
+- Loss: 0.2985
+- Pearson: 0.8647
+- Spearmanr: 0.8624
+- Combined Score: 0.8636
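
The Combined Score appears to be the arithmetic mean of the Pearson and Spearman correlations; every row of the training results table is consistent with that reading to within rounding. A minimal sketch, assuming that definition (the helper name `combined_score` is illustrative and does not come from the card):

```python
def combined_score(pearson: float, spearmanr: float) -> float:
    """Arithmetic mean of the two correlations (assumed definition)."""
    return (pearson + spearmanr) / 2


# Reported evaluation metrics, rounded to four decimals as on the card.
reported = {"pearson": 0.8647, "spearmanr": 0.8624, "combined": 0.8636}

mean = combined_score(reported["pearson"], reported["spearmanr"])
# The inputs are already rounded, so compare with a small tolerance.
matches = abs(mean - reported["combined"]) < 1e-4
print(f"mean={mean:.5f} matches_reported={matches}")
```

Because the card only shows rounded values, the check uses a tolerance of 1e-4 rather than exact equality.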
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 5e-05
+- train_batch_size: 128
+- eval_batch_size: 128
+- seed: 10
+- distributed_type: multi-GPU
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- num_epochs: 50
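
To give a sense of what the `linear` scheduler implies, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup (the card lists none), a decay to zero over the full planned run, and 45 optimizer steps per epoch as shown in the Step column of the training results, which gives 45 * 50 = 2250 planned steps. The function `linear_lr` is a hypothetical helper written for this illustration, not a Transformers API.

```python
def linear_lr(step: int, base_lr: float = 5e-5, total_steps: int = 2250,
              warmup_steps: int = 0) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


STEPS_PER_EPOCH = 45  # assumption taken from the Step column (step 45 at epoch 1.0)
NUM_EPOCHS = 50       # num_epochs from the hyperparameter list
total = STEPS_PER_EPOCH * NUM_EPOCHS

for epoch in (0, 1, 10, 20):
    step = epoch * STEPS_PER_EPOCH
    lr = linear_lr(step, total_steps=total)
    print(f"epoch {epoch:>2}  step {step:>4}  lr {lr:.2e}")
```

Under these assumptions the rate at the last logged step (900) would be about 3e-05, or 60% of the initial value. The card does not say why logging ends at epoch 20 when `num_epochs` is 50.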
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Pearson | Spearmanr | Combined Score |
+|:-------------:|:-----:|:----:|:---------------:|:-------:|:---------:|:--------------:|
+| 1.1501        | 1.0   | 45   | 0.4726          | 0.7774  | 0.7922    | 0.7848         |
+| 0.364         | 2.0   | 90   | 0.3480          | 0.8457  | 0.8455    | 0.8456         |
+| 0.259         | 3.0   | 135  | 0.3156          | 0.8582  | 0.8590    | 0.8586         |
+| 0.2054        | 4.0   | 180  | 0.4231          | 0.8551  | 0.8549    | 0.8550         |
+| 0.1629        | 5.0   | 225  | 0.3245          | 0.8668  | 0.8654    | 0.8661         |
+| 0.1263        | 6.0   | 270  | 0.3192          | 0.8649  | 0.8625    | 0.8637         |
+| 0.1021        | 7.0   | 315  | 0.3337          | 0.8655  | 0.8629    | 0.8642         |
+| 0.0841        | 8.0   | 360  | 0.3061          | 0.8601  | 0.8577    | 0.8589         |
+| 0.0713        | 9.0   | 405  | 0.3600          | 0.8576  | 0.8555    | 0.8566         |
+| 0.0587        | 10.0  | 450  | 0.3135          | 0.8620  | 0.8600    | 0.8610         |
+| 0.0488        | 11.0  | 495  | 0.3006          | 0.8641  | 0.8620    | 0.8631         |
+| 0.0441        | 12.0  | 540  | 0.3308          | 0.8645  | 0.8621    | 0.8633         |
+| 0.0385        | 13.0  | 585  | 0.3468          | 0.8620  | 0.8601    | 0.8610         |
+| 0.0346        | 14.0  | 630  | 0.3175          | 0.8658  | 0.8634    | 0.8646         |
+| 0.0298        | 15.0  | 675  | 0.2919          | 0.8665  | 0.8642    | 0.8654         |
+| 0.0299        | 16.0  | 720  | 0.3103          | 0.8649  | 0.8628    | 0.8639         |
+| 0.0263        | 17.0  | 765  | 0.3325          | 0.8620  | 0.8599    | 0.8609         |
+| 0.0237        | 18.0  | 810  | 0.3092          | 0.8636  | 0.8611    | 0.8623         |
+| 0.0213        | 19.0  | 855  | 0.3169          | 0.8653  | 0.8631    | 0.8642         |
+| 0.0196        | 20.0  | 900  | 0.2985          | 0.8647  | 0.8624    | 0.8636         |
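
The last logged epoch is not necessarily the best one under either metric. As an illustration of reading the table, the sketch below copies the Epoch, Validation Loss, and Combined Score columns and looks up the epoch with the lowest loss and the epoch with the highest combined score. The variable names are illustrative; the numbers are transcribed from the table above.

```python
# (epoch, validation_loss, combined_score) transcribed from the table.
rows = [
    (1, 0.4726, 0.7848),
    (2, 0.3480, 0.8456),
    (3, 0.3156, 0.8586),
    (4, 0.4231, 0.8550),
    (5, 0.3245, 0.8661),
    (6, 0.3192, 0.8637),
    (7, 0.3337, 0.8642),
    (8, 0.3061, 0.8589),
    (9, 0.3600, 0.8566),
    (10, 0.3135, 0.8610),
    (11, 0.3006, 0.8631),
    (12, 0.3308, 0.8633),
    (13, 0.3468, 0.8610),
    (14, 0.3175, 0.8646),
    (15, 0.2919, 0.8654),
    (16, 0.3103, 0.8639),
    (17, 0.3325, 0.8609),
    (18, 0.3092, 0.8623),
    (19, 0.3169, 0.8642),
    (20, 0.2985, 0.8636),
]

best_by_loss = min(rows, key=lambda r: r[1])   # lowest validation loss
best_by_score = max(rows, key=lambda r: r[2])  # highest combined score

print(f"lowest validation loss: epoch {best_by_loss[0]} ({best_by_loss[1]})")
print(f"highest combined score: epoch {best_by_score[0]} ({best_by_score[2]})")
```

On these numbers the lowest validation loss falls at epoch 15 (0.2919) and the highest combined score at epoch 5 (0.8661), while the reported evaluation results correspond to epoch 20.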
+### Framework versions
+- Transformers 4.26.0
+- PyTorch 1.14.0a0+410ce96
+- Datasets 2.9.0
+- Tokenizers 0.13.2