---
language:
- en
license: apache-2.0
tags:
- generated_from_trainer
datasets:
- glue
metrics:
- spearmanr
model-index:
- name: mobilebert_sa_GLUE_Experiment_logit_kd_pretrain_stsb
  results:
  - task:
      name: Text Classification
      type: text-classification
    dataset:
      name: GLUE STSB
      type: glue
      config: stsb
      split: validation
      args: stsb
    metrics:
    - name: Spearmanr
      type: spearmanr
      value: 0.8642221596976783
---


# mobilebert_sa_GLUE_Experiment_logit_kd_pretrain_stsb

This model is a fine-tuned version of [gokuls/mobilebert_sa_pre-training-complete](https://huggingface.co/gokuls/mobilebert_sa_pre-training-complete) on the GLUE STSB dataset.
It achieves the following results on the evaluation set (these match the epoch-15 row of the training results table, which has the lowest validation loss):
- Loss: 0.2919
- Pearson: 0.8665
- Spearmanr: 0.8642
- Combined Score: 0.8654
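
The Combined Score appears to be the arithmetic mean of the Pearson and Spearman correlations, which is consistent with every row reported in this card (for example, (0.8665 + 0.8642) / 2 ≈ 0.8654). The following is a minimal pure-Python sketch of how the three numbers relate. It is not the evaluation code used for this model, and the `preds` and `labels` values are hypothetical.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def combined_score(x, y):
    """Assumed definition: mean of Pearson and Spearman."""
    return (pearson(x, y) + spearman(x, y)) / 2


# Hypothetical model outputs and gold similarity scores (illustrative only).
preds = [0.5, 1.0, 2.0, 5.0]
labels = [1.0, 2.0, 3.0, 4.0]

print("pearson :", round(pearson(preds, labels), 4))
print("spearman:", round(spearman(preds, labels), 4))
print("combined:", round(combined_score(preds, labels), 4))
```

Because Spearman depends only on rank order, it reaches 1.0 for any monotonic prediction, while Pearson also penalizes non-linear spacing. Averaging the two rewards models that get both the ordering and the scale right.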

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 128
- eval_batch_size: 128
- seed: 10
- distributed_type: multi-GPU
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 50
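
As a rough illustration of what `lr_scheduler_type: linear` implies for these settings, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (the card lists none) and 45 optimizer steps per epoch (taken from the Step column of the training results table). The constant and function names are hypothetical, not part of any library.

```python
BASE_LR = 5e-05        # learning_rate from the list above
STEPS_PER_EPOCH = 45   # from the Step column of the training results table
NUM_EPOCHS = 50        # num_epochs from the list above
WARMUP_STEPS = 0       # assumption: the card does not list any warmup

TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step):
    """Learning rate after `step` updates under linear warmup then linear decay."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


for epoch in (0, 15, 20, 50):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch:2d} (step {step:4d}): lr = {linear_lr(step):.2e}")
```

Under these assumptions the schedule is sized for all 50 epochs, so at epoch 20 the learning rate would still be about 60% of its initial value.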

### Training results

Although `num_epochs` was set to 50, the results logged below end at epoch 20.

| Training Loss | Epoch | Step | Validation Loss | Pearson | Spearmanr | Combined Score |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:---------:|:--------------:|
| 1.1501        | 1.0   | 45   | 0.4726          | 0.7774  | 0.7922    | 0.7848         |
| 0.364         | 2.0   | 90   | 0.3480          | 0.8457  | 0.8455    | 0.8456         |
| 0.259         | 3.0   | 135  | 0.3156          | 0.8582  | 0.8590    | 0.8586         |
| 0.2054        | 4.0   | 180  | 0.4231          | 0.8551  | 0.8549    | 0.8550         |
| 0.1629        | 5.0   | 225  | 0.3245          | 0.8668  | 0.8654    | 0.8661         |
| 0.1263        | 6.0   | 270  | 0.3192          | 0.8649  | 0.8625    | 0.8637         |
| 0.1021        | 7.0   | 315  | 0.3337          | 0.8655  | 0.8629    | 0.8642         |
| 0.0841        | 8.0   | 360  | 0.3061          | 0.8601  | 0.8577    | 0.8589         |
| 0.0713        | 9.0   | 405  | 0.3600          | 0.8576  | 0.8555    | 0.8566         |
| 0.0587        | 10.0  | 450  | 0.3135          | 0.8620  | 0.8600    | 0.8610         |
| 0.0488        | 11.0  | 495  | 0.3006          | 0.8641  | 0.8620    | 0.8631         |
| 0.0441        | 12.0  | 540  | 0.3308          | 0.8645  | 0.8621    | 0.8633         |
| 0.0385        | 13.0  | 585  | 0.3468          | 0.8620  | 0.8601    | 0.8610         |
| 0.0346        | 14.0  | 630  | 0.3175          | 0.8658  | 0.8634    | 0.8646         |
| 0.0298        | 15.0  | 675  | 0.2919          | 0.8665  | 0.8642    | 0.8654         |
| 0.0299        | 16.0  | 720  | 0.3103          | 0.8649  | 0.8628    | 0.8639         |
| 0.0263        | 17.0  | 765  | 0.3325          | 0.8620  | 0.8599    | 0.8609         |
| 0.0237        | 18.0  | 810  | 0.3092          | 0.8636  | 0.8611    | 0.8623         |
| 0.0213        | 19.0  | 855  | 0.3169          | 0.8653  | 0.8631    | 0.8642         |
| 0.0196        | 20.0  | 900  | 0.2985          | 0.8647  | 0.8624    | 0.8636         |
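
The lowest validation loss and the highest Combined Score do not fall on the same epoch. The short sketch below, with values copied from the table above, finds both rows; the variable names are illustrative only.

```python
# (epoch, validation_loss, combined_score) copied from the table above
ROWS = [
    (1, 0.4726, 0.7848), (2, 0.3480, 0.8456), (3, 0.3156, 0.8586),
    (4, 0.4231, 0.8550), (5, 0.3245, 0.8661), (6, 0.3192, 0.8637),
    (7, 0.3337, 0.8642), (8, 0.3061, 0.8589), (9, 0.3600, 0.8566),
    (10, 0.3135, 0.8610), (11, 0.3006, 0.8631), (12, 0.3308, 0.8633),
    (13, 0.3468, 0.8610), (14, 0.3175, 0.8646), (15, 0.2919, 0.8654),
    (16, 0.3103, 0.8639), (17, 0.3325, 0.8609), (18, 0.3092, 0.8623),
    (19, 0.3169, 0.8642), (20, 0.2985, 0.8636),
]

# Row with the smallest validation loss.
best_by_loss = min(ROWS, key=lambda row: row[1])
# Row with the largest combined score.
best_by_score = max(ROWS, key=lambda row: row[2])

print("lowest validation loss :", best_by_loss)
print("highest combined score :", best_by_score)
```

Selecting by validation loss picks epoch 15 (loss 0.2919, Combined Score 0.8654), whereas selecting by Combined Score would pick epoch 5 (0.8661, loss 0.3245). The difference in score is under 0.001, so either criterion gives a comparable checkpoint here.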


### Framework versions

- Transformers 4.26.0
- PyTorch 1.14.0a0+410ce96
- Datasets 2.9.0
- Tokenizers 0.13.2