---
license: apache-2.0
tags:
  - generated_from_trainer
datasets:
  - wikitext
metrics:
  - accuracy
model-index:
  - name: mobilebert_sa_pre-training-complete
    results:
      - task:
          name: Masked Language Modeling
          type: fill-mask
        dataset:
          name: wikitext wikitext-103-raw-v1
          type: wikitext
          config: wikitext-103-raw-v1
          split: validation
          args: wikitext-103-raw-v1
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.7161816392520737
---

# mobilebert_sa_pre-training-complete

This model is a fine-tuned version of [google/mobilebert-uncased](https://huggingface.co/google/mobilebert-uncased) on the wikitext (wikitext-103-raw-v1) dataset. It achieves the following results on the evaluation set:

- Loss: 1.3239
- Accuracy: 0.7162
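
Since the card's metadata lists the task as fill-mask (masked language modeling), the checkpoint can be queried directly with the `transformers` fill-mask pipeline. A minimal sketch; the repo id `gokuls/mobilebert_sa_pre-training-complete` is an assumption inferred from the model name and may need adjusting:

```python
from transformers import pipeline

# Hypothetical repo id inferred from the model name above; adjust if hosted elsewhere.
fill_mask = pipeline("fill-mask", model="gokuls/mobilebert_sa_pre-training-complete")

# MobileBERT uses the BERT-style [MASK] token.
for prediction in fill_mask("Paris is the [MASK] of France."):
    print(f"{prediction['token_str']!r}: {prediction['score']:.3f}")
```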

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
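
While the preprocessing used for this run is not documented, the metadata above names the wikitext-103-raw-v1 config of the wikitext dataset, evaluated on its validation split. A minimal sketch of loading it with `datasets`:

```python
from datasets import load_dataset

# Config and split named in the model-index metadata above.
wikitext = load_dataset("wikitext", "wikitext-103-raw-v1")

print(wikitext)                    # train / validation / test splits
print(wikitext["validation"][0])   # each record is a raw "text" string
```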

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):

- learning_rate: 5e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 10
- distributed_type: multi-GPU
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- training_steps: 300000
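
A minimal sketch of expressing these values as `transformers` `TrainingArguments`; `output_dir` is a placeholder, and the multi-GPU distribution is handled by the launcher (e.g. `torchrun`), not by these arguments:

```python
from transformers import TrainingArguments

# Mirrors the hyperparameters listed above; batch sizes are per device.
training_args = TrainingArguments(
    output_dir="mobilebert_sa_pre-training-complete",  # placeholder
    learning_rate=5e-05,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=10,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    warmup_steps=100,
    max_steps=300_000,
)
```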

### Training results

| Training Loss | Epoch | Step   | Validation Loss | Accuracy |
|:-------------|:------|:-------|:----------------|:---------|
| 1.6028       | 1.0   | 7145   | 1.4525          | 0.6935   |
| 1.5524       | 2.0   | 14290  | 1.4375          | 0.6993   |
| 1.5323       | 3.0   | 21435  | 1.4194          | 0.6993   |
| 1.5191       | 4.0   | 28580  | 1.4110          | 0.7027   |
| 1.5025       | 5.0   | 35725  | 1.4168          | 0.7014   |
| 1.4902       | 6.0   | 42870  | 1.3931          | 0.7012   |
| 1.4813       | 7.0   | 50015  | 1.3738          | 0.7057   |
| 1.4751       | 8.0   | 57160  | 1.4237          | 0.6996   |
| 1.4689       | 9.0   | 64305  | 1.3969          | 0.7047   |
| 1.4626       | 10.0  | 71450  | 1.3916          | 0.7068   |
| 1.4566       | 11.0  | 78595  | 1.3686          | 0.7072   |
| 1.451        | 12.0  | 85740  | 1.3811          | 0.7060   |
| 1.4478       | 13.0  | 92885  | 1.3598          | 0.7092   |
| 1.4441       | 14.0  | 100030 | 1.3790          | 0.7054   |
| 1.4379       | 15.0  | 107175 | 1.3794          | 0.7066   |
| 1.4353       | 16.0  | 114320 | 1.3609          | 0.7102   |
| 1.43         | 17.0  | 121465 | 1.3685          | 0.7083   |
| 1.4278       | 18.0  | 128610 | 1.3953          | 0.7036   |
| 1.4219       | 19.0  | 135755 | 1.3756          | 0.7085   |
| 1.4197       | 20.0  | 142900 | 1.3597          | 0.7090   |
| 1.4169       | 21.0  | 150045 | 1.3673          | 0.7061   |
| 1.4146       | 22.0  | 157190 | 1.3753          | 0.7073   |
| 1.4109       | 23.0  | 164335 | 1.3696          | 0.7082   |
| 1.4073       | 24.0  | 171480 | 1.3563          | 0.7092   |
| 1.4054       | 25.0  | 178625 | 1.3712          | 0.7103   |
| 1.402        | 26.0  | 185770 | 1.3528          | 0.7113   |
| 1.4001       | 27.0  | 192915 | 1.3367          | 0.7123   |
| 1.397        | 28.0  | 200060 | 1.3508          | 0.7118   |
| 1.3955       | 29.0  | 207205 | 1.3572          | 0.7117   |
| 1.3937       | 30.0  | 214350 | 1.3566          | 0.7095   |
| 1.3901       | 31.0  | 221495 | 1.3515          | 0.7117   |
| 1.3874       | 32.0  | 228640 | 1.3445          | 0.7118   |
| 1.386        | 33.0  | 235785 | 1.3611          | 0.7097   |
| 1.3833       | 34.0  | 242930 | 1.3502          | 0.7087   |
| 1.3822       | 35.0  | 250075 | 1.3657          | 0.7108   |
| 1.3797       | 36.0  | 257220 | 1.3576          | 0.7108   |
| 1.3793       | 37.0  | 264365 | 1.3472          | 0.7106   |
| 1.3763       | 38.0  | 271510 | 1.3323          | 0.7156   |
| 1.3762       | 39.0  | 278655 | 1.3325          | 0.7145   |
| 1.3748       | 40.0  | 285800 | 1.3243          | 0.7138   |
| 1.3733       | 41.0  | 292945 | 1.3218          | 0.7170   |
| 1.3722       | 41.99 | 300000 | 1.3074          | 0.7186   |

### Framework versions

- Transformers 4.26.0
- Pytorch 1.14.0a0+410ce96
- Datasets 2.9.0
- Tokenizers 0.13.2