ilana/tiny-bert-sst2-distilled

Text Classification

generated_from_trainer

tiny-bert-sst2-distilled

This model is a fine-tuned version of google/bert_uncased_L-2_H-128_A-2 on the GLUE dataset (the model name suggests the SST-2 task). It achieves the following results on the evaluation set:

eval_loss: 3.0017
eval_accuracy: 0.7477
eval_runtime: 0.3985
eval_samples_per_second: 2188.296
eval_steps_per_second: 17.567
epoch: 1.0
step: 527
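
The reported throughput figures can be cross-checked with a little arithmetic. The sketch below multiplies the runtime by the per-second rates to recover the approximate number of evaluation samples and batches. It assumes that eval_runtime is in seconds and that evaluation ran on a single device with the eval_batch_size of 128 listed under the training hyperparameters. The recovered sample count matches the size of the GLUE SST-2 validation split (872 examples), which is consistent with the model name but is not stated on the card itself.

```python
import math

# Values copied from the evaluation results above.
eval_runtime_s = 0.3985            # assumption: reported in seconds
eval_samples_per_second = 2188.296
eval_steps_per_second = 17.567
eval_accuracy = 0.7477
eval_batch_size = 128              # from the training hyperparameters

# Throughput times runtime recovers the approximate sample and batch counts.
n_samples = round(eval_samples_per_second * eval_runtime_s)
n_batches = round(eval_steps_per_second * eval_runtime_s)

# Number of batches implied by the sample count and the batch size.
expected_batches = math.ceil(n_samples / eval_batch_size)

# Approximate number of correctly classified examples.
n_correct = round(eval_accuracy * n_samples)

print(n_samples, n_batches, expected_batches, n_correct)  # 872 7 7 652
```

Under these assumptions, an accuracy of 0.7477 corresponds to roughly 652 of 872 examples classified correctly.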

Model description

More information needed

Intended uses & limitations

More information needed
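
Until this section is filled in, the following is a minimal sketch of how a text-classification checkpoint like this one is typically loaded with the Transformers `pipeline` API. It assumes the model is published on the Hugging Face Hub under the repository name `ilana/tiny-bert-sst2-distilled` and that network access is available on first use. The helper name `classify` is hypothetical, and the label names in the output depend on the model's configuration, which this card does not document.

```python
MODEL_ID = "ilana/tiny-bert-sst2-distilled"  # assumption: Hub repository name


def classify(texts):
    """Return one dict with 'label' and 'score' keys per input text."""
    # Imported lazily so that defining this helper does not require
    # transformers to be installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=MODEL_ID)
    return classifier(texts)


# Example call (downloads the model weights on first use):
# classify(["a charming and often affecting journey", "a tedious, overlong mess"])
```

Given the reported evaluation accuracy of 0.7477, predictions from this checkpoint should be treated as rough signals and not as reliable labels.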

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 6.708803333901887e-05
train_batch_size: 128
eval_batch_size: 128
seed: 33
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 10
mixed_precision_training: Native AMP
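
To illustrate what the linear scheduler does with these values, here is a small sketch of the learning rate as a function of the optimizer step. It assumes no warmup (none is listed above), a single device without gradient accumulation, and a total of 10 x 527 = 5270 steps, taking 527 as the number of steps per epoch from the evaluation results. The function `linear_lr` is a hypothetical helper written for this example, not code from the training run.

```python
BASE_LR = 6.708803333901887e-05
STEPS_PER_EPOCH = 527                        # step count reported at epoch 1.0
NUM_EPOCHS = 10
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS   # 5270
WARMUP_STEPS = 0                             # assumption: no warmup is listed


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Learning rate at a given step under linear warmup and linear decay."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for epoch in (0, 1, 5, 10):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch:2d}  step {step:4d}  lr {linear_lr(step):.3e}")
```

Under these assumptions, the learning rate at the reported checkpoint (epoch 1.0, step 527) would be 90% of the initial value, and it would reach zero at step 5270.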

Framework versions

Transformers 4.20.1
PyTorch 1.11.0
Datasets 2.3.2
Tokenizers 0.12.1
