DistilBERT base uncased finetuned SST-2

This model is a fine-tuned checkpoint of DistilBERT-base-uncased, trained on SST-2. It reaches an accuracy of 91.3% on the dev set (for comparison, the BERT bert-base-uncased version reaches an accuracy of 92.7%).

Fine-tuning hyper-parameters

  • learning_rate = 1e-5
  • batch_size = 32
  • warmup = 600
  • max_seq_length = 128
  • num_train_epochs = 3.0
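
To give a feel for what these settings imply, here is a minimal sketch that collects them in a config object and derives the step count and learning-rate curve. Several things here are assumptions not stated in this card: that `warmup` counts optimizer steps, that the schedule is linear warmup followed by linear decay to zero, and that the SST-2 training split has 67,349 examples. The names `FineTuneConfig`, `total_steps`, and `lr_at` are hypothetical helpers for illustration, not part of any library.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    # Values copied from the hyper-parameter list above.
    learning_rate: float = 1e-5
    batch_size: int = 32
    warmup: int = 600  # ASSUMPTION: interpreted as warmup *steps*
    max_seq_length: int = 128
    num_train_epochs: float = 3.0


def total_steps(cfg: FineTuneConfig, num_examples: int) -> int:
    """Number of optimizer steps, assuming one step per batch."""
    steps_per_epoch = math.ceil(num_examples / cfg.batch_size)
    return int(steps_per_epoch * cfg.num_train_epochs)


def lr_at(step: int, cfg: FineTuneConfig, total: int) -> float:
    """ASSUMPTION: linear warmup to the peak rate, then linear decay to 0."""
    if step < cfg.warmup:
        return cfg.learning_rate * step / cfg.warmup
    remaining = max(0, total - step)
    return cfg.learning_rate * remaining / max(1, total - cfg.warmup)


if __name__ == "__main__":
    cfg = FineTuneConfig()
    # ASSUMPTION: 67,349 training examples in SST-2 (GLUE version).
    total = total_steps(cfg, num_examples=67_349)
    for step in (0, 300, 600, total // 2, total):
        print(f"step {step:>5}: lr = {lr_at(step, cfg, total):.2e}")
```

Under these assumptions, the 600 warmup steps cover a little under a tenth of the run, so the model trains at or near the peak rate of 1e-5 only briefly before the decay phase begins.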