distilbert-base-uncased-finetuned-ag-news-v4

This model is a fine-tuned version of distilbert-base-uncased on an unspecified dataset (the model name suggests AG News). It achieves the following results on the evaluation set:

  • Loss: 1.7194
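
The card reports only an evaluation loss and does not state the task head, so the sketch below loads the checkpoint with the generic AutoModel class; substitute a task-specific class (e.g. AutoModelForMaskedLM or AutoModelForSequenceClassification) if you know which head this checkpoint ships. The example sentence is an arbitrary illustration.

```python
# Minimal loading sketch. AutoModel returns the bare DistilBERT encoder;
# pick a task-specific Auto class if the head is known.
from transformers import AutoModel, AutoTokenizer

model_id = "miggwp/distilbert-base-uncased-finetuned-ag-news-v4"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

inputs = tokenizer("Stocks rallied after the earnings report.", return_tensors="pt")
outputs = model(**inputs)  # outputs.last_hidden_state: (1, seq_len, 768)
```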

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 30
  • mixed_precision_training: Native AMP
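
A minimal TrainingArguments sketch mirroring the list above. The output_dir and the per-epoch evaluation cadence are assumptions (the cadence is inferred from the per-epoch results table below); the adam_* values are the Trainer defaults and match the betas/epsilon listed.

```python
# Hedged reproduction sketch, assuming the standard Trainer workflow.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="distilbert-base-uncased-finetuned-ag-news-v4",  # assumed name
    learning_rate=2e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    adam_beta1=0.9,        # Trainer default, matches the card
    adam_beta2=0.999,      # Trainer default, matches the card
    adam_epsilon=1e-8,     # Trainer default, matches the card
    lr_scheduler_type="linear",
    num_train_epochs=30,
    fp16=True,             # "mixed_precision_training: Native AMP" (fp16 assumed)
    eval_strategy="epoch", # assumed from the per-epoch results table
)
```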

Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 2.012         | 1.0   | 1384  | 1.9480          |
| 1.978         | 2.0   | 2768  | 1.9294          |
| 1.9524        | 3.0   | 4152  | 1.9103          |
| 1.9195        | 4.0   | 5536  | 1.8960          |
| 1.8815        | 5.0   | 6920  | 1.8661          |
| 1.8658        | 6.0   | 8304  | 1.8516          |
| 1.8454        | 7.0   | 9688  | 1.8258          |
| 1.8268        | 8.0   | 11072 | 1.8317          |
| 1.8066        | 9.0   | 12456 | 1.8303          |
| 1.7911        | 10.0  | 13840 | 1.8111          |
| 1.7781        | 11.0  | 15224 | 1.8049          |
| 1.7658        | 12.0  | 16608 | 1.7922          |
| 1.7453        | 13.0  | 17992 | 1.7785          |
| 1.74          | 14.0  | 19376 | 1.7689          |
| 1.721         | 15.0  | 20760 | 1.7608          |
| 1.7096        | 16.0  | 22144 | 1.7563          |
| 1.706         | 17.0  | 23528 | 1.7515          |
| 1.6961        | 18.0  | 24912 | 1.7531          |
| 1.684         | 19.0  | 26296 | 1.7046          |
| 1.6762        | 20.0  | 27680 | 1.7193          |
| 1.6698        | 21.0  | 29064 | 1.7498          |
| 1.6638        | 22.0  | 30448 | 1.7177          |
| 1.6606        | 23.0  | 31832 | 1.7075          |
| 1.6558        | 24.0  | 33216 | 1.7324          |
| 1.6444        | 25.0  | 34600 | 1.7121          |
| 1.6453        | 26.0  | 35984 | 1.7068          |
| 1.6337        | 27.0  | 37368 | 1.7010          |
| 1.6376        | 28.0  | 38752 | 1.7065          |
| 1.6322        | 29.0  | 40136 | 1.7107          |
| 1.6293        | 30.0  | 41520 | 1.6902          |
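
The loss curves are easier to read plotted. A minimal sketch, with values copied verbatim from the table above (matplotlib is assumed to be available):

```python
# Plot per-epoch training and validation loss from the results table.
import matplotlib.pyplot as plt

epochs = list(range(1, 31))
train_loss = [2.012, 1.978, 1.9524, 1.9195, 1.8815, 1.8658, 1.8454, 1.8268,
              1.8066, 1.7911, 1.7781, 1.7658, 1.7453, 1.74, 1.721, 1.7096,
              1.706, 1.6961, 1.684, 1.6762, 1.6698, 1.6638, 1.6606, 1.6558,
              1.6444, 1.6453, 1.6337, 1.6376, 1.6322, 1.6293]
val_loss = [1.9480, 1.9294, 1.9103, 1.8960, 1.8661, 1.8516, 1.8258, 1.8317,
            1.8303, 1.8111, 1.8049, 1.7922, 1.7785, 1.7689, 1.7608, 1.7563,
            1.7515, 1.7531, 1.7046, 1.7193, 1.7498, 1.7177, 1.7075, 1.7324,
            1.7121, 1.7068, 1.7010, 1.7065, 1.7107, 1.6902]

plt.plot(epochs, train_loss, label="training loss")
plt.plot(epochs, val_loss, label="validation loss")
plt.xlabel("epoch")
plt.ylabel("loss")
plt.legend()
plt.show()
```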

Framework versions

  • Transformers 4.42.4
  • PyTorch 2.3.1+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1