---
library_name: transformers
license: apache-2.0
base_model: distilbert-base-uncased-distilled-squad
tags:
  - generated_from_trainer
metrics:
  - f1
model-index:
  - name: distilbert-qasports
    results: []
---

# distilbert-qasports

This model is a fine-tuned version of [distilbert-base-uncased-distilled-squad](https://huggingface.co/distilbert/distilbert-base-uncased-distilled-squad) on an unspecified dataset (presumably QASports, given the model name). It achieves the following results on the evaluation set (a usage sketch follows the metrics list):

- Loss: 0.4015
- Exact: 76.8499
- F1: 81.2744
- Total: 15041
- HasAns Exact: 76.8499
- HasAns F1: 81.2744
- HasAns Total: 15041
- Best Exact: 76.8499
- Best Exact Thresh: 0.0
- Best F1: 81.2744
- Best F1 Thresh: 0.0
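
Since this is an extractive question-answering checkpoint, it can be loaded with the standard `question-answering` pipeline. Below is a minimal usage sketch; the repo id `laurafcamargos/distilbert-qasports` and the example inputs are assumptions, not taken from this card:

```python
# Minimal usage sketch (repo id and inputs are assumptions, not from the card).
from transformers import pipeline

qa = pipeline("question-answering", model="laurafcamargos/distilbert-qasports")

result = qa(
    question="Who scored the winning goal?",
    context="In the final minute, Alex Morgan scored the winning goal for the team.",
)
print(result["answer"], result["score"])  # extracted span and its confidence
```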

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a reconstruction sketch follows the list):

- learning_rate: 1e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 50
- mixed_precision_training: Native AMP
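
As referenced above, here is a minimal sketch of how these values map onto the `transformers` `TrainingArguments` API. This is an assumption-based reconstruction, not the original training script; dataset loading, preprocessing, and the `Trainer` call are omitted, and `output_dir` is illustrative:

```python
# Hypothetical reconstruction of the training configuration listed above.
from transformers import (
    AutoModelForQuestionAnswering,
    AutoTokenizer,
    TrainingArguments,
)

base_model = "distilbert-base-uncased-distilled-squad"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForQuestionAnswering.from_pretrained(base_model)

args = TrainingArguments(
    output_dir="distilbert-qasports",  # illustrative output path
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=2,     # effective train batch size: 16 * 2 = 32
    num_train_epochs=50,
    lr_scheduler_type="linear",
    optim="adamw_torch",               # AdamW, betas=(0.9, 0.999), eps=1e-08
    seed=42,
    fp16=True,                         # native AMP mixed precision
)
```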

### Training results

| Training Loss | Epoch  | Step | Validation Loss | Exact   | F1      | Total | HasAns Exact | HasAns F1 | HasAns Total | Best Exact | Best Exact Thresh | Best F1 | Best F1 Thresh |
|:-------------:|:------:|:----:|:---------------:|:-------:|:-------:|:-----:|:------------:|:---------:|:------------:|:----------:|:-----------------:|:-------:|:--------------:|
| 0.6883        | 0.1325 | 500  | 0.6008          | 74.7690 | 79.7676 | 15041 | 74.7690      | 79.7676   | 15041        | 74.7690    | 0.0               | 79.7676 | 0.0            |
| 0.5738        | 0.2649 | 1000 | 0.5474          | 75.3407 | 80.2816 | 15041 | 75.3407      | 80.2816   | 15041        | 75.3407    | 0.0               | 80.2816 | 0.0            |
| 0.5853        | 0.3974 | 1500 | 0.5259          | 75.3873 | 80.2217 | 15041 | 75.3873      | 80.2217   | 15041        | 75.3873    | 0.0               | 80.2217 | 0.0            |
| 0.588         | 0.5298 | 2000 | 0.4904          | 76.3978 | 81.0881 | 15041 | 76.3978      | 81.0881   | 15041        | 76.3978    | 0.0               | 81.0881 | 0.0            |
| 0.5214        | 0.6623 | 2500 | 0.4764          | 76.8366 | 81.4327 | 15041 | 76.8366      | 81.4327   | 15041        | 76.8366    | 0.0               | 81.4327 | 0.0            |
| 0.4813        | 0.7947 | 3000 | 0.4586          | 76.9763 | 81.6042 | 15041 | 76.9763      | 81.6042   | 15041        | 76.9763    | 0.0               | 81.6042 | 0.0            |
| 0.5032        | 0.9272 | 3500 | 0.4323          | 76.7835 | 81.4041 | 15041 | 76.7835      | 81.4041   | 15041        | 76.7835    | 0.0               | 81.4041 | 0.0            |
| 0.3549        | 1.0596 | 4000 | 0.4349          | 76.8632 | 81.2899 | 15041 | 76.8632      | 81.2899   | 15041        | 76.8632    | 0.0               | 81.2899 | 0.0            |
| 0.4053        | 1.1921 | 4500 | 0.4199          | 76.9630 | 81.3741 | 15041 | 76.9630      | 81.3741   | 15041        | 76.9630    | 0.0               | 81.3741 | 0.0            |
| 0.3549        | 1.3245 | 5000 | 0.4372          | 77.0427 | 81.6167 | 15041 | 77.0427      | 81.6167   | 15041        | 77.0427    | 0.0               | 81.6167 | 0.0            |
| 0.3707        | 1.4570 | 5500 | 0.4254          | 77.0560 | 81.5058 | 15041 | 77.0560      | 81.5058   | 15041        | 77.0560    | 0.0               | 81.5058 | 0.0            |
| 0.3728        | 1.5894 | 6000 | 0.4086          | 76.9031 | 81.4012 | 15041 | 76.9031      | 81.4012   | 15041        | 76.9031    | 0.0               | 81.4012 | 0.0            |
| 0.4117        | 1.7219 | 6500 | 0.4029          | 76.8233 | 81.4108 | 15041 | 76.8233      | 81.4108   | 15041        | 76.8233    | 0.0               | 81.4108 | 0.0            |
| 0.3785        | 1.8543 | 7000 | 0.3979          | 77.0427 | 81.4664 | 15041 | 77.0427      | 81.4664   | 15041        | 77.0427    | 0.0               | 81.4664 | 0.0            |
| 0.3564        | 1.9868 | 7500 | 0.4015          | 76.8499 | 81.2744 | 15041 | 76.8499      | 81.2744   | 15041        | 76.8499    | 0.0               | 81.2744 | 0.0            |
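
The metric columns (Exact, F1, the HasAns variants, and the Best/Thresh pairs) match the keys produced by the SQuAD v2 metric in the `evaluate` library. Assuming that metric was used (an assumption, not confirmed by this card), they can be reproduced as sketched below:

```python
# Sketch of computing the reported metrics with evaluate's squad_v2 metric.
# The single prediction/reference pair here is a toy example, not real data.
import evaluate

squad_v2 = evaluate.load("squad_v2")
predictions = [
    {"id": "0", "prediction_text": "Alex Morgan", "no_answer_probability": 0.0},
]
references = [
    {"id": "0", "answers": {"text": ["Alex Morgan"], "answer_start": [21]}},
]
results = squad_v2.compute(predictions=predictions, references=references)
# Keys include: exact, f1, total, HasAns_exact, HasAns_f1, HasAns_total,
# best_exact, best_exact_thresh, best_f1, best_f1_thresh
print(results)
```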

### Framework versions

- Transformers 4.48.2
- PyTorch 2.0.0+cu117
- Datasets 3.2.0
- Tokenizers 0.21.0