---
license: apache-2.0
base_model: distilbert-base-uncased
tags:
  - generated_from_trainer
model-index:
  - name: qa_model
    results: []
---

# qa_model

This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 0.0737
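
A minimal usage sketch with the `transformers` question-answering pipeline. The repo id `bytesizedllm/qa_model` is an assumption inferred from this card's name and author; substitute a local path if you have the checkpoint on disk.

```python
from transformers import pipeline

# Repo id assumed from this card; replace with a local path if needed.
qa = pipeline("question-answering", model="bytesizedllm/qa_model")

result = qa(
    question="What architecture is the model based on?",
    context="qa_model is a fine-tuned version of distilbert-base-uncased.",
)
print(result["answer"], result["score"])
```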

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent Trainer configuration follows the list):

- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 100
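
A minimal sketch of how these hyperparameters map onto `transformers.TrainingArguments` (as of Transformers 4.39). The output directory, the datasets, and `evaluation_strategy="epoch"` are assumptions; the Adam betas and epsilon listed above are the Trainer defaults, so they need no extra arguments.

```python
from transformers import (
    AutoModelForQuestionAnswering,
    Trainer,
    TrainingArguments,
)

model = AutoModelForQuestionAnswering.from_pretrained("distilbert-base-uncased")

args = TrainingArguments(
    output_dir="qa_model",            # assumed; not stated in this card
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    evaluation_strategy="epoch",      # assumed; matches the per-epoch results table below
)

# The training data is not described in this card; plug in your own tokenized splits.
# trainer = Trainer(model=model, args=args,
#                   train_dataset=train_ds, eval_dataset=eval_ds)
# trainer.train()
```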

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log        | 1.0   | 14   | 3.6840          |
| No log        | 2.0   | 28   | 1.5996          |
| No log        | 3.0   | 42   | 0.9961          |
| No log        | 4.0   | 56   | 0.7927          |
| No log        | 5.0   | 70   | 0.6597          |
| No log        | 6.0   | 84   | 0.5352          |
| No log        | 7.0   | 98   | 0.4412          |
| No log        | 8.0   | 112  | 0.3435          |
| No log        | 9.0   | 126  | 0.2955          |
| No log        | 10.0  | 140  | 0.2741          |
| No log        | 11.0  | 154  | 0.2211          |
| No log        | 12.0  | 168  | 0.1959          |
| No log        | 13.0  | 182  | 0.1783          |
| No log        | 14.0  | 196  | 0.1919          |
| No log        | 15.0  | 210  | 0.1640          |
| No log        | 16.0  | 224  | 0.1439          |
| No log        | 17.0  | 238  | 0.1479          |
| No log        | 18.0  | 252  | 0.1536          |
| No log        | 19.0  | 266  | 0.1365          |
| No log        | 20.0  | 280  | 0.1444          |
| No log        | 21.0  | 294  | 0.1268          |
| No log        | 22.0  | 308  | 0.1330          |
| No log        | 23.0  | 322  | 0.1192          |
| No log        | 24.0  | 336  | 0.1254          |
| No log        | 25.0  | 350  | 0.1168          |
| No log        | 26.0  | 364  | 0.1099          |
| No log        | 27.0  | 378  | 0.1077          |
| No log        | 28.0  | 392  | 0.1134          |
| No log        | 29.0  | 406  | 0.1039          |
| No log        | 30.0  | 420  | 0.1293          |
| No log        | 31.0  | 434  | 0.1211          |
| No log        | 32.0  | 448  | 0.0997          |
| No log        | 33.0  | 462  | 0.1052          |
| No log        | 34.0  | 476  | 0.1067          |
| No log        | 35.0  | 490  | 0.0974          |
| 0.5014        | 36.0  | 504  | 0.0987          |
| 0.5014        | 37.0  | 518  | 0.0955          |
| 0.5014        | 38.0  | 532  | 0.0938          |
| 0.5014        | 39.0  | 546  | 0.0894          |
| 0.5014        | 40.0  | 560  | 0.0873          |
| 0.5014        | 41.0  | 574  | 0.0943          |
| 0.5014        | 42.0  | 588  | 0.0917          |
| 0.5014        | 43.0  | 602  | 0.0869          |
| 0.5014        | 44.0  | 616  | 0.0896          |
| 0.5014        | 45.0  | 630  | 0.0857          |
| 0.5014        | 46.0  | 644  | 0.0889          |
| 0.5014        | 47.0  | 658  | 0.0854          |
| 0.5014        | 48.0  | 672  | 0.0896          |
| 0.5014        | 49.0  | 686  | 0.0848          |
| 0.5014        | 50.0  | 700  | 0.0882          |
| 0.5014        | 51.0  | 714  | 0.0840          |
| 0.5014        | 52.0  | 728  | 0.0826          |
| 0.5014        | 53.0  | 742  | 0.0843          |
| 0.5014        | 54.0  | 756  | 0.0823          |
| 0.5014        | 55.0  | 770  | 0.0805          |
| 0.5014        | 56.0  | 784  | 0.0799          |
| 0.5014        | 57.0  | 798  | 0.0776          |
| 0.5014        | 58.0  | 812  | 0.0775          |
| 0.5014        | 59.0  | 826  | 0.0776          |
| 0.5014        | 60.0  | 840  | 0.0761          |
| 0.5014        | 61.0  | 854  | 0.0756          |
| 0.5014        | 62.0  | 868  | 0.0764          |
| 0.5014        | 63.0  | 882  | 0.0768          |
| 0.5014        | 64.0  | 896  | 0.0764          |
| 0.5014        | 65.0  | 910  | 0.0770          |
| 0.5014        | 66.0  | 924  | 0.0766          |
| 0.5014        | 67.0  | 938  | 0.0776          |
| 0.5014        | 68.0  | 952  | 0.0752          |
| 0.5014        | 69.0  | 966  | 0.0762          |
| 0.5014        | 70.0  | 980  | 0.0764          |
| 0.5014        | 71.0  | 994  | 0.0747          |
| 0.0961        | 72.0  | 1008 | 0.0762          |
| 0.0961        | 73.0  | 1022 | 0.0767          |
| 0.0961        | 74.0  | 1036 | 0.0766          |
| 0.0961        | 75.0  | 1050 | 0.0767          |
| 0.0961        | 76.0  | 1064 | 0.0755          |
| 0.0961        | 77.0  | 1078 | 0.0755          |
| 0.0961        | 78.0  | 1092 | 0.0751          |
| 0.0961        | 79.0  | 1106 | 0.0747          |
| 0.0961        | 80.0  | 1120 | 0.0756          |
| 0.0961        | 81.0  | 1134 | 0.0752          |
| 0.0961        | 82.0  | 1148 | 0.0751          |
| 0.0961        | 83.0  | 1162 | 0.0749          |
| 0.0961        | 84.0  | 1176 | 0.0748          |
| 0.0961        | 85.0  | 1190 | 0.0744          |
| 0.0961        | 86.0  | 1204 | 0.0742          |
| 0.0961        | 87.0  | 1218 | 0.0747          |
| 0.0961        | 88.0  | 1232 | 0.0745          |
| 0.0961        | 89.0  | 1246 | 0.0739          |
| 0.0961        | 90.0  | 1260 | 0.0738          |
| 0.0961        | 91.0  | 1274 | 0.0739          |
| 0.0961        | 92.0  | 1288 | 0.0740          |
| 0.0961        | 93.0  | 1302 | 0.0738          |
| 0.0961        | 94.0  | 1316 | 0.0738          |
| 0.0961        | 95.0  | 1330 | 0.0737          |
| 0.0961        | 96.0  | 1344 | 0.0736          |
| 0.0961        | 97.0  | 1358 | 0.0737          |
| 0.0961        | 98.0  | 1372 | 0.0737          |
| 0.0961        | 99.0  | 1386 | 0.0737          |
| 0.0961        | 100.0 | 1400 | 0.0737          |

### Framework versions

- Transformers 4.39.3
- PyTorch 2.0.1+cu118
- Datasets 2.18.0
- Tokenizers 0.15.2