---
license: apache-2.0
base_model: distilbert-base-uncased
tags:
  - generated_from_trainer
model-index:
  - name: qa_model
    results: []
---

# qa_model

This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 0.0785
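
Since the card does not yet include usage notes, here is a minimal inference sketch. Both the repo id `bytesizedllm/qa_model` and the extractive question-answering task are inferred from this page rather than confirmed, so substitute your own path or task if they differ:

```python
# Minimal QA inference sketch. Assumption: the checkpoint is available as
# "bytesizedllm/qa_model" and carries an extractive question-answering head.
from transformers import pipeline

qa = pipeline("question-answering", model="bytesizedllm/qa_model")

result = qa(
    question="What base model was fine-tuned?",
    context="qa_model is a fine-tuned version of distilbert-base-uncased.",
)
print(result["answer"], result["score"])
```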

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 3e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 100
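
For reference, these settings correspond roughly to the `TrainingArguments` sketch below. The output directory and the per-epoch evaluation cadence are assumptions inferred from this card, not confirmed settings; the Adam betas and epsilon listed above are the optimizer defaults, so they need no explicit arguments.

```python
# Sketch of TrainingArguments mirroring the hyperparameters above
# (Transformers 4.39 API). Assumptions: output_dir is a placeholder, and
# evaluation_strategy="epoch" is inferred from the one-validation-loss-per-
# epoch results table below.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qa_model",
    learning_rate=3e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    evaluation_strategy="epoch",
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the defaults,
    # so they are not set explicitly here.
)
```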

### Training results

The training loss column reads "No log" until the Trainer's first logging step; the pattern is consistent with the default `logging_steps` of 500, so a running training loss first appears at the epoch ending at step 504.

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log | 1.0 | 14 | 0.1091 |
| No log | 2.0 | 28 | 0.1130 |
| No log | 3.0 | 42 | 0.1122 |
| No log | 4.0 | 56 | 0.1073 |
| No log | 5.0 | 70 | 0.0929 |
| No log | 6.0 | 84 | 0.0910 |
| No log | 7.0 | 98 | 0.0926 |
| No log | 8.0 | 112 | 0.1022 |
| No log | 9.0 | 126 | 0.0937 |
| No log | 10.0 | 140 | 0.0975 |
| No log | 11.0 | 154 | 0.0950 |
| No log | 12.0 | 168 | 0.1064 |
| No log | 13.0 | 182 | 0.1137 |
| No log | 14.0 | 196 | 0.0951 |
| No log | 15.0 | 210 | 0.1074 |
| No log | 16.0 | 224 | 0.1007 |
| No log | 17.0 | 238 | 0.0919 |
| No log | 18.0 | 252 | 0.0859 |
| No log | 19.0 | 266 | 0.1020 |
| No log | 20.0 | 280 | 0.0830 |
| No log | 21.0 | 294 | 0.0839 |
| No log | 22.0 | 308 | 0.0834 |
| No log | 23.0 | 322 | 0.0824 |
| No log | 24.0 | 336 | 0.0837 |
| No log | 25.0 | 350 | 0.0915 |
| No log | 26.0 | 364 | 0.0918 |
| No log | 27.0 | 378 | 0.0827 |
| No log | 28.0 | 392 | 0.0824 |
| No log | 29.0 | 406 | 0.0816 |
| No log | 30.0 | 420 | 0.0904 |
| No log | 31.0 | 434 | 0.0872 |
| No log | 32.0 | 448 | 0.0810 |
| No log | 33.0 | 462 | 0.0817 |
| No log | 34.0 | 476 | 0.0841 |
| No log | 35.0 | 490 | 0.0826 |
| 0.1061 | 36.0 | 504 | 0.0847 |
| 0.1061 | 37.0 | 518 | 0.0830 |
| 0.1061 | 38.0 | 532 | 0.0817 |
| 0.1061 | 39.0 | 546 | 0.0833 |
| 0.1061 | 40.0 | 560 | 0.0810 |
| 0.1061 | 41.0 | 574 | 0.0859 |
| 0.1061 | 42.0 | 588 | 0.0811 |
| 0.1061 | 43.0 | 602 | 0.0802 |
| 0.1061 | 44.0 | 616 | 0.0807 |
| 0.1061 | 45.0 | 630 | 0.0806 |
| 0.1061 | 46.0 | 644 | 0.0809 |
| 0.1061 | 47.0 | 658 | 0.0800 |
| 0.1061 | 48.0 | 672 | 0.0793 |
| 0.1061 | 49.0 | 686 | 0.0801 |
| 0.1061 | 50.0 | 700 | 0.0794 |
| 0.1061 | 51.0 | 714 | 0.0836 |
| 0.1061 | 52.0 | 728 | 0.0813 |
| 0.1061 | 53.0 | 742 | 0.0803 |
| 0.1061 | 54.0 | 756 | 0.0791 |
| 0.1061 | 55.0 | 770 | 0.0798 |
| 0.1061 | 56.0 | 784 | 0.0811 |
| 0.1061 | 57.0 | 798 | 0.0811 |
| 0.1061 | 58.0 | 812 | 0.0801 |
| 0.1061 | 59.0 | 826 | 0.0800 |
| 0.1061 | 60.0 | 840 | 0.0795 |
| 0.1061 | 61.0 | 854 | 0.0796 |
| 0.1061 | 62.0 | 868 | 0.0796 |
| 0.1061 | 63.0 | 882 | 0.0799 |
| 0.1061 | 64.0 | 896 | 0.0793 |
| 0.1061 | 65.0 | 910 | 0.0791 |
| 0.1061 | 66.0 | 924 | 0.0790 |
| 0.1061 | 67.0 | 938 | 0.0790 |
| 0.1061 | 68.0 | 952 | 0.0789 |
| 0.1061 | 69.0 | 966 | 0.0790 |
| 0.1061 | 70.0 | 980 | 0.0790 |
| 0.1061 | 71.0 | 994 | 0.0789 |
| 0.088 | 72.0 | 1008 | 0.0789 |
| 0.088 | 73.0 | 1022 | 0.0789 |
| 0.088 | 74.0 | 1036 | 0.0788 |
| 0.088 | 75.0 | 1050 | 0.0788 |
| 0.088 | 76.0 | 1064 | 0.0788 |
| 0.088 | 77.0 | 1078 | 0.0787 |
| 0.088 | 78.0 | 1092 | 0.0787 |
| 0.088 | 79.0 | 1106 | 0.0787 |
| 0.088 | 80.0 | 1120 | 0.0786 |
| 0.088 | 81.0 | 1134 | 0.0787 |
| 0.088 | 82.0 | 1148 | 0.0790 |
| 0.088 | 83.0 | 1162 | 0.0787 |
| 0.088 | 84.0 | 1176 | 0.0787 |
| 0.088 | 85.0 | 1190 | 0.0787 |
| 0.088 | 86.0 | 1204 | 0.0787 |
| 0.088 | 87.0 | 1218 | 0.0789 |
| 0.088 | 88.0 | 1232 | 0.0789 |
| 0.088 | 89.0 | 1246 | 0.0789 |
| 0.088 | 90.0 | 1260 | 0.0788 |
| 0.088 | 91.0 | 1274 | 0.0788 |
| 0.088 | 92.0 | 1288 | 0.0786 |
| 0.088 | 93.0 | 1302 | 0.0786 |
| 0.088 | 94.0 | 1316 | 0.0785 |
| 0.088 | 95.0 | 1330 | 0.0785 |
| 0.088 | 96.0 | 1344 | 0.0785 |
| 0.088 | 97.0 | 1358 | 0.0785 |
| 0.088 | 98.0 | 1372 | 0.0785 |
| 0.088 | 99.0 | 1386 | 0.0785 |
| 0.088 | 100.0 | 1400 | 0.0785 |

### Framework versions

- Transformers 4.39.3
- PyTorch 2.0.1+cu118
- Datasets 2.18.0
- Tokenizers 0.15.2
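
To check that a local environment matches these versions before loading the checkpoint, one option (a small sketch, assuming the packages are installed under these distribution names) is:

```python
# Print the installed versions of the packages listed above for comparison.
from importlib.metadata import version

for pkg in ("transformers", "torch", "datasets", "tokenizers"):
    print(f"{pkg}=={version(pkg)}")
```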