MiniProject_Prescription_Chatbot

This model is a fine-tuned version of distilbert/distilgpt2 on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 3.6475
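
A minimal generation sketch for trying the checkpoint (the repo id and prompt format below are placeholders, since the training data and intended prompt style are not documented in this card):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id; substitute the actual repository for this model.
model_id = "<user>/MiniProject_Prescription_Chatbot"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Assumed prompt format; adjust to match the fine-tuning data.
prompt = "Patient: I have a mild fever and a sore throat.\nDoctor:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=60,
    do_sample=True,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 tokenizers define no pad token
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```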

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
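
The Adam betas and epsilon above are the Trainer defaults, so they need no explicit arguments. A minimal TrainingArguments sketch matching the listed values (the dataset pipeline and the Trainer call itself are not documented in this card, so they are omitted):

```python
from transformers import TrainingArguments

# Mirrors the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="MiniProject_Prescription_Chatbot",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    evaluation_strategy="epoch",  # the results table shows one eval per epoch
)
```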

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log | 1.0 | 12 | 3.8781 |
| No log | 2.0 | 24 | 3.7741 |
| No log | 3.0 | 36 | 3.6911 |
| No log | 4.0 | 48 | 3.6233 |
| No log | 5.0 | 60 | 3.5601 |
| No log | 6.0 | 72 | 3.5104 |
| No log | 7.0 | 84 | 3.4804 |
| No log | 8.0 | 96 | 3.4457 |
| No log | 9.0 | 108 | 3.4133 |
| No log | 10.0 | 120 | 3.4018 |
| No log | 11.0 | 132 | 3.3834 |
| No log | 12.0 | 144 | 3.3487 |
| No log | 13.0 | 156 | 3.3486 |
| No log | 14.0 | 168 | 3.3230 |
| No log | 15.0 | 180 | 3.3198 |
| No log | 16.0 | 192 | 3.2984 |
| No log | 17.0 | 204 | 3.3169 |
| No log | 18.0 | 216 | 3.2786 |
| No log | 19.0 | 228 | 3.3034 |
| No log | 20.0 | 240 | 3.2695 |
| No log | 21.0 | 252 | 3.2597 |
| No log | 22.0 | 264 | 3.2644 |
| No log | 23.0 | 276 | 3.2610 |
| No log | 24.0 | 288 | 3.2862 |
| No log | 25.0 | 300 | 3.2750 |
| No log | 26.0 | 312 | 3.2505 |
| No log | 27.0 | 324 | 3.2844 |
| No log | 28.0 | 336 | 3.2729 |
| No log | 29.0 | 348 | 3.2894 |
| No log | 30.0 | 360 | 3.2875 |
| No log | 31.0 | 372 | 3.2735 |
| No log | 32.0 | 384 | 3.2998 |
| No log | 33.0 | 396 | 3.3070 |
| No log | 34.0 | 408 | 3.2893 |
| No log | 35.0 | 420 | 3.2935 |
| No log | 36.0 | 432 | 3.3057 |
| No log | 37.0 | 444 | 3.3028 |
| No log | 38.0 | 456 | 3.3239 |
| No log | 39.0 | 468 | 3.3158 |
| No log | 40.0 | 480 | 3.3249 |
| No log | 41.0 | 492 | 3.3595 |
| 2.5614 | 42.0 | 504 | 3.3610 |
| 2.5614 | 43.0 | 516 | 3.3546 |
| 2.5614 | 44.0 | 528 | 3.3815 |
| 2.5614 | 45.0 | 540 | 3.3620 |
| 2.5614 | 46.0 | 552 | 3.3823 |
| 2.5614 | 47.0 | 564 | 3.3800 |
| 2.5614 | 48.0 | 576 | 3.4000 |
| 2.5614 | 49.0 | 588 | 3.4191 |
| 2.5614 | 50.0 | 600 | 3.4093 |
| 2.5614 | 51.0 | 612 | 3.4162 |
| 2.5614 | 52.0 | 624 | 3.4197 |
| 2.5614 | 53.0 | 636 | 3.4370 |
| 2.5614 | 54.0 | 648 | 3.4442 |
| 2.5614 | 55.0 | 660 | 3.4767 |
| 2.5614 | 56.0 | 672 | 3.4642 |
| 2.5614 | 57.0 | 684 | 3.4780 |
| 2.5614 | 58.0 | 696 | 3.4808 |
| 2.5614 | 59.0 | 708 | 3.4712 |
| 2.5614 | 60.0 | 720 | 3.5279 |
| 2.5614 | 61.0 | 732 | 3.4993 |
| 2.5614 | 62.0 | 744 | 3.4865 |
| 2.5614 | 63.0 | 756 | 3.5209 |
| 2.5614 | 64.0 | 768 | 3.5196 |
| 2.5614 | 65.0 | 780 | 3.5359 |
| 2.5614 | 66.0 | 792 | 3.5089 |
| 2.5614 | 67.0 | 804 | 3.5489 |
| 2.5614 | 68.0 | 816 | 3.5528 |
| 2.5614 | 69.0 | 828 | 3.5587 |
| 2.5614 | 70.0 | 840 | 3.5606 |
| 2.5614 | 71.0 | 852 | 3.5719 |
| 2.5614 | 72.0 | 864 | 3.5776 |
| 2.5614 | 73.0 | 876 | 3.5700 |
| 2.5614 | 74.0 | 888 | 3.5825 |
| 2.5614 | 75.0 | 900 | 3.5779 |
| 2.5614 | 76.0 | 912 | 3.5934 |
| 2.5614 | 77.0 | 924 | 3.5878 |
| 2.5614 | 78.0 | 936 | 3.5850 |
| 2.5614 | 79.0 | 948 | 3.5936 |
| 2.5614 | 80.0 | 960 | 3.6018 |
| 2.5614 | 81.0 | 972 | 3.6096 |
| 2.5614 | 82.0 | 984 | 3.6155 |
| 2.5614 | 83.0 | 996 | 3.6183 |
| 1.4096 | 84.0 | 1008 | 3.6267 |
| 1.4096 | 85.0 | 1020 | 3.6292 |
| 1.4096 | 86.0 | 1032 | 3.6350 |
| 1.4096 | 87.0 | 1044 | 3.6347 |
| 1.4096 | 88.0 | 1056 | 3.6314 |
| 1.4096 | 89.0 | 1068 | 3.6300 |
| 1.4096 | 90.0 | 1080 | 3.6333 |
| 1.4096 | 91.0 | 1092 | 3.6452 |
| 1.4096 | 92.0 | 1104 | 3.6503 |
| 1.4096 | 93.0 | 1116 | 3.6501 |
| 1.4096 | 94.0 | 1128 | 3.6398 |
| 1.4096 | 95.0 | 1140 | 3.6374 |
| 1.4096 | 96.0 | 1152 | 3.6402 |
| 1.4096 | 97.0 | 1164 | 3.6443 |
| 1.4096 | 98.0 | 1176 | 3.6472 |
| 1.4096 | 99.0 | 1188 | 3.6479 |
| 1.4096 | 100.0 | 1200 | 3.6475 |
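
Since the reported loss is the usual per-token cross-entropy in nats, the final validation loss of 3.6475 corresponds to a perplexity of about 38.4. Validation loss bottoms out at 3.2505 around epoch 26 and climbs steadily afterward, so a checkpoint from that region would likely generalize better than the final one. A quick conversion:

```python
import math

# Cross-entropy loss (nats per token) -> perplexity.
print(math.exp(3.6475))  # final epoch:   ~38.4
print(math.exp(3.2505))  # epoch-26 low:  ~25.8
```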

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2
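
To approximate this environment, the Python packages can be pinned directly (for example, `pip install transformers==4.38.2 datasets==2.18.0 tokenizers==0.15.2`); note that the `+cu121` build of PyTorch 2.2.1 is distributed via the PyTorch CUDA 12.1 wheel index rather than as the default PyPI wheel.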