MiniProject_Prescription_Chatbot

This model is a fine-tuned version of distilbert/distilgpt2 on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 3.6475
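
A minimal generation sketch for trying the checkpoint (the repo id and prompt format below are placeholders, since the training data and intended prompt style are not documented in this card):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id; substitute the actual repository for this model.
model_id = "<user>/MiniProject_Prescription_Chatbot"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Assumed prompt format; adjust to match the fine-tuning data.
prompt = "Patient: I have a mild fever and a sore throat.\nDoctor:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=60,
    do_sample=True,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 tokenizers define no pad token
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```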

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
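
The Adam betas and epsilon above are the Trainer defaults, so they need no explicit arguments. A minimal TrainingArguments sketch matching the listed values (the dataset pipeline and the Trainer call itself are not documented in this card, so they are omitted):

```python
from transformers import TrainingArguments

# Mirrors the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="MiniProject_Prescription_Chatbot",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    evaluation_strategy="epoch",  # the results table shows one eval per epoch
)
```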

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log | 1.0 | 12 | 3.8781 |
| No log | 2.0 | 24 | 3.7741 |
| No log | 3.0 | 36 | 3.6911 |
| No log | 4.0 | 48 | 3.6233 |
| No log | 5.0 | 60 | 3.5601 |
| No log | 6.0 | 72 | 3.5104 |
| No log | 7.0 | 84 | 3.4804 |
| No log | 8.0 | 96 | 3.4457 |
| No log | 9.0 | 108 | 3.4133 |
| No log | 10.0 | 120 | 3.4018 |
| No log | 11.0 | 132 | 3.3834 |
| No log | 12.0 | 144 | 3.3487 |
| No log | 13.0 | 156 | 3.3486 |
| No log | 14.0 | 168 | 3.3230 |
| No log | 15.0 | 180 | 3.3198 |
| No log | 16.0 | 192 | 3.2984 |
| No log | 17.0 | 204 | 3.3169 |
| No log | 18.0 | 216 | 3.2786 |
| No log | 19.0 | 228 | 3.3034 |
| No log | 20.0 | 240 | 3.2695 |
| No log | 21.0 | 252 | 3.2597 |
| No log | 22.0 | 264 | 3.2644 |
| No log | 23.0 | 276 | 3.2610 |
| No log | 24.0 | 288 | 3.2862 |
| No log | 25.0 | 300 | 3.2750 |
| No log | 26.0 | 312 | 3.2505 |
| No log | 27.0 | 324 | 3.2844 |
| No log | 28.0 | 336 | 3.2729 |
| No log | 29.0 | 348 | 3.2894 |
| No log | 30.0 | 360 | 3.2875 |
| No log | 31.0 | 372 | 3.2735 |
| No log | 32.0 | 384 | 3.2998 |
| No log | 33.0 | 396 | 3.3070 |
| No log | 34.0 | 408 | 3.2893 |
| No log | 35.0 | 420 | 3.2935 |
| No log | 36.0 | 432 | 3.3057 |
| No log | 37.0 | 444 | 3.3028 |
| No log | 38.0 | 456 | 3.3239 |
| No log | 39.0 | 468 | 3.3158 |
| No log | 40.0 | 480 | 3.3249 |
| No log | 41.0 | 492 | 3.3595 |
| 2.5614 | 42.0 | 504 | 3.3610 |
| 2.5614 | 43.0 | 516 | 3.3546 |
| 2.5614 | 44.0 | 528 | 3.3815 |
| 2.5614 | 45.0 | 540 | 3.3620 |
| 2.5614 | 46.0 | 552 | 3.3823 |
| 2.5614 | 47.0 | 564 | 3.3800 |
| 2.5614 | 48.0 | 576 | 3.4000 |
| 2.5614 | 49.0 | 588 | 3.4191 |
| 2.5614 | 50.0 | 600 | 3.4093 |
| 2.5614 | 51.0 | 612 | 3.4162 |
| 2.5614 | 52.0 | 624 | 3.4197 |
| 2.5614 | 53.0 | 636 | 3.4370 |
| 2.5614 | 54.0 | 648 | 3.4442 |
| 2.5614 | 55.0 | 660 | 3.4767 |
| 2.5614 | 56.0 | 672 | 3.4642 |
| 2.5614 | 57.0 | 684 | 3.4780 |
| 2.5614 | 58.0 | 696 | 3.4808 |
| 2.5614 | 59.0 | 708 | 3.4712 |
| 2.5614 | 60.0 | 720 | 3.5279 |
| 2.5614 | 61.0 | 732 | 3.4993 |
| 2.5614 | 62.0 | 744 | 3.4865 |
| 2.5614 | 63.0 | 756 | 3.5209 |
| 2.5614 | 64.0 | 768 | 3.5196 |
| 2.5614 | 65.0 | 780 | 3.5359 |
| 2.5614 | 66.0 | 792 | 3.5089 |
| 2.5614 | 67.0 | 804 | 3.5489 |
| 2.5614 | 68.0 | 816 | 3.5528 |
| 2.5614 | 69.0 | 828 | 3.5587 |
| 2.5614 | 70.0 | 840 | 3.5606 |
| 2.5614 | 71.0 | 852 | 3.5719 |
| 2.5614 | 72.0 | 864 | 3.5776 |
| 2.5614 | 73.0 | 876 | 3.5700 |
| 2.5614 | 74.0 | 888 | 3.5825 |
| 2.5614 | 75.0 | 900 | 3.5779 |
| 2.5614 | 76.0 | 912 | 3.5934 |
| 2.5614 | 77.0 | 924 | 3.5878 |
| 2.5614 | 78.0 | 936 | 3.5850 |
| 2.5614 | 79.0 | 948 | 3.5936 |
| 2.5614 | 80.0 | 960 | 3.6018 |
| 2.5614 | 81.0 | 972 | 3.6096 |
| 2.5614 | 82.0 | 984 | 3.6155 |
| 2.5614 | 83.0 | 996 | 3.6183 |
| 1.4096 | 84.0 | 1008 | 3.6267 |
| 1.4096 | 85.0 | 1020 | 3.6292 |
| 1.4096 | 86.0 | 1032 | 3.6350 |
| 1.4096 | 87.0 | 1044 | 3.6347 |
| 1.4096 | 88.0 | 1056 | 3.6314 |
| 1.4096 | 89.0 | 1068 | 3.6300 |
| 1.4096 | 90.0 | 1080 | 3.6333 |
| 1.4096 | 91.0 | 1092 | 3.6452 |
| 1.4096 | 92.0 | 1104 | 3.6503 |
| 1.4096 | 93.0 | 1116 | 3.6501 |
| 1.4096 | 94.0 | 1128 | 3.6398 |
| 1.4096 | 95.0 | 1140 | 3.6374 |
| 1.4096 | 96.0 | 1152 | 3.6402 |
| 1.4096 | 97.0 | 1164 | 3.6443 |
| 1.4096 | 98.0 | 1176 | 3.6472 |
| 1.4096 | 99.0 | 1188 | 3.6479 |
| 1.4096 | 100.0 | 1200 | 3.6475 |
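
Since the reported loss is the usual per-token cross-entropy in nats, the final validation loss of 3.6475 corresponds to a perplexity of about 38.4. Validation loss bottoms out at 3.2505 around epoch 26 and climbs steadily afterward, so a checkpoint from that region would likely generalize better than the final one. A quick conversion:

```python
import math

# Cross-entropy loss (nats per token) -> perplexity.
print(math.exp(3.6475))  # final epoch:   ~38.4
print(math.exp(3.2505))  # epoch-26 low:  ~25.8
```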

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2
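
To approximate this environment, the Python packages can be pinned directly (for example, `pip install transformers==4.38.2 datasets==2.18.0 tokenizers==0.15.2`); note that the `+cu121` build of PyTorch 2.2.1 is distributed via the PyTorch CUDA 12.1 wheel index rather than as the default PyPI wheel.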