
topicalchat-multiturn

This model is a fine-tuned version of microsoft/DialoGPT-small. The training dataset is not recorded in the card's metadata (it is listed as None); the model name suggests multi-turn Topical-Chat dialogues, but the card does not confirm this. It achieves the following results on the evaluation set:

  • Loss: 2.5260 (assuming this is the mean token-level cross-entropy, it corresponds to a perplexity of exp(2.5260) ≈ 12.5)

Model description

More information needed

Intended uses & limitations

More information needed
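
The card itself gives no usage guidance. As a minimal sketch, the standard DialoGPT multi-turn generation pattern should apply to this checkpoint; the repository id below is a placeholder assumption, since the card does not state where the model is hosted.

```python
# Minimal multi-turn chat sketch following the standard DialoGPT usage pattern.
# NOTE: the repo id is a placeholder assumption; substitute the actual location
# of this fine-tuned checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "<user>/topicalchat-multiturn"  # placeholder, not confirmed by the card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

chat_history_ids = None
for turn in ["Did you watch the game last night?", "Who do you think will win?"]:
    # Append the EOS token so the model sees a complete user turn.
    new_ids = tokenizer.encode(turn + tokenizer.eos_token, return_tensors="pt")
    input_ids = (
        new_ids
        if chat_history_ids is None
        else torch.cat([chat_history_ids, new_ids], dim=-1)
    )
    chat_history_ids = model.generate(
        input_ids,
        max_length=200,
        pad_token_id=tokenizer.eos_token_id,
    )
    # Decode only the newly generated tokens (the bot's reply).
    reply = tokenizer.decode(
        chat_history_ids[:, input_ids.shape[-1]:][0], skip_special_tokens=True
    )
    print(f"Bot: {reply}")
```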

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
  • mixed_precision_training: Native AMP
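
As a rough reconstruction (the training script is not included in the card), the listed settings map onto transformers TrainingArguments along the following lines. The output_dir is an assumed placeholder, and the reported batch sizes are taken to be per-device values.

```python
# Hedged reconstruction of the listed hyperparameters as TrainingArguments
# (transformers 4.8.x API).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="topicalchat-multiturn",  # assumed placeholder
    learning_rate=5e-05,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    fp16=True,  # "Native AMP" mixed-precision training
    evaluation_strategy="epoch",  # validation loss is reported once per epoch
)
```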

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log        | 1.0   | 73   | 4.2992          |
| No log        | 2.0   | 146  | 3.4433          |
| No log        | 3.0   | 219  | 3.1606          |
| No log        | 4.0   | 292  | 3.0366          |
| No log        | 5.0   | 365  | 2.9679          |
| No log        | 6.0   | 438  | 2.9131          |
| 4.1401        | 7.0   | 511  | 2.8752          |
| 4.1401        | 8.0   | 584  | 2.8391          |
| 4.1401        | 9.0   | 657  | 2.8118          |
| 4.1401        | 10.0  | 730  | 2.7871          |
| 4.1401        | 11.0  | 803  | 2.7659          |
| 4.1401        | 12.0  | 876  | 2.7489          |
| 4.1401        | 13.0  | 949  | 2.7331          |
| 2.9768        | 14.0  | 1022 | 2.7196          |
| 2.9768        | 15.0  | 1095 | 2.7071          |
| 2.9768        | 16.0  | 1168 | 2.6940          |
| 2.9768        | 17.0  | 1241 | 2.6854          |
| 2.9768        | 18.0  | 1314 | 2.6728          |
| 2.9768        | 19.0  | 1387 | 2.6647          |
| 2.9768        | 20.0  | 1460 | 2.6562          |
| 2.7864        | 21.0  | 1533 | 2.6482          |
| 2.7864        | 22.0  | 1606 | 2.6439          |
| 2.7864        | 23.0  | 1679 | 2.6326          |
| 2.7864        | 24.0  | 1752 | 2.6107          |
| 2.7864        | 25.0  | 1825 | 2.6043          |
| 2.7864        | 26.0  | 1898 | 2.5970          |
| 2.7864        | 27.0  | 1971 | 2.5908          |
| 2.6568        | 28.0  | 2044 | 2.5862          |
| 2.6568        | 29.0  | 2117 | 2.5828          |
| 2.6568        | 30.0  | 2190 | 2.5765          |
| 2.6568        | 31.0  | 2263 | 2.5742          |
| 2.6568        | 32.0  | 2336 | 2.5682          |
| 2.6568        | 33.0  | 2409 | 2.5656          |
| 2.6568        | 34.0  | 2482 | 2.5614          |
| 2.5489        | 35.0  | 2555 | 2.5605          |
| 2.5489        | 36.0  | 2628 | 2.5552          |
| 2.5489        | 37.0  | 2701 | 2.5541          |
| 2.5489        | 38.0  | 2774 | 2.5494          |
| 2.5489        | 39.0  | 2847 | 2.5491          |
| 2.5489        | 40.0  | 2920 | 2.5455          |
| 2.5489        | 41.0  | 2993 | 2.5452          |
| 2.475         | 42.0  | 3066 | 2.5433          |
| 2.475         | 43.0  | 3139 | 2.5397          |
| 2.475         | 44.0  | 3212 | 2.5386          |
| 2.475         | 45.0  | 3285 | 2.5400          |
| 2.475         | 46.0  | 3358 | 2.5339          |
| 2.475         | 47.0  | 3431 | 2.5327          |
| 2.4144        | 48.0  | 3504 | 2.5327          |
| 2.4144        | 49.0  | 3577 | 2.5312          |
| 2.4144        | 50.0  | 3650 | 2.5338          |
| 2.4144        | 51.0  | 3723 | 2.5314          |
| 2.4144        | 52.0  | 3796 | 2.5309          |
| 2.4144        | 53.0  | 3869 | 2.5289          |
| 2.4144        | 54.0  | 3942 | 2.5290          |
| 2.3642        | 55.0  | 4015 | 2.5270          |
| 2.3642        | 56.0  | 4088 | 2.5270          |
| 2.3642        | 57.0  | 4161 | 2.5263          |
| 2.3642        | 58.0  | 4234 | 2.5267          |
| 2.3642        | 59.0  | 4307 | 2.5273          |
| 2.3642        | 60.0  | 4380 | 2.5258          |
| 2.3642        | 61.0  | 4453 | 2.5253          |
| 2.3216        | 62.0  | 4526 | 2.5244          |
| 2.3216        | 63.0  | 4599 | 2.5256          |
| 2.3216        | 64.0  | 4672 | 2.5227          |
| 2.3216        | 65.0  | 4745 | 2.5241          |
| 2.3216        | 66.0  | 4818 | 2.5244          |
| 2.3216        | 67.0  | 4891 | 2.5236          |
| 2.3216        | 68.0  | 4964 | 2.5251          |
| 2.2879        | 69.0  | 5037 | 2.5231          |
| 2.2879        | 70.0  | 5110 | 2.5254          |
| 2.2879        | 71.0  | 5183 | 2.5242          |
| 2.2879        | 72.0  | 5256 | 2.5254          |
| 2.2879        | 73.0  | 5329 | 2.5253          |
| 2.2879        | 74.0  | 5402 | 2.5228          |
| 2.2879        | 75.0  | 5475 | 2.5247          |
| 2.261         | 76.0  | 5548 | 2.5243          |
| 2.261         | 77.0  | 5621 | 2.5247          |
| 2.261         | 78.0  | 5694 | 2.5250          |
| 2.261         | 79.0  | 5767 | 2.5248          |
| 2.261         | 80.0  | 5840 | 2.5236          |
| 2.261         | 81.0  | 5913 | 2.5264          |
| 2.261         | 82.0  | 5986 | 2.5249          |
| 2.2396        | 83.0  | 6059 | 2.5256          |
| 2.2396        | 84.0  | 6132 | 2.5267          |
| 2.2396        | 85.0  | 6205 | 2.5258          |
| 2.2396        | 86.0  | 6278 | 2.5242          |
| 2.2396        | 87.0  | 6351 | 2.5233          |
| 2.2396        | 88.0  | 6424 | 2.5249          |
| 2.2396        | 89.0  | 6497 | 2.5253          |
| 2.2238        | 90.0  | 6570 | 2.5252          |
| 2.2238        | 91.0  | 6643 | 2.5255          |
| 2.2238        | 92.0  | 6716 | 2.5263          |
| 2.2238        | 93.0  | 6789 | 2.5261          |
| 2.2238        | 94.0  | 6862 | 2.5257          |
| 2.2238        | 95.0  | 6935 | 2.5253          |
| 2.213         | 96.0  | 7008 | 2.5267          |
| 2.213         | 97.0  | 7081 | 2.5258          |
| 2.213         | 98.0  | 7154 | 2.5258          |
| 2.213         | 99.0  | 7227 | 2.5259          |
| 2.213         | 100.0 | 7300 | 2.5260          |

Framework versions

  • Transformers 4.8.1
  • PyTorch 1.8.1+cu111
  • Datasets 1.8.0
  • Tokenizers 0.10.3