
dapt_plus_tapt_helpfulness_base_pretraining_model

This model is a fine-tuned version of BigTMiami/amazon_pretraining_5M_model_corrected on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.4446
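A minimal loading sketch is below. Two assumptions, since the card does not state them: the repo id is inferred from this card's title and the base model's owner, and the masked-language-modeling head is inferred from the continued-pretraining setup and the reported loss.

```python
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Assumed repo id (inferred from the card title; not confirmed by the card).
repo_id = "BigTMiami/dapt_plus_tapt_helpfulness_base_pretraining_model"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
# Assumed objective: masked LM, as in DAPT/TAPT-style continued pretraining.
model = AutoModelForMaskedLM.from_pretrained(repo_id)
```

In a DAPT + TAPT workflow, a checkpoint like this would typically be fine-tuned further on a downstream task rather than used directly for prediction.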

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training; a configuration sketch follows the list:

  • learning_rate: 0.0001
  • train_batch_size: 21
  • eval_batch_size: 21
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 42
  • optimizer: Adam with betas=(0.9,0.98) and epsilon=1e-06
  • lr_scheduler_type: linear
  • num_epochs: 100
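
As referenced above, here is a minimal sketch that re-creates these settings with `transformers.TrainingArguments` (v4.38.2). The `output_dir` is a placeholder, and the model/dataset wiring is omitted.

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="dapt_plus_tapt_helpfulness_base_pretraining_model",  # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=21,
    per_device_eval_batch_size=21,
    seed=42,
    gradient_accumulation_steps=2,  # total train batch size: 21 * 2 = 42
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-6,
    lr_scheduler_type="linear",
    num_train_epochs=100,
)
```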

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 1.6784        | 1.0   | 232  | 1.5304          |
| 1.6014        | 2.0   | 465  | 1.5186          |
| 1.5847        | 3.0   | 697  | 1.5100          |
| 1.5492        | 4.0   | 930  | 1.4959          |
| 1.5369        | 5.0   | 1162 | 1.5022          |
| 1.5086        | 6.0   | 1395 | 1.4919          |
| 1.4953        | 7.0   | 1627 | 1.4770          |
| 1.4729        | 8.0   | 1860 | 1.4840          |
| 1.4612        | 9.0   | 2092 | 1.4719          |
| 1.4502        | 10.0  | 2325 | 1.4595          |
| 1.436         | 11.0  | 2557 | 1.4670          |
| 1.4178        | 12.0  | 2790 | 1.4709          |
| 1.4055        | 13.0  | 3022 | 1.4514          |
| 1.3951        | 14.0  | 3255 | 1.4595          |
| 1.3848        | 15.0  | 3487 | 1.4585          |
| 1.3678        | 16.0  | 3720 | 1.4752          |
| 1.3659        | 17.0  | 3952 | 1.4636          |
| 1.3523        | 18.0  | 4185 | 1.4515          |
| 1.3443        | 19.0  | 4417 | 1.4609          |
| 1.3285        | 20.0  | 4650 | 1.4590          |
| 1.3283        | 21.0  | 4882 | 1.4595          |
| 1.3109        | 22.0  | 5115 | 1.4490          |
| 1.3111        | 23.0  | 5347 | 1.4457          |
| 1.2964        | 24.0  | 5580 | 1.4543          |
| 1.2945        | 25.0  | 5812 | 1.4500          |
| 1.2792        | 26.0  | 6045 | 1.4537          |
| 1.2741        | 27.0  | 6277 | 1.4428          |
| 1.2603        | 28.0  | 6510 | 1.4508          |
| 1.2609        | 29.0  | 6742 | 1.4473          |
| 1.246         | 30.0  | 6975 | 1.4458          |
| 1.2436        | 31.0  | 7207 | 1.4473          |
| 1.2324        | 32.0  | 7440 | 1.4384          |
| 1.2282        | 33.0  | 7672 | 1.4368          |
| 1.2164        | 34.0  | 7905 | 1.4466          |
| 1.2146        | 35.0  | 8137 | 1.4460          |
| 1.2022        | 36.0  | 8370 | 1.4520          |
| 1.1991        | 37.0  | 8602 | 1.4509          |
| 1.191         | 38.0  | 8835 | 1.4412          |
| 1.1909        | 39.0  | 9067 | 1.4449          |
| 1.1777        | 40.0  | 9300 | 1.4521          |
| 1.1762        | 41.0  | 9532 | 1.4582          |
| 1.166         | 42.0  | 9765 | 1.4403          |
| 1.1618        | 43.0  | 9997 | 1.4484          |
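
For context (an inference from the table, not stated elsewhere in the card): each epoch spans about 232–233 optimizer steps, which at the total train batch size of 42 implies roughly 232.5 × 42 ≈ 9,765 training sequences. Note also that although num_epochs was set to 100, only 43 epochs are reported, suggesting training was stopped early or the log was truncated.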

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2
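
To approximate this environment, the reported versions can be pinned, e.g. `pip install transformers==4.38.2 datasets==2.18.0 tokenizers==0.15.2`, together with the matching PyTorch build (`pip install torch==2.2.1 --index-url https://download.pytorch.org/whl/cu121` for the CUDA 12.1 wheels; adjust for your hardware).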