opt-history-gen2

This model is a fine-tuned version of facebook/opt-350m. The training dataset is not named in this card. It achieves the following results on the evaluation set:

  • Loss: 2.8025
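
For readers who prefer perplexity, the reported loss can be converted with a one-line calculation. This is a minimal sketch that assumes the loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language model evaluation but is not stated in the card. Under that assumption the value works out to roughly 16.5.

```python
import math

# Final evaluation loss reported in the card.
eval_loss = 2.8025

# Assumption: the loss is mean per-token cross-entropy measured in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"eval loss {eval_loss} -> perplexity of about {perplexity:.2f}")
```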

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 6
  • mixed_precision_training: Native AMP
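
The relationship between these values can be sketched in a few lines of plain Python. The example checks that 4 samples per step times 8 accumulation steps gives the listed total batch size of 32, and it re-implements the general shape of a linear schedule with warmup (ramp up over 500 steps, then decay linearly to zero). Two inputs are assumptions rather than values from the card: a single training device, and a total of about 3,544 optimizer steps, estimated from the step-to-epoch ratio in the training results (100 steps at epoch 0.1693, so roughly 590.7 steps per epoch over 6 epochs). The function names are illustrative and are not library APIs.

```python
# Values copied from the hyperparameter list.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 8
WARMUP_STEPS = 500

# Assumptions, not stated in the card.
NUM_DEVICES = 1     # single-device training
TOTAL_STEPS = 3544  # estimate: about 590.7 steps per epoch * 6 epochs


def effective_batch_size(per_device: int, accum_steps: int, num_devices: int = 1) -> int:
    """Number of samples that contribute to one optimizer update."""
    return per_device * accum_steps * num_devices


def linear_schedule_lr(step: int, base_lr: float, warmup_steps: int, total_steps: int) -> float:
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print("effective batch size:", effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES))
for step in (0, 250, 500, 2000, TOTAL_STEPS):
    lr = linear_schedule_lr(step, LEARNING_RATE, WARMUP_STEPS, TOTAL_STEPS)
    print(f"step {step:>4}: lr = {lr:.2e}")
```

Under these assumptions the learning rate peaks at step 500, which is close to the end of the first epoch, so most of the first epoch is spent in warmup.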

Training results

Training Loss   Epoch    Step   Validation Loss
3.1245          0.1693   100    3.0013
2.9473          0.3386   200    2.9180
2.9014          0.5078   300    2.8787
2.873           0.6771   400    2.8646
2.8631          0.8464   500    2.8631
2.8366          1.0157   600    2.8451
2.64            1.1849   700    2.8380
2.6295          1.3542   800    2.8202
2.6272          1.5235   900    2.7987
2.6327          1.6928   1000   2.7971
2.626           1.8620   1100   2.7739
2.5237          2.0313   1200   2.7829
2.3242          2.2006   1300   2.7812
2.319           2.3699   1400   2.7727
2.3314          2.5391   1500   2.7668
2.3579          2.7084   1600   2.7561
2.307           2.8777   1700   2.7586
2.2612          3.0470   1800   2.7795
2.056           3.2163   1900   2.7801
2.0802          3.3855   2000   2.7670
2.1104          3.5548   2100   2.7708
2.1115          3.7241   2200   2.7629
2.0828          3.8934   2300   2.7606
1.996           4.0626   2400   2.7849
1.8701          4.2319   2500   2.7938
1.92            4.4012   2600   2.7928
1.8844          4.5705   2700   2.7846
1.9058          4.7397   2800   2.7840
1.901           4.9090   2900   2.7821
1.8418          5.0783   3000   2.8017
1.757           5.2476   3100   2.8055
1.7503          5.4168   3200   2.8095
1.7606          5.5861   3300   2.8065
1.7316          5.7554   3400   2.8043
1.7632          5.9247   3500   2.8025
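
The rows above can be scanned programmatically to find the evaluation point with the lowest validation loss and to measure how far training and validation loss have drifted apart by the end of the run. The sketch below copies the logged values verbatim; the `Row` type and `parse_rows` helper are illustrative names, not part of any library. In these logs the validation loss bottoms out at step 1600 (2.7561) and then drifts upward while the training loss keeps falling, a pattern that usually indicates overfitting in the later epochs.

```python
from typing import NamedTuple


class Row(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float


# Columns: training loss, epoch, step, validation loss (copied from the table).
RAW = """
3.1245 0.1693 100 3.0013
2.9473 0.3386 200 2.9180
2.9014 0.5078 300 2.8787
2.873 0.6771 400 2.8646
2.8631 0.8464 500 2.8631
2.8366 1.0157 600 2.8451
2.64 1.1849 700 2.8380
2.6295 1.3542 800 2.8202
2.6272 1.5235 900 2.7987
2.6327 1.6928 1000 2.7971
2.626 1.8620 1100 2.7739
2.5237 2.0313 1200 2.7829
2.3242 2.2006 1300 2.7812
2.319 2.3699 1400 2.7727
2.3314 2.5391 1500 2.7668
2.3579 2.7084 1600 2.7561
2.307 2.8777 1700 2.7586
2.2612 3.0470 1800 2.7795
2.056 3.2163 1900 2.7801
2.0802 3.3855 2000 2.7670
2.1104 3.5548 2100 2.7708
2.1115 3.7241 2200 2.7629
2.0828 3.8934 2300 2.7606
1.996 4.0626 2400 2.7849
1.8701 4.2319 2500 2.7938
1.92 4.4012 2600 2.7928
1.8844 4.5705 2700 2.7846
1.9058 4.7397 2800 2.7840
1.901 4.9090 2900 2.7821
1.8418 5.0783 3000 2.8017
1.757 5.2476 3100 2.8055
1.7503 5.4168 3200 2.8095
1.7606 5.5861 3300 2.8065
1.7316 5.7554 3400 2.8043
1.7632 5.9247 3500 2.8025
"""


def parse_rows(raw: str) -> list[Row]:
    """Turn whitespace-separated log lines into typed rows."""
    rows = []
    for line in raw.strip().splitlines():
        train_loss, epoch, step, val_loss = line.split()
        rows.append(Row(float(train_loss), float(epoch), int(step), float(val_loss)))
    return rows


rows = parse_rows(RAW)
best = min(rows, key=lambda r: r.val_loss)       # lowest validation loss
final = rows[-1]                                 # last logged evaluation
final_gap = final.val_loss - final.train_loss    # train/validation divergence

print(f"best validation loss {best.val_loss} at step {best.step} (epoch {best.epoch})")
print(f"final validation loss {final.val_loss}, gap to training loss {final_gap:.4f}")
```

If intermediate checkpoints were saved (the card does not say), the one nearest step 1600 would be the natural candidate to compare against the final weights.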

Framework versions

  • Transformers 4.42.4
  • Pytorch 2.3.1+cu121
  • Datasets 2.21.0
  • Tokenizers 0.19.1

Model size

  • Parameters: 331M
  • Tensor type: F32
  • Format: Safetensors

Model tree for ambrosfitz/opt-history-v2

  • Base model: facebook/opt-350m