pythia-1.4B-finetuned-oa-instructions

This model is a fine-tuned version of EleutherAI's Pythia-1.4B on the OA instructions dataset. It achieves the following results on the evaluation set:

Loss: 0.1224

Model description

More information needed

Intended uses & limitations

More information needed
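
The card does not yet document a usage recipe, but the checkpoint can presumably be loaded as a standard causal language model. A minimal sketch, assuming the published id zirui3/gpt_1.4B_oa_instruct and a plain-text prompt (the card specifies neither a chat template nor special tokens, so the prompt format here is a guess):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "zirui3/gpt_1.4B_oa_instruct"  # id taken from this card's page
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Plain-text prompt; the expected instruction format is an assumption.
prompt = "Summarize what instruction tuning does in one sentence."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```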

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch mirroring these values follows the list):

  • seed: 42

  • learning_rate: 5e-06

  • train_batch_size: 32

  • eval_batch_size: 8

  • optimizer: Adam with betas=(0.9, 0.999), eps=1e-08 and weight_decay=0.0

  • lr_scheduler_type: linear

  • training_steps: 5000

  • fp16: true

  • warmup_steps: 5

  • num_examples: 53455
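
A minimal sketch of how these settings might map onto transformers TrainingArguments (argument names follow the transformers 4.24 API). Treating train_batch_size as per-device is an assumption, and the output directory is hypothetical:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./pythia-1.4B-oa",   # hypothetical output path
    seed=42,
    learning_rate=5e-6,
    per_device_train_batch_size=32,  # assumes train_batch_size is per device
    per_device_eval_batch_size=8,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    weight_decay=0.0,
    lr_scheduler_type="linear",
    max_steps=5000,                  # training_steps
    warmup_steps=5,
    fp16=True,
)
```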

Training results

```json
{
    "epoch": 1.0,
    "train_loss": 0.8031303182039198,
    "train_runtime": 6338.6403,
    "train_samples": 53455,
    "train_samples_per_second": 8.433,
    "train_steps_per_second": 0.264
}
```
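
As a sanity check, the reported throughput figures are internally consistent with the runtime and sample count; the snippet below reproduces them, assuming steps are counted at the train batch size of 32:

```python
# Reproduce the reported throughput from the training results above.
train_samples = 53455
train_runtime = 6338.6403  # seconds

samples_per_second = train_samples / train_runtime  # ≈ 8.433
steps_per_second = samples_per_second / 32          # ≈ 0.264 at batch size 32
print(f"{samples_per_second:.3f} samples/s, {steps_per_second:.3f} steps/s")
```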

Framework versions

  • transformers 4.24.0
  • torch 1.10.0+cu111
  • datasets 2.10.0
  • tokenizers 0.12.1