zirui3/gpt_1.4B_oa_instruct

pythia-1.4B-finetuned-oa-instructions

This model is a fine-tuned version of Pythia 1.4B on the oa instructions dataset. It achieves the following results on the evaluation set:

Loss: 0.1224
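
A minimal sketch of loading this checkpoint for text generation with the transformers library. It assumes the repository id in this card's title resolves on the Hugging Face Hub and that the checkpoint loads through the generic `AutoTokenizer` and `AutoModelForCausalLM` classes. The helper name `generate` and the generation settings are illustrative; the card does not document a prompt format, so any prompt you pass is an assumption as well.

```python
MODEL_ID = "zirui3/gpt_1.4B_oa_instruct"  # repository id taken from this card


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Load the checkpoint and return the decoded completion for `prompt`.

    Hypothetical helper for illustration; not part of the original card.
    """
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # Tokenize to PyTorch tensors and generate with default decoding settings.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Explain what a learning rate is.")` downloads the weights on first use; the prompt here is only an example.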

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

seed: 42
learning_rate: 5e-06
train_batch_size: 32
eval_batch_size: 8
optimizer: Adam with lr=5e-06, betas=(0.9, 0.999), eps=1e-08, weight_decay=0.0
lr_scheduler_type: linear
training_steps: 5000
fp16: true
warmup_steps: 5
num_examples: 53k
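
The learning-rate settings listed above (linear scheduler, 5 warmup steps, 5000 training steps, peak rate 5e-06) can be sketched in plain Python. This assumes the scheduler follows the common "linear warmup, then linear decay to zero" shape; the function name `linear_lr` is hypothetical and the card does not show the actual training script.

```python
BASE_LR = 5e-06        # learning_rate from the card
WARMUP_STEPS = 5       # warmup_steps from the card
TRAINING_STEPS = 5000  # training_steps from the card


def linear_lr(step: int) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero.

    Assumed schedule shape; a sketch, not the author's training code.
    """
    if step < WARMUP_STEPS:
        # Ramp from 0 up to the base rate over the warmup steps.
        factor = step / max(1, WARMUP_STEPS)
    else:
        # Decay linearly from the base rate to 0 at the final step.
        remaining = TRAINING_STEPS - step
        factor = max(0.0, remaining / max(1, TRAINING_STEPS - WARMUP_STEPS))
    return BASE_LR * factor
```

Under these assumptions the rate starts at zero, reaches 5e-06 at step 5, and returns to zero at step 5000.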

Training results

{
    "epoch": 1.0,
    "train_loss": 0.8031303182039198,
    "train_runtime": 6338.6403,
    "train_samples": 53455,
    "train_samples_per_second": 8.433,
    "train_steps_per_second": 0.264
}
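
The throughput figures in the results above can be cross-checked with a little arithmetic. This sketch uses only the numbers reported in the card; the variable names are illustrative, and the step estimate is approximate because the reported steps-per-second value is rounded.

```python
# Values copied from the training results reported in this card.
train_samples = 53455
train_runtime = 6338.6403        # seconds
train_steps_per_second = 0.264   # rounded value as reported

# Samples per second should match the reported train_samples_per_second.
samples_per_second = train_samples / train_runtime

# Approximate number of optimizer steps implied by the reported step rate.
approx_steps = train_steps_per_second * train_runtime

print(f"samples/s: {samples_per_second:.3f}")
print(f"approx. steps: {approx_steps:.0f}")
```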

Framework versions

transformers 4.24.0
torch 1.10.0+cu111
datasets 2.10.0
tokenizers 0.12.1
