---
license: apache-2.0
base_model: RylanSchaeffer/EleutherAI_pythia-70m_tatsu-lab_alpaca_farm_sftseed1
tags:
  - trl
  - reward-trainer
  - generated_from_trainer
metrics:
  - accuracy
model-index:
  - name: >-
      pythia-70m_tatsu-lab_alpaca_farm_sftsd1_policy_pythia-6.9b_gold_pythia-6.9b_rmsd3
    results: []
---


pythia-70m_tatsu-lab_alpaca_farm_sftsd1_policy_pythia-6.9b_gold_pythia-6.9b_rmsd3

This model is a fine-tuned version of RylanSchaeffer/EleutherAI_pythia-70m_tatsu-lab_alpaca_farm_sftseed1 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7102
  • Accuracy: 0.6002
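
Since the card's tags mark this as a TRL reward model, the natural way to use it is as a scalar scorer over prompt/response pairs. Below is a minimal scoring sketch, assuming the checkpoint loads as a single-logit sequence classifier (the usual layout for RewardTrainer checkpoints); the repo id is inferred from the model name above and is an assumption, not confirmed by the card.

```python
# Minimal scoring sketch, assuming this checkpoint is a single-logit
# sequence classifier as produced by TRL's RewardTrainer. The repo id is
# inferred from the model name above and may differ.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo_id = "RylanSchaeffer/pythia-70m_tatsu-lab_alpaca_farm_sftsd1_policy_pythia-6.9b_gold_pythia-6.9b_rmsd3"  # assumed
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
model.eval()

def reward(prompt: str, response: str) -> float:
    """Return the scalar reward the model assigns to a prompt/response pair."""
    inputs = tokenizer(prompt + response, return_tensors="pt", truncation=True)
    with torch.no_grad():
        return model(**inputs).logits[0, 0].item()

# Higher scores should indicate responses the reward model prefers.
print(reward("What is the capital of France?\n", "Paris."))
print(reward("What is the capital of France?\n", "London."))
```

If the checkpoint's head layout differs from this assumption, loading will fail loudly rather than silently misbehave, so treat the snippet as a starting point.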

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch mapping them onto a TRL trainer follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 3
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.025
  • num_epochs: 5
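
For reference, here is how those values map onto a TRL RewardConfig / RewardTrainer setup. This is a sketch under assumptions: the trl and reward-trainer tags indicate TRL's reward trainer, but the actual training script is not shown on this card, and the tiny preference dataset below is a placeholder.

```python
# Sketch of the training setup implied by the hyperparameters above,
# assuming TRL's RewardTrainer (per the card's trl/reward-trainer tags).
# The two-example preference dataset is a placeholder for illustration.
from datasets import Dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from trl import RewardConfig, RewardTrainer

base = "RylanSchaeffer/EleutherAI_pythia-70m_tatsu-lab_alpaca_farm_sftseed1"
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # Pythia tokenizers ship without a pad token
model = AutoModelForSequenceClassification.from_pretrained(base, num_labels=1)
model.config.pad_token_id = tokenizer.pad_token_id

def tokenize(row):
    # RewardTrainer expects pre-tokenized chosen/rejected columns.
    chosen = tokenizer(row["chosen"], truncation=True)
    rejected = tokenizer(row["rejected"], truncation=True)
    return {
        "input_ids_chosen": chosen["input_ids"],
        "attention_mask_chosen": chosen["attention_mask"],
        "input_ids_rejected": rejected["input_ids"],
        "attention_mask_rejected": rejected["attention_mask"],
    }

# Placeholder preference pairs; the real training data is not documented here.
pairs = Dataset.from_dict({
    "chosen": ["Q: What is 2 + 2?\nA: 4", "Q: Name a primary color.\nA: Red"],
    "rejected": ["Q: What is 2 + 2?\nA: 5", "Q: Name a primary color.\nA: Grass"],
}).map(tokenize)

args = RewardConfig(
    output_dir="reward-model",
    learning_rate=1e-5,
    per_device_train_batch_size=16,  # train_batch_size
    per_device_eval_batch_size=8,    # eval_batch_size
    gradient_accumulation_steps=2,   # 16 * 2 = total_train_batch_size of 32
    num_train_epochs=5,
    lr_scheduler_type="linear",
    warmup_ratio=0.025,
    seed=3,
    max_length=512,                  # assumed; not listed on the card
)
# The listed Adam betas=(0.9, 0.999) and epsilon=1e-08 are the Trainer
# defaults, so no explicit optimizer arguments are needed.

trainer = RewardTrainer(
    model=model,
    args=args,
    tokenizer=tokenizer,
    train_dataset=pairs,
    eval_dataset=pairs,  # placeholder; a held-out split was used in practice
)
trainer.train()
```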

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:------:|:----:|:---------------:|:--------:|
| No log | 0 | 0 | 1.0058 | 0.5213 |
| 0.9831 | 0.0648 | 100 | 0.9931 | 0.5198 |
| 0.9073 | 0.1295 | 200 | 0.9274 | 0.5283 |
| 0.834 | 0.1943 | 300 | 0.8757 | 0.5513 |
| 0.8691 | 0.2591 | 400 | 0.8491 | 0.5517 |
| 0.8562 | 0.3238 | 500 | 0.8132 | 0.5629 |
| 0.8401 | 0.3886 | 600 | 0.8004 | 0.5625 |
| 0.7914 | 0.4534 | 700 | 0.7770 | 0.5732 |
| 0.8092 | 0.5181 | 800 | 0.7720 | 0.5782 |
| 0.7245 | 0.5829 | 900 | 0.7720 | 0.5748 |
| 0.7888 | 0.6477 | 1000 | 0.7561 | 0.5917 |
| 0.7504 | 0.7124 | 1100 | 0.7483 | 0.5848 |
| 0.704 | 0.7772 | 1200 | 0.7477 | 0.5909 |
| 0.7506 | 0.8420 | 1300 | 0.7451 | 0.5928 |
| 0.7419 | 0.9067 | 1400 | 0.7419 | 0.5963 |
| 0.7339 | 0.9715 | 1500 | 0.7406 | 0.5940 |
| 0.7136 | 1.0363 | 1600 | 0.7366 | 0.5971 |
| 0.7079 | 1.1010 | 1700 | 0.7346 | 0.5967 |
| 0.7363 | 1.1658 | 1800 | 0.7325 | 0.5955 |
| 0.7446 | 1.2306 | 1900 | 0.7323 | 0.6036 |
| 0.7255 | 1.2953 | 2000 | 0.7222 | 0.6059 |
| 0.6924 | 1.3601 | 2100 | 0.7272 | 0.5944 |
| 0.7418 | 1.4249 | 2200 | 0.7243 | 0.5944 |
| 0.7071 | 1.4896 | 2300 | 0.7230 | 0.5948 |
| 0.6902 | 1.5544 | 2400 | 0.7167 | 0.6036 |
| 0.6993 | 1.6192 | 2500 | 0.7206 | 0.5994 |
| 0.7462 | 1.6839 | 2600 | 0.7179 | 0.6025 |
| 0.7 | 1.7487 | 2700 | 0.7154 | 0.5971 |
| 0.7169 | 1.8135 | 2800 | 0.7141 | 0.6005 |
| 0.72 | 1.8782 | 2900 | 0.7178 | 0.6059 |
| 0.6965 | 1.9430 | 3000 | 0.7128 | 0.6017 |
| 0.6311 | 2.0078 | 3100 | 0.7096 | 0.6071 |
| 0.7176 | 2.0725 | 3200 | 0.7143 | 0.6009 |
| 0.7022 | 2.1373 | 3300 | 0.7069 | 0.6009 |
| 0.7022 | 2.2021 | 3400 | 0.7124 | 0.5975 |
| 0.7996 | 2.2668 | 3500 | 0.7203 | 0.6009 |
| 0.7277 | 2.3316 | 3600 | 0.7174 | 0.5975 |
| 0.7638 | 2.3964 | 3700 | 0.7105 | 0.6017 |
| 0.732 | 2.4611 | 3800 | 0.7128 | 0.6059 |
| 0.7315 | 2.5259 | 3900 | 0.7116 | 0.5959 |
| 0.6882 | 2.5907 | 4000 | 0.7131 | 0.6017 |
| 0.7406 | 2.6554 | 4100 | 0.7051 | 0.6036 |
| 0.7353 | 2.7202 | 4200 | 0.7105 | 0.6005 |
| 0.7109 | 2.7850 | 4300 | 0.7036 | 0.5994 |
| 0.713 | 2.8497 | 4400 | 0.7100 | 0.6032 |
| 0.7249 | 2.9145 | 4500 | 0.7145 | 0.5990 |
| 0.7191 | 2.9793 | 4600 | 0.7103 | 0.6021 |
| 0.72 | 3.0440 | 4700 | 0.7112 | 0.6078 |
| 0.6937 | 3.1088 | 4800 | 0.7139 | 0.6036 |
| 0.6832 | 3.1736 | 4900 | 0.7140 | 0.6044 |
| 0.77 | 3.2383 | 5000 | 0.7155 | 0.5990 |
| 0.674 | 3.3031 | 5100 | 0.7114 | 0.6086 |
| 0.7001 | 3.3679 | 5200 | 0.7099 | 0.6032 |
| 0.6587 | 3.4326 | 5300 | 0.7113 | 0.6017 |
| 0.7006 | 3.4974 | 5400 | 0.7077 | 0.6048 |
| 0.7317 | 3.5622 | 5500 | 0.7086 | 0.6132 |
| 0.6984 | 3.6269 | 5600 | 0.7130 | 0.6009 |
| 0.6938 | 3.6917 | 5700 | 0.7105 | 0.6040 |
| 0.6601 | 3.7565 | 5800 | 0.7131 | 0.6013 |
| 0.7239 | 3.8212 | 5900 | 0.7116 | 0.5978 |
| 0.7199 | 3.8860 | 6000 | 0.7091 | 0.6036 |
| 0.7523 | 3.9508 | 6100 | 0.7097 | 0.6009 |
| 0.7043 | 4.0155 | 6200 | 0.7110 | 0.6071 |
| 0.6879 | 4.0803 | 6300 | 0.7074 | 0.6017 |
| 0.7138 | 4.1451 | 6400 | 0.7072 | 0.6090 |
| 0.6976 | 4.2098 | 6500 | 0.7057 | 0.6067 |
| 0.7434 | 4.2746 | 6600 | 0.7069 | 0.6028 |
| 0.7492 | 4.3394 | 6700 | 0.7089 | 0.6052 |
| 0.6268 | 4.4041 | 6800 | 0.7105 | 0.6002 |
| 0.7092 | 4.4689 | 6900 | 0.7080 | 0.6044 |
| 0.6915 | 4.5337 | 7000 | 0.7099 | 0.6013 |
| 0.652 | 4.5984 | 7100 | 0.7120 | 0.6067 |
| 0.7358 | 4.6632 | 7200 | 0.7108 | 0.5994 |
| 0.7935 | 4.7280 | 7300 | 0.7082 | 0.6013 |
| 0.6902 | 4.7927 | 7400 | 0.7069 | 0.6040 |
| 0.7113 | 4.8575 | 7500 | 0.7131 | 0.6009 |
| 0.6529 | 4.9223 | 7600 | 0.7098 | 0.6036 |
| 0.7117 | 4.9870 | 7700 | 0.7095 | 0.5986 |
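
For context on the two evaluation columns: in a TRL reward-trainer setup, the loss is the pairwise Bradley-Terry objective and the accuracy is the fraction of preference pairs the model ranks correctly. The card does not state the objective explicitly, so the sketch below assumes that standard formulation.

```python
# Sketch of the pairwise reward-modeling objective assumed to underlie the
# Validation Loss and Accuracy columns (standard for TRL's RewardTrainer):
# loss = -log(sigmoid(r_chosen - r_rejected)); accuracy = P(r_chosen > r_rejected).
import torch
import torch.nn.functional as F

def reward_metrics(r_chosen: torch.Tensor, r_rejected: torch.Tensor):
    loss = -F.logsigmoid(r_chosen - r_rejected).mean()
    accuracy = (r_chosen > r_rejected).float().mean()
    return loss.item(), accuracy.item()

# An untrained scorer ranks pairs at chance (accuracy near 0.5, loss at least
# log 2, about 0.693), which matches the near-chance first row of the table.
print(reward_metrics(torch.randn(10_000), torch.randn(10_000)))
```

Under that reading, the final checkpoint ranks roughly 60% of held-out preference pairs correctly, consistent with the 0.6002 evaluation accuracy reported above.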

Framework versions

  • Transformers 4.43.2
  • PyTorch 2.3.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1