license: apache-2.0
base_model: distilbert/distilbert-base-uncased
tags:
  - trl
  - reward-trainer
  - generated_from_trainer
datasets:
  - hdfs_rlhf_log_summary_dataset
metrics:
  - accuracy
model-index:
  - name: log_sage_reward_model
    results:
      - task:
          name: Text Classification
          type: text-classification
        dataset:
          name: hdfs_rlhf_log_summary_dataset
          type: hdfs_rlhf_log_summary_dataset
          config: default
          split: None
          args: default
        metrics:
          - name: Accuracy
            type: accuracy
            value: 1

log_sage_reward_model

This model is a fine-tuned version of distilbert/distilbert-base-uncased, trained as a reward model (see the trl and reward-trainer tags) on hdfs_rlhf_log_summary_dataset. It achieves the following results on the evaluation set after the final epoch:

  • Loss: 0.1669
  • Accuracy: 1.0
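For readers unfamiliar with how a reward model's loss and accuracy are usually defined, here is a minimal sketch. It assumes the standard pairwise objective used by TRL's reward trainer, where each example is a (chosen, rejected) pair and the loss is the negative log-sigmoid of the score difference. The function names and the example scores are hypothetical and are not taken from this model's training code.

```python
import math


def pairwise_reward_loss(chosen_reward: float, rejected_reward: float) -> float:
    """Return -log(sigmoid(chosen - rejected)) for one preference pair.

    Assumption: this mirrors the usual pairwise reward-model objective;
    it is not copied from this model's training script.
    """
    margin = chosen_reward - rejected_reward
    # Numerically stable softplus(-margin) = log(1 + exp(-margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def pairwise_accuracy(pairs: list[tuple[float, float]]) -> float:
    """Fraction of pairs where the chosen response outscores the rejected one."""
    return sum(chosen > rejected for chosen, rejected in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical scores: equal rewards give a loss of ln(2).
    print(pairwise_reward_loss(0.0, 0.0))
    # Hypothetical pairs: one ranked correctly, one not.
    print(pairwise_accuracy([(1.0, 0.0), (0.2, 0.5)]))
```

Under this assumption, a model that scores both responses equally has a loss of ln 2 ≈ 0.693, which is close to the first logged validation loss of 0.6950. Accuracy only measures whether pairs are ranked correctly, so it can reach 1.0 while the loss is still well above zero.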

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1.41e-05
  • train_batch_size: 6
  • eval_batch_size: 24
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 96
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 40
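The relationship between these values can be sketched in a few lines. The sketch assumes a single training device and no learning-rate warmup, neither of which is stated in the card; the total of 40 optimizer steps is taken from the training results table. The helper names are hypothetical.

```python
PER_DEVICE_TRAIN_BATCH_SIZE = 6
GRADIENT_ACCUMULATION_STEPS = 16
BASE_LEARNING_RATE = 1.41e-05
TOTAL_OPTIMIZER_STEPS = 40  # one optimizer step per epoch, per the results table


def effective_batch_size(per_device: int, accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Examples contributing to each optimizer update.

    Assumption: num_devices=1; the card does not state the device count.
    """
    return per_device * accumulation_steps * num_devices


def linear_lr(step: int, base_lr: float = BASE_LEARNING_RATE,
              total_steps: int = TOTAL_OPTIMIZER_STEPS,
              warmup_steps: int = 0) -> float:
    """Learning rate at a given optimizer step under linear decay to zero.

    Assumption: warmup_steps=0, since the card lists no warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = (total_steps - step) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, remaining)


if __name__ == "__main__":
    print(effective_batch_size(PER_DEVICE_TRAIN_BATCH_SIZE,
                               GRADIENT_ACCUMULATION_STEPS))
    for step in (0, 20, 40):
        print(step, linear_lr(step))
```

With these inputs the effective batch size is 6 × 16 = 96, matching total_train_batch_size, and the learning rate falls linearly from 1.41e-05 at step 0 to zero at step 40.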

Training results

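| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---------------|-------|------|-----------------|----------|
| No log        | 1.0   | 1    | 0.6950          | 0.5      |
| No log        | 2.0   | 2    | 0.6896          | 1.0      |
| No log        | 3.0   | 3    | 0.6843          | 1.0      |
| No log        | 4.0   | 4    | 0.6789          | 1.0      |
| No log        | 5.0   | 5    | 0.6735          | 1.0      |
| No log        | 6.0   | 6    | 0.6671          | 1.0      |
| No log        | 7.0   | 7    | 0.6597          | 1.0      |
| No log        | 8.0   | 8    | 0.6510          | 1.0      |
| No log        | 9.0   | 9    | 0.6403          | 1.0      |
| 0.0839        | 10.0  | 10   | 0.6275          | 1.0      |
| 0.0839        | 11.0  | 11   | 0.6130          | 1.0      |
| 0.0839        | 12.0  | 12   | 0.5955          | 1.0      |
| 0.0839        | 13.0  | 13   | 0.5747          | 1.0      |
| 0.0839        | 14.0  | 14   | 0.5508          | 1.0      |
| 0.0839        | 15.0  | 15   | 0.5250          | 1.0      |
| 0.0839        | 16.0  | 16   | 0.4984          | 1.0      |
| 0.0839        | 17.0  | 17   | 0.4698          | 1.0      |
| 0.0839        | 18.0  | 18   | 0.4413          | 1.0      |
| 0.0839        | 19.0  | 19   | 0.4121          | 1.0      |
| 0.0658        | 20.0  | 20   | 0.3850          | 1.0      |
| 0.0658        | 21.0  | 21   | 0.3604          | 1.0      |
| 0.0658        | 22.0  | 22   | 0.3384          | 1.0      |
| 0.0658        | 23.0  | 23   | 0.3186          | 1.0      |
| 0.0658        | 24.0  | 24   | 0.2995          | 1.0      |
| 0.0658        | 25.0  | 25   | 0.2823          | 1.0      |
| 0.0658        | 26.0  | 26   | 0.2664          | 1.0      |
| 0.0658        | 27.0  | 27   | 0.2516          | 1.0      |
| 0.0658        | 28.0  | 28   | 0.2384          | 1.0      |
| 0.0658        | 29.0  | 29   | 0.2260          | 1.0      |
| 0.0346        | 30.0  | 30   | 0.2149          | 1.0      |
| 0.0346        | 31.0  | 31   | 0.2054          | 1.0      |
| 0.0346        | 32.0  | 32   | 0.1971          | 1.0      |
| 0.0346        | 33.0  | 33   | 0.1898          | 1.0      |
| 0.0346        | 34.0  | 34   | 0.1838          | 1.0      |
| 0.0346        | 35.0  | 35   | 0.1787          | 1.0      |
| 0.0346        | 36.0  | 36   | 0.1746          | 1.0      |
| 0.0346        | 37.0  | 37   | 0.1714          | 1.0      |
| 0.0346        | 38.0  | 38   | 0.1691          | 1.0      |
| 0.0346        | 39.0  | 39   | 0.1676          | 1.0      |
| 0.021         | 40.0  | 40   | 0.1669          | 1.0      |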

Framework versions

  • Transformers 4.39.0
  • PyTorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2