metadata
license: apache-2.0
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: gpt-neo-therapist-small
    results: []

gpt-neo-therapist-small

This model is a fine-tuned version of EleutherAI/gpt-neo-125M on an unspecified dataset. It achieves the following results on the evaluation set (final checkpoint, step 35):

  • Loss: 4.6731
  • Rouge1: 39.5028
  • Rouge2: 6.43
  • RougeL: 24.0091
  • RougeLsum: 35.4481
  • Gen Len: 204.1329
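
For intuition, the evaluation loss can be converted to a perplexity. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language models but is not stated in this card.

```python
import math

# Evaluation loss reported above. Assumed (not stated in the card) to be
# the mean per-token cross-entropy in nats.
eval_loss = 4.6731

# Perplexity is the exponential of the mean cross-entropy.
perplexity = math.exp(eval_loss)

print(f"perplexity: {perplexity:.1f}")
```

Under that assumption the perplexity comes out at roughly 107, which is high for a language model and consistent with only 35 optimizer steps of fine-tuning.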

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 24
  • gradient_accumulation_steps: 64
  • total_train_batch_size: 256
  • optimizer: Adam with betas=(0.9,0.98) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5
  • mixed_precision_training: Native AMP
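
The relationship between these values can be sketched in a few lines. The example below shows how the total batch size follows from the per-device batch size and gradient accumulation, and what a linear-warmup, cosine-decay schedule looks like for this run. It assumes a single training device and 35 optimizer steps in total (5 epochs of 7 steps, as listed under Training results); `lr_at` is a hypothetical helper written for illustration, not the Trainer's own code.

```python
import math

# Values copied from the hyperparameter list above.
learning_rate = 1e-4
train_batch_size = 4
gradient_accumulation_steps = 64
warmup_ratio = 0.1

# Assumptions (not stated in the card): one training device, and
# 35 optimizer steps in total, read off the last row of the results table.
num_devices = 1
total_steps = 35

# Effective number of examples per optimizer step.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

# Warmup length in optimizer steps, rounded up.
warmup_steps = math.ceil(total_steps * warmup_ratio)


def lr_at(step: int) -> float:
    """Learning rate after `step` updates: linear warmup, then cosine decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return learning_rate * 0.5 * (1.0 + math.cos(math.pi * progress))


print("total_train_batch_size:", total_train_batch_size)
print("warmup_steps:", warmup_steps)
for step in (0, 2, 4, 20, 35):
    print(f"step {step:2d}: lr = {lr_at(step):.2e}")
```

With these numbers the learning rate reaches its peak of 0.0001 after 4 warmup steps and decays to zero at step 35, so most of the short run is spent on the decaying part of the schedule.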

Training results

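| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2 | RougeL  | RougeLsum | Gen Len  |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:------:|:-------:|:---------:|:--------:|
| 9.9955        | 0.97  | 7    | 6.8195          | 18.6047 | 1.0194 | 14.8565 | 17.9774   | 212.0983 |
| 6.9729        | 1.97  | 14   | 5.6783          | 26.3789 | 3.0779 | 18.5195 | 24.8592   | 203.0925 |
| 5.2614        | 2.97  | 21   | 5.0506          | 34.9428 | 4.921  | 21.9741 | 32.1122   | 206.2775 |
| 5.0599        | 3.97  | 28   | 4.7372          | 38.5235 | 6.2251 | 23.5923 | 34.5633   | 204.2428 |
| 4.5479        | 4.97  | 35   | 4.6731          | 39.5028 | 6.43   | 24.0091 | 35.4481   | 204.2329 |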
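
To make the trend in these results easier to read, the following sketch computes the change in validation loss and Rouge1 between consecutive evaluations and picks the checkpoint with the lowest validation loss. The numbers are copied from the rows above; the variable names are illustrative only.

```python
# (step, training loss, validation loss, Rouge1), copied from the results above.
rows = [
    (7, 9.9955, 6.8195, 18.6047),
    (14, 6.9729, 5.6783, 26.3789),
    (21, 5.2614, 5.0506, 34.9428),
    (28, 5.0599, 4.7372, 38.5235),
    (35, 4.5479, 4.6731, 39.5028),
]

# Checkpoint with the lowest validation loss.
best_step = min(rows, key=lambda r: r[2])[0]

# Change in validation loss and Rouge1 between consecutive evaluations.
deltas = [
    (cur[0], round(cur[2] - prev[2], 4), round(cur[3] - prev[3], 4))
    for prev, cur in zip(rows, rows[1:])
]

print("best step by validation loss:", best_step)
for step, d_loss, d_rouge1 in deltas:
    print(f"step {step}: validation loss {d_loss:+.4f}, Rouge1 {d_rouge1:+.4f}")
```

Both metrics are still improving at the last evaluation, but by much smaller margins than in the early epochs.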

Framework versions

  • Transformers 4.17.0
  • PyTorch 1.10.0+cu111
  • Datasets 2.0.0
  • Tokenizers 0.11.6
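
When trying to reproduce the run, it can help to compare the local environment against the versions listed above. The snippet below is a minimal sketch using only the standard library; `EXPECTED` and `compare_versions` are hypothetical names introduced here, and the keys are the PyPI distribution names (`torch` for PyTorch).

```python
from importlib import metadata

# Versions listed above, keyed by PyPI distribution name.
EXPECTED = {
    "transformers": "4.17.0",
    "torch": "1.10.0+cu111",
    "datasets": "2.0.0",
    "tokenizers": "0.11.6",
}


def compare_versions(expected: dict) -> dict:
    """Map each package to (expected, installed); installed is None if not found."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (wanted, installed)
    return report


for name, (wanted, installed) in compare_versions(EXPECTED).items():
    status = "ok" if installed == wanted else "differs"
    print(f"{name}: expected {wanted}, installed {installed} ({status})")
```

A mismatch does not necessarily prevent the model from loading; the check only reports where the local setup differs from the one recorded in this card.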