---
license: mit
language:
  - ru
metrics:
  - perplexity
  - bleu
  - rouge
library_name: transformers
pipeline_tag: text-generation
---

This text generator is based on the OpenAI GPT-2 model from HuggingFace. The base model went through two steps of training.

First - Fine-tuning of the base model

In this step, the model is fine-tuned on a dataset of single sentences drawn from the texts of Fyodor Dostoevsky.
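The preparation of such a single-sentence dataset can be sketched as below; the splitting rule and the sample text are assumptions, since the card does not describe how the corpus was built:

```python
import re

def split_into_sentences(text: str) -> list[str]:
    # Naive splitter: break on sentence-ending punctuation followed by
    # whitespace. A real pipeline for Russian would likely use a proper
    # sentence tokenizer (e.g. razdel) instead.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

# Hypothetical excerpt standing in for the Dostoevsky corpus.
raw_text = "Человек есть тайна. Ее надо разгадать. Я занимаюсь этой тайной."
dataset = split_into_sentences(raw_text)
# Each entry then becomes one training example for fine-tuning.
```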

Training parameters:

- Epochs = 10
- Learning rate = 1e-3
- Optimizer = AdamW
- Scheduler = OneCycleLR
- Training environment = PyTorch

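A minimal sketch of this optimizer/scheduler setup, using a toy model in place of GPT-2 so the shape of the loop stays clear; the batch size, steps per epoch, and dummy loss are assumptions:

```python
import torch
from torch import nn
from torch.optim import AdamW
from torch.optim.lr_scheduler import OneCycleLR

# Toy stand-in for the language model; the real run fine-tunes GPT-2.
model = nn.Linear(16, 16)
optimizer = AdamW(model.parameters(), lr=1e-3)

epochs, steps_per_epoch = 10, 5  # steps_per_epoch is assumed
scheduler = OneCycleLR(optimizer, max_lr=1e-3,
                       epochs=epochs, steps_per_epoch=steps_per_epoch)

for epoch in range(epochs):
    for _ in range(steps_per_epoch):
        x = torch.randn(8, 16)           # dummy batch
        loss = model(x).pow(2).mean()    # dummy loss; the real run uses LM loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        scheduler.step()                 # OneCycleLR steps once per batch
```

Note that OneCycleLR is stepped after every batch, not every epoch, so its total step budget must equal `epochs * steps_per_epoch`.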

Second - Reinforcement learning

In this step, the fine-tuned model went through a reinforcement-learning pipeline built with the TRL library.

Training parameters:

- Epochs = 30
- Trainer = PPO
- Query texts = the first 100 texts from the dataset, trimmed to their first 3 words
- Reward = the score of a binary classifier, multiplied by 10
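The query and reward construction can be sketched as below; the classifier is mocked as a plain probability, since the card does not name it, and the PPO loop itself is handled by TRL's trainer rather than shown here:

```python
def make_query(text: str) -> str:
    # Trim a source text to its first 3 words to form a PPO prompt.
    return " ".join(text.split()[:3])

def reward(classifier_score: float) -> float:
    # Scale the binary classifier's score (assumed to be a probability
    # that a generated text matches the target style) by a factor of 10.
    return classifier_score * 10.0

# Hypothetical text standing in for the first 100 dataset entries.
texts = ["Человек есть тайна ее надо разгадать"]
queries = [make_query(t) for t in texts]
```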
