distilgpt2-email-generation

Why write the rest of your email when you can generate it?

from transformers import pipeline

model_tag = "pszemraj/distilgpt2-email-generation"
generator = pipeline(
    "text-generation",
    model=model_tag,
    use_fast=False,
    do_sample=False,
    early_stopping=True,
)

prompt = """
Hello,

Following up on the bubblegum shipment."""

result = generator(
    prompt,
    max_length=64,
)  # generate
print(result[0]["generated_text"])
  • A script for using this model on CPU / from the command line can be found here :) (a rough sketch of such a script is shown below)
  • A more performant (but slightly more compute-intensive) version is also available: gpt-medium
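The CPU/command-line script referenced above is not reproduced here; as a rough idea of what such a script might look like, here is a minimal hypothetical sketch (the argument names and defaults are assumptions, not the contents of the linked script):

import argparse
from transformers import pipeline

def main():
    # hypothetical CLI wrapper around the same pipeline shown above
    parser = argparse.ArgumentParser(description="Generate email text from a prompt")
    parser.add_argument("prompt", type=str, help="start of the email to complete")
    parser.add_argument("--max-length", type=int, default=64)
    args = parser.parse_args()

    generator = pipeline(
        "text-generation",
        model="pszemraj/distilgpt2-email-generation",
        use_fast=False,
        do_sample=False,
        early_stopping=True,
        device=-1,  # -1 = run on CPU
    )
    result = generator(args.prompt, max_length=args.max_length)
    print(result[0]["generated_text"])

if __name__ == "__main__":
    main()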

For this model, prompt formatting matters: results may differ (significantly) between the email-style structure outlined above and a single-line prompt such as prompt = "Hey, just wanted to ...", etc.
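To illustrate that point, here is a quick side-by-side comparison, reusing the generator defined in the snippet above (the single-line prompt is just an illustrative example; outputs will vary):

# compare a multi-line, email-style prompt against a single-line prompt
structured_prompt = """
Hello,

Following up on the bubblegum shipment."""

casual_prompt = "Hey, just wanted to check in about the bubblegum shipment."

for p in (structured_prompt, casual_prompt):
    out = generator(p, max_length=64)
    print(repr(out[0]["generated_text"]))
    print("-" * 40)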

Model description

This model is a fine-tuned version of distilgpt2 on the aeslc dataset.

It achieves the following results on the evaluation set:

  • Loss: 2.8176
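Since this loss is the usual causal language modeling cross-entropy, it corresponds to a validation perplexity of roughly exp(2.8176) ≈ 16.7 (a quick derived check, not a figure reported in the original card):

import math

eval_loss = 2.8176
perplexity = math.exp(eval_loss)  # ≈ 16.74
print(f"validation perplexity ≈ {perplexity:.2f}")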

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
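No further details are given here, but since the model description names the aeslc dataset, the following is a minimal sketch for pulling it down for inspection (the split and field names reflect the public aeslc dataset on the Hugging Face Hub and should be double-checked):

from datasets import load_dataset

# aeslc is the Annotated Enron Subject Line Corpus
dataset = load_dataset("aeslc")
print(dataset)  # splits: train / validation / test

example = dataset["train"][0]
print(example["email_body"][:200])  # the email text used for language modeling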

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a rough TrainingArguments sketch follows the list):

  • learning_rate: 0.0004
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • num_epochs: 6
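For readers who want to reproduce a similar run, the hyperparameters above map roughly onto Hugging Face TrainingArguments as sketched below; the output directory and anything not listed above are assumptions, and the rest is left at library defaults:

from transformers import TrainingArguments

# sketch only: mirrors the listed hyperparameters
training_args = TrainingArguments(
    output_dir="distilgpt2-email-generation",  # hypothetical output path
    learning_rate=4e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=16,  # 8 x 16 (x number of GPUs) = effective batch size of 128
    lr_scheduler_type="cosine",
    num_train_epochs=6,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)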

Training results

Training Loss    Epoch    Step    Validation Loss
3.1589           1.0      129     3.0496
2.9914           2.0      258     2.9224
2.8058           3.0      387     2.8449
2.6141           4.0      516     2.8214
2.6337           5.0      645     2.8109
2.5428           6.0      774     2.8176

Framework versions

  • Transformers 4.20.1
  • Pytorch 1.11.0+cu113
  • Tokenizers 0.12.1