Why write the rest of your email when you can generate it?

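from transformers import pipeline

model_tag = "postbot/distilgpt2-emailgen"
# load the fine-tuned model as a text-generation pipeline
generator = pipeline("text-generation", model=model_tag)

prompt = """

Following up on the bubblegum shipment."""

result = generator(prompt)  # generate
print(result[0]["generated_text"])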

For this model, formatting matters. Results may differ significantly between the multi-line structure shown above and a single-line prompt such as prompt = "Hey, just wanted to ...".

Model description

This model is a version of distilgpt2 fine-tuned on a dataset of 50k emails, including the classic aeslc dataset.

It achieves the following results on the evaluation set:

  • Loss: 2.6247
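
The loss can be easier to interpret as perplexity. A minimal sketch, assuming the reported loss is the mean per-token cross-entropy in nats (the card does not state this explicitly); the name `eval_loss` is only illustrative:

```python
import math

# Evaluation loss as reported above.
eval_loss = 2.6247

# Assumption: the loss is mean per-token cross-entropy in nats,
# so perplexity is its exponential.
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.2f}")
```

Under that assumption the perplexity comes out at roughly 13.8.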

Intended uses & limitations

The intended use of this model is to provide suggestions to "autocomplete" the rest of your email. Said another way, it should serve as a tool to write predictable emails faster. It is not intended to write entire emails; at least some input is required to guide the direction of the model.
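
For an autocomplete-style workflow, only the text the model adds after the prompt is of interest. The sketch below assumes the generated text begins with the prompt, which is the text-generation pipeline's default behaviour; `suggested_completion` and the sample output are hypothetical and stand in for a real model call:

```python
def suggested_completion(prompt: str, generated_text: str) -> str:
    """Return only the text that follows the prompt in the generated output."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):]
    # Fall back to the full text if the prompt is not echoed back.
    return generated_text


# Hypothetical output, shaped like the pipeline's list-of-dicts result.
prompt = "Following up on the bubblegum shipment."
fake_result = [{"generated_text": prompt + " Please let me know when it ships."}]

completion = suggested_completion(prompt, fake_result[0]["generated_text"])
print(completion)
```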

Before accepting or sending a suggestion from the model, check it for A) false claims and B) negation statements.
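
As a rough aid for the negation check, a suggestion can be scanned for negation words before it is accepted. This is only a sketch: `find_negations` and its word list are hypothetical, not part of the model or of any library, and a match only signals that a human should read the sentence carefully.

```python
import re

# Hypothetical, deliberately small list of negation words to flag.
NEGATION_WORDS = {"not", "no", "never", "none", "cannot", "neither", "nor"}


def find_negations(text: str) -> list[str]:
    """Return the negation words found in text, in order of appearance."""
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w in NEGATION_WORDS or w.endswith("n't")]


suggestion = "We can't ship it today, and I will not call."
flagged = find_negations(suggestion)
print(flagged)
```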

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 6e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • gradient_accumulation_steps: 32
  • total_train_batch_size: 256
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.02
  • num_epochs: 5
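
The batch-size and scheduler settings above can be sanity-checked with a little arithmetic. The sketch below is not the training code: it assumes warmup steps are the warmup ratio times the total step count, rounded up, and that a half-cosine decay to zero follows the warmup. The total of 1240 steps is taken from the training results; the device count is not listed, so it is left as a parameter.

```python
import math

# Values copied from the hyperparameter list and the training results.
LEARNING_RATE = 6e-05
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 32
WARMUP_RATIO = 0.02
TOTAL_STEPS = 1240  # final step in the training results

# Assumption: warmup steps = ceil(total steps * warmup ratio).
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def effective_batch_size(per_device: int, accum: int, devices: int = 1) -> int:
    """Examples consumed per optimizer step."""
    return per_device * accum * devices


def lr_at(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then cosine decay."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
print(WARMUP_STEPS, lr_at(WARMUP_STEPS), lr_at(TOTAL_STEPS))
```

The listed per-device batch size and accumulation steps already multiply to the listed total of 256 (8 × 32), and under the rounding assumption the learning rate peaks at 6e-05 after 25 warmup steps.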

Training results

Training Loss   Epoch   Step   Validation Loss
2.8299          1.0     248    2.7971
2.6984          2.0     496    2.6826
2.7022          3.0     744    2.6361
2.6436          4.0     992    2.6245
2.6195          5.0     1240   2.6247
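
To see where validation loss bottoms out, the rows above can be compared directly. A small sketch with the table values copied in by hand; the tuple layout and variable names are just illustrative:

```python
# (training loss, epoch, step, validation loss), copied from the table above.
rows = [
    (2.8299, 1.0, 248, 2.7971),
    (2.6984, 2.0, 496, 2.6826),
    (2.7022, 3.0, 744, 2.6361),
    (2.6436, 4.0, 992, 2.6245),
    (2.6195, 5.0, 1240, 2.6247),
]

# Row with the lowest validation loss.
best = min(rows, key=lambda row: row[3])
final = rows[-1]

print(f"best epoch: {best[1]}, validation loss: {best[3]}")
print(f"final minus best: {final[3] - best[3]:.4f}")
```

Validation loss is lowest at epoch 4 (2.6245) and essentially flat by epoch 5 (2.6247), which matches the evaluation loss reported in the model description.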

Framework versions

  • Transformers 4.21.1
  • Pytorch 1.12.0+cu113
  • Datasets 2.4.0
  • Tokenizers 0.12.1