license: other
tags:
  - generated_from_trainer
  - opt
  - custom-license
  - no-commercial
  - email
  - auto-complete
datasets:
  - aeslc
widget:
  - text: >-
      Hey <NAME>,


      Thank you for signing up for my weekly newsletter. Before we get started,
      you'll have to confirm your email address.
    example_title: newsletter
  - text: >-
      Hi <NAME>,


      I hope this email finds you well. Let me start by saying that I am a big
      fan of your work.
    example_title: fan
  - text: >-
      Greetings <NAME>,


      I hope you had a splendid evening at the Company sausage eating festival.
      I am reaching out because
    example_title: festival
  - text: |-
      Good Morning <NAME>,

      I was just thinking to myself about how much I love creating value
    example_title: value
  - text: URGENT - I need
    example_title: URGENT
inference:
  parameters:
    min_length: 4
    max_length: 64
    length_penalty: 0.7
    no_repeat_ngram_size: 3
    do_sample: false
    num_beams: 4
    early_stopping: true
    repetition_penalty: 3.5

NOTE: there is currently a bug in the Hugging Face inference API for OPT models. Please use the Colab notebook to test the model :)

OPT for email generation - 350M

Why write the rest of your email when you can generate it?

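from transformers import pipeline

model_tag = "pszemraj/opt-350m-email-generation"
generator = pipeline(
    "text-generation",
    model=model_tag,
    use_fast=False,  # load the slow (Python) tokenizer
    do_sample=False,  # deterministic decoding, no sampling
    early_stopping=True,
)

# Prompt layout: greeting line, blank line, then the start of the body.
prompt = """
Hello, 

Following up on the bubblegum shipment."""

result = generator(
    prompt,
    max_length=64,  # total length in tokens, prompt included
)
print(result[0]["generated_text"])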
  • Link to notebook on Colab

    For this model, formatting matters. Results may differ significantly between a prompt structured as above (greeting, blank line, body) and a single-line prompt such as prompt = "Hey, just wanted to ...".
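One way to keep prompts consistent with the layout of the example above is a small helper that places the greeting on its own line, followed by a blank line and the start of the body. This is only a sketch: `build_prompt` is a hypothetical name, not part of the model or of transformers, and it strips stray whitespace, so it only roughly mirrors the triple-quoted prompt.

```python
def build_prompt(greeting: str, opening: str) -> str:
    """Lay out a prompt as: newline, greeting, blank line, start of body."""
    # The leading newline and the blank line mirror the triple-quoted
    # example prompt; surrounding whitespace on each part is removed.
    return f"\n{greeting.strip()}\n\n{opening.strip()}"


prompt = build_prompt("Hello,", "Following up on the bubblegum shipment.")
print(repr(prompt))
```

The resulting string can be passed to the generator in place of a hand-written prompt.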

Model description

  • This model is a version of facebook/opt-350m fine-tuned on the aeslc dataset for six epochs.
  • A dataset preparation step using the clean-text Python package attempted to remove email addresses, phone numbers, etc.
  • Note that the API is restricted to generating 64 tokens. You can generate longer emails by using this model in a text-generation pipeline object.
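As a rough sketch of generating longer emails locally, the decoding settings from this card's widget configuration can be reused with a larger `max_length`. The helper names `generation_kwargs` and `generate_email` and the default of 256 tokens are assumptions for illustration, not values from the author.

```python
def generation_kwargs(max_length: int = 256) -> dict:
    """Decoding settings copied from the widget config, with a longer limit."""
    return {
        "min_length": 4,
        "max_length": max_length,  # total tokens, prompt included (assumed 256)
        "length_penalty": 0.7,
        "no_repeat_ngram_size": 3,
        "do_sample": False,
        "num_beams": 4,
        "early_stopping": True,
        "repetition_penalty": 3.5,
    }


def generate_email(prompt: str, max_length: int = 256) -> str:
    """Run the model locally; the weights are downloaded on first use."""
    from transformers import pipeline  # imported lazily: heavy dependency

    generator = pipeline(
        "text-generation",
        model="pszemraj/opt-350m-email-generation",
        use_fast=False,
    )
    result = generator(prompt, **generation_kwargs(max_length))
    return result[0]["generated_text"]
```

Calling `generate_email(prompt)` returns the prompt followed by the generated continuation. Beam search with four beams is slower than greedy decoding, so a smaller `max_length` may be preferable on CPU.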

Intended uses & limitations

  • In their everlasting wisdom, Facebook/Meta decided to release OPT under a custom license with several conditions, such as the non-commercial restriction noted in this card's tags. See facebook/opt-350m for details.

Training and evaluation data

  • The email_body field of the train and validation splits of the aeslc dataset (the two splits were combined to get more data).
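The data selection described above could look roughly like the following. This is an assumed reconstruction, not the author's preprocessing script: `collect_email_bodies` and `load_aeslc_bodies` are hypothetical helpers, and the dataset id passed to `load_dataset` may need adjusting depending on the installed `datasets` version.

```python
from typing import Iterable, List, Mapping


def collect_email_bodies(*splits: Iterable[Mapping[str, str]]) -> List[str]:
    """Gather non-empty email_body strings from any number of splits."""
    bodies = []
    for split in splits:
        for record in split:
            body = record.get("email_body", "").strip()
            if body:  # skip records whose body is missing or blank
                bodies.append(body)
    return bodies


def load_aeslc_bodies() -> List[str]:
    """Assumed loading step; needs the `datasets` package and network access."""
    from datasets import load_dataset  # imported lazily: optional dependency

    ds = load_dataset("aeslc")  # dataset id as named in this card
    return collect_email_bodies(ds["train"], ds["validation"])
```

Any further cleaning, such as the clean-text step mentioned in the model description, would be applied to the returned list before tokenization.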

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 6e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.03
  • num_epochs: 6
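To make the schedule settings concrete, here is a minimal sketch of a cosine learning-rate schedule with linear warmup, using the learning rate and warmup ratio listed above. The total number of optimizer steps is not stated in this card, so `total_steps` is a hypothetical value, and the exact behavior of the Trainer's scheduler may differ in detail.

```python
import math

LEARNING_RATE = 6e-05  # from the hyperparameter list
WARMUP_RATIO = 0.03    # from the hyperparameter list


def lr_at_step(step: int, total_steps: int) -> float:
    """Linear warmup to LEARNING_RATE, then cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return LEARNING_RATE * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


# The listed batch sizes are consistent with each other:
# train_batch_size 8 x gradient_accumulation_steps 16 = 128.
assert 8 * 16 == 128

total_steps = 1000  # hypothetical; depends on dataset size and epochs
for step in (0, 30, 500, 1000):
    print(step, f"{lr_at_step(step, total_steps):.2e}")
```

The learning rate starts at zero, peaks at the end of warmup, and decays to zero by the final step.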

Framework versions

  • Transformers 4.19.2
  • PyTorch 1.11.0+cu113
  • Tokenizers 0.12.1