{{#eq model "ctrl"}}
{{/eq}} {{#eq model "pplm"}}
{{/eq}}
See all models and checkpoints
{{#eq model "arxiv-nlp"}} 🤓 ArXiv NLP model checkpoint {{/eq}} {{#eq model "distil-gpt2"}} 🐎 DistilGPT-2 model checkpoint {{/eq}} {{#eq model "ctrl"}} ☁️ Salesforce Research CTRL {{/eq}} {{#eq model "pplm"}} 🚕 Uber AI Plug and Play Language Model (PPLM) {{/eq}}
{{#eq model "distil-gpt2"}}

The student of the now ubiquitous GPT-2 does not fall short of its teacher’s expectations. Obtained by distillation, DistilGPT-2 weighs 37% less and is twice as fast as its OpenAI counterpart, while keeping close to the same generative power. It runs smoothly on an iPhone 7. The dawn of lightweight generative transformers? 🤯

From the paper “DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter” by Victor Sanh, Lysandre Debut, Julien Chaumond and Thomas Wolf. The same distillation method was applied to GPT-2, and a Medium blog post describes the process in detail.
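
As a quick illustration, the checkpoint can be loaded and sampled from with 🤗/transformers in a few lines; the prompt and sampling settings below are only an example, not part of the release:

```python
# Minimal sampling sketch with the distilgpt2 checkpoint (prompt and settings are illustrative).
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("distilgpt2")
model = GPT2LMHeadModel.from_pretrained("distilgpt2")  # same GPT-2 architecture, 6 layers instead of 12

input_ids = tokenizer.encode("The dawn of lightweight generative transformers", return_tensors="pt")
output = model.generate(input_ids, max_length=40, do_sample=True, top_k=50, top_p=0.95)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```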

{{/eq}} {{#eq model "arxiv-nlp"}}

Starting from OpenAI’s GPT-2, the Hugging Face team fine-tuned the small version of the model on a tiny dataset (60MB of text) of arXiv papers. The targeted subject is Natural Language Processing, resulting in very Linguistics/Deep Learning oriented generation.

All articles were downloaded from Cornell University’s arxiv.org website using arXiv Bulk Data Access.
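
For the curious, here is a hedged sketch of what such a fine-tuning run can look like with 🤗/transformers; the file name, hyperparameters and output directory are placeholders, not the team’s actual training setup:

```python
# Illustrative fine-tuning sketch: GPT-2 small on a small plain-text corpus.
# "arxiv_nlp.txt" stands in for a hypothetical ~60MB dump of the downloaded articles.
from transformers import (DataCollatorForLanguageModeling, GPT2LMHeadModel,
                          GPT2Tokenizer, TextDataset, Trainer, TrainingArguments)

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")  # the small, 124M-parameter GPT-2

dataset = TextDataset(tokenizer=tokenizer, file_path="arxiv_nlp.txt", block_size=512)
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)  # causal LM, no masking

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="arxiv-nlp", num_train_epochs=1, per_device_train_batch_size=2),
    train_dataset=dataset,
    data_collator=collator,
)
trainer.train()
```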

{{/eq}} {{#eq model "ctrl"}}

With a whopping 1.6 billion parameters 🤯, CTRL goes beyond the usual pre-train-then-fine-tune approach: a single model, trained with control codes, can be steered toward different domains and tasks at generation time.

Controllable Generation: conditioned on a control code, the model generates text in the style of specific subreddits (fitness, personal finance, running and many more), Wikipedia articles or product reviews. Take advantage of its control codes and use it for question answering, translation or styled text generation. Kindly implemented by the Salesforce team in 🤗/transformers.

From the paper “CTRL: A Conditional Transformer Language Model for Controllable Generation” by Nitish Shirish Keskar*, Bryan McCann*, Lav R. Varshney, Caiming Xiong and Richard Socher.
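
A minimal generation sketch with the checkpoint in 🤗/transformers: “Running” is one of the control codes listed in the paper, while the prompt and decoding settings are only illustrative:

```python
# Steer CTRL toward the running subreddit by prepending its control code to the prompt.
from transformers import CTRLLMHeadModel, CTRLTokenizer

tokenizer = CTRLTokenizer.from_pretrained("Salesforce/ctrl")
model = CTRLLMHeadModel.from_pretrained("Salesforce/ctrl")  # 1.6B parameters: expect a large download and memory footprint

input_ids = tokenizer.encode("Running My first marathon was", return_tensors="pt")
output = model.generate(input_ids, max_length=60, repetition_penalty=1.2)  # greedy decoding with a repetition penalty
print(tokenizer.decode(output[0]))
```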

{{/eq}} {{#eq model "pplm"}}

PPLM builds on top of other large transformer-based generative models (like GPT-2) and enables finer-grained control over attributes of the generated text (e.g. gradually switching topic 🐱 or sentiment 😃).

This controlled-generation method plugs in simple bag-of-words models or one-layer classifiers as attribute controllers and uses their gradients to update the model’s activations, without changing any model parameters. Kindly implemented by the Uber AI team in 🤗/transformers.

From the paper “Plug and Play Language Models: A Simple Approach to Controlled Text Generation” by Sumanth Dathathri, Andrea Madotto, Janice Lan, Jane Hung, Eric Frank, Piero Molino, Jason Yosinski and Rosanne Liu.
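
To make the idea concrete, here is a toy PyTorch sketch assuming GPT-2 from 🤗/transformers: a few gradient steps on a bag-of-words attribute loss perturb an activation before the next token is picked. The word list is hypothetical, and the actual PPLM implementation perturbs the cached key/value history and adds a KL term to preserve fluency; see the Uber AI code shipped with 🤗/transformers for the full method.

```python
# Toy sketch of PPLM-style steering: nudge an activation so the next-token
# distribution puts more mass on a (hypothetical) topic bag of words.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

bow_ids = [tokenizer.encode(" " + w)[0] for w in ["science", "research", "experiment", "theory"]]
input_ids = tokenizer.encode("The weather today is", return_tensors="pt")

# Last hidden state, detached so that only the perturbation receives gradients
# (the real method perturbs the cached key/value history instead).
hidden = model.transformer(input_ids).last_hidden_state.detach()
delta = torch.zeros_like(hidden[:, -1:, :], requires_grad=True)
optimizer = torch.optim.SGD([delta], lr=0.04)

for _ in range(3):  # a few small gradient steps, as in PPLM
    log_probs = torch.log_softmax(model.lm_head(hidden[:, -1:, :] + delta), dim=-1)
    loss = -torch.logsumexp(log_probs[0, -1, bow_ids], dim=0)  # log-mass on the bag of words
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

steered = model.lm_head(hidden[:, -1:, :] + delta)
print(tokenizer.decode([int(steered[0, -1].argmax())]))
```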

{{/eq}}