Large language models (LLMs), such as OpenAI's ChatGPT and similar chatbot products from other organizations, have recently gained widespread adoption. These models can extend text or respond to instructions in a natural and helpful manner. Although the core technologies behind LLMs, namely the transformer architecture and the GPT decoder-only causal language model, have remained relatively unchanged for over five years, the surge in popularity of ChatGPT can largely be attributed to recent approaches that better align the output of LLMs with users' and service providers' intentions:
- Supervised finetuning (SFT) on natural instructions
- Reinforcement learning from human feedback (RLHF)
- Conditional pretraining uses a large number of pretraining examples tagged with human-understandable classifiers
- Leverages content tagging found in many online environments
- Examples of commonly used tags:
  - Suitable for work (SFW) and not suitable for work (NSFW)
  - G, PG, PG-13, and R for television and movie content
- Traditional pretraining involves predicting the subsequent word in minimally processed text.
- Conditional pretraining prepends training examples with descriptive tags and a brief synopsis.
- Current LLMs have proprietary instructions and reward models, which can hinder public review and discussions on sensitive topics.
- Conditional pretraining tags are transparent and easily understood by auditors or end users.
Below is an example output from this conditional tagging model for a recent news article about LAION. Only text from the body of the article was used to generate these document tags.
[ artificial intelligence, open source, ai, open letter, open source ai, ai research] # This article explains the importance of a CERN-like organization to coordinate efforts on the transparency of large-scale AI research and provides information about LAION.
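For auditing or downstream filtering, a line like the one above can be split back into its tags and synopsis. The following is a minimal sketch, assuming the output always follows the `[ tags] # synopsis` layout shown here; `parse_tagged_header` is a hypothetical helper name, not part of any released tooling.

```python
from __future__ import annotations


def parse_tagged_header(line: str) -> tuple[list[str], str]:
    """Split '[ tag1, tag2] # synopsis' into (tags, synopsis).

    Assumes the layout shown in this model card: a bracketed,
    comma-separated tag list, then '#', then a free-text synopsis.
    """
    # Everything before the first '] #' is the tag list.
    tag_part, sep, synopsis = line.partition("] #")
    tag_part = tag_part.lstrip()
    if not sep or not tag_part.startswith("["):
        raise ValueError("expected '[ tag, ...] # synopsis'")
    # Drop the opening bracket, then split and trim each tag.
    tags = [t.strip() for t in tag_part[1:].split(",") if t.strip()]
    return tags, synopsis.strip()


example = (
    "[ artificial intelligence, open source, ai, open letter, "
    "open source ai, ai research] # This article explains the importance "
    "of a CERN-like organization to coordinate efforts on the transparency "
    "of large-scale AI research and provides information about LAION."
)
tags, synopsis = parse_tagged_header(example)
```

Because the tags are plain text, the parsed list can be inspected directly or compared against an allow-list before the generated text is used.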
Format your inputs like this:
[ tag1, tag2, tag3, tag_n] # This is a short synopsis of what kind of text I want to generate.
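The input format above can be assembled programmatically. This is a minimal sketch, assuming the spacing shown in the template (a space after `[`, tags joined by `, `, then `] # `); `build_conditional_prompt` and its validation rules are illustrative assumptions, not a documented API of this model.

```python
def build_conditional_prompt(tags, synopsis):
    """Return a prompt of the form '[ tag1, tag2] # synopsis'."""
    # Trim whitespace and drop empty tags.
    cleaned = [t.strip() for t in tags if t.strip()]
    if not cleaned:
        raise ValueError("at least one tag is required")
    # Reject characters that would make the header ambiguous (an assumption;
    # the model card does not specify any escaping rules).
    for tag in cleaned:
        if any(ch in tag for ch in ",[]#\n"):
            raise ValueError(f"unsupported character in tag: {tag!r}")
    # Collapse newlines and repeated spaces so the header stays on one line.
    one_line_synopsis = " ".join(synopsis.split())
    return f"[ {', '.join(cleaned)}] # {one_line_synopsis}"


prompt = build_conditional_prompt(
    ["ai", "open source"],
    "A short news article about open AI research.",
)
```

The resulting string would then be passed as the prompt to whatever text-generation interface is used to run the model.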
Thank you to LAION and Stability.ai for support and compute resources to experiment with conditional pretraining.
- Conditional pretraining helps the user control the outputs of the model.
- However, these models (and all language models) can still generate undesirable content.
- So please enjoy and use with care!