Generative artificial intelligence, or generative AI, is a type of AI system capable of generating text, images, or other media in response to prompts. Generative AI systems use generative models, such as large language models, to statistically sample new data based on the training data set that was used to create them.

Notable generative AI systems include ChatGPT, a chatbot built by OpenAI using the GPT-3 and GPT-4 large language models, and Bard, a chatbot built by Google using the LaMDA model. Other generative AI models include artificial intelligence art systems such as Stable Diffusion, Midjourney, and DALL-E.

Generative AI has potential applications across a wide range of industries, including software development, marketing, and fashion. Investment in generative AI surged during the early 2020s, with large companies such as Microsoft, Google, and Baidu as well as numerous smaller firms developing generative AI models.

Modalities
Figure: Théâtre d'Opéra Spatial, an image generated by Midjourney (a detailed oil painting of figures in a futuristic opera scene).
A generative AI system is constructed by applying unsupervised or self-supervised machine learning to a data set. The capabilities of a generative AI system depend on the modality or type of the data set used.
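
To illustrate what self-supervised learning means in practice, the following minimal sketch derives training examples from unlabeled text, using the next token as the target so that no human annotation is needed. The function name `next_token_pairs`, the `context_size` parameter, and the whitespace tokenization are assumptions made for illustration; real systems use learned subword tokenizers and far larger data sets.

```python
def next_token_pairs(text, context_size=3):
    """Turn unlabeled text into (context, target) training pairs.

    The "label" for each example is simply the next token in the text,
    which is what makes the setup self-supervised.
    """
    # Naive whitespace tokenization, for illustration only.
    tokens = text.split()
    pairs = []
    for i in range(1, len(tokens)):
        # Use up to `context_size` preceding tokens as the input.
        context = tuple(tokens[max(0, i - context_size):i])
        pairs.append((context, tokens[i]))
    return pairs


pairs = next_token_pairs("the cat sat on the mat")
for context, target in pairs:
    print(context, "->", target)
```

Each pair could then be fed to a model that learns to predict the target from the context; the model itself is deliberately left out of this sketch.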

Text: Generative AI systems trained on words or word tokens include GPT-3, LaMDA, LLaMA, BLOOM, GPT-4, and others (see List of large language models). They are capable of natural language processing, machine translation, and natural language generation and can be used as foundation models for other tasks. Data sets include BookCorpus, Wikipedia, and others (see List of text corpora).
Code: In addition to natural language text, large language models can be trained on programming language text, allowing them to generate source code for new computer programs. Examples include OpenAI Codex.
Images: Generative AI systems trained on sets of images with text captions include Imagen, DALL-E, Midjourney, Stable Diffusion, and others (see Artificial intelligence art, Generative art, Synthetic media). They are commonly used for text-to-image generation and neural style transfer. Data sets include LAION-5B and others (see Datasets in computer vision).
Molecules: Generative AI systems can be trained on biological sequences, such as the amino acid sequences of proteins or the nucleotide sequences of DNA, or on molecular representations such as SMILES strings. These systems, such as AlphaFold, are used for protein structure prediction and drug discovery. Data sets include various biological data sets.
Music: Generative AI systems such as MusicLM can be trained on the audio waveforms of recorded music along with text annotations, in order to generate new musical samples based on text descriptions such as "a calming violin melody backed by a distorted guitar riff".
Video: Generative AI systems trained on annotated video can generate temporally coherent video clips. Examples include Gen1 by RunwayML and Make-A-Video by Meta Platforms.
Multimodal: A generative AI system can be built from multiple generative models, or one model trained on multiple types of data. For example, one version of OpenAI's GPT-4 accepts both text and image inputs.
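
As a rough illustration of how a text model can statistically sample new data based on its training data set, the sketch below fits a bigram model to a tiny corpus and then generates a sentence by sampling one token at a time. This is a toy stand-in, not how large language models such as GPT-4 are implemented; the corpus, the function names `train_bigram_model` and `sample_sentence`, and the `<s>` and `</s>` boundary markers are all assumptions made for the example.

```python
import random
from collections import Counter, defaultdict


def train_bigram_model(corpus):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        # Hypothetical start/end markers so the model learns sentence boundaries.
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def sample_sentence(counts, rng, max_tokens=20):
    """Generate text by repeatedly sampling the next token.

    Each next token is drawn with probability proportional to how often
    it followed the current token in the training corpus.
    """
    token = "<s>"
    output = []
    for _ in range(max_tokens):
        followers = counts[token]
        choices = list(followers.keys())
        weights = list(followers.values())
        token = rng.choices(choices, weights=weights, k=1)[0]
        if token == "</s>":
            break
        output.append(token)
    return " ".join(output)


corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat chased the dog",
]
model = train_bigram_model(corpus)
rng = random.Random(0)  # fixed seed so repeated runs give the same sample
sample = sample_sentence(model, rng)
print(sample)
```

The generated sentence may not appear verbatim in the corpus, but every adjacent pair of words in it was observed during training, which is the sense in which the output is sampled from the training distribution.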