API Reference

Popular tasks

Generate a response given a list of messages in a conversational context.

Converting a text into a vector often called "embedding".

Generate an image based on a given text prompt.

Generate an video based on a given text prompt.

Audio classification is the task of assigning a label or class to a given audio.

Automatic Speech Recognition (ASR), also known as Speech to Text (STT), is the task of transcribing a given audio to text.

Mask filling is the task of predicting the right word (token to be precise) in the middle of a sequence.

Image classification is the task of assigning a label or class to an entire image. Images are expected to have only one class for each image.

Image Segmentation divides an image into segments where each pixel in the image is mapped to an object.

Image-to-image is the task of transforming a source image to match the characteristics of a target image or a target image domain.

Object Detection models allow users to identify objects of certain defined classes.

Question Answering models can retrieve the answer to a question from a given text, which is useful for searching for an answer in a document.

Summarization is the task of producing a shorter version of a document while preserving its important information.

Table Question Answering (Table QA) is the answering a question about an information on a given table.

Text Classification is the task of assigning a label or class to a given text.

Generate text based on a prompt.

Token classification is a task in which a label is assigned to some tokens in a text.

Translation is the task of converting text from one language to another.

Zero shot classification is the task to classify text without specific training for the task.