SetFit documentation

Multilabel Text Classification

SetFit supports multilabel classification, allowing multiple labels to be assigned to each instance.

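If each instance only ever receives a single label (ordinary multiclass classification), you usually do not need to specify a multi-target strategy. One is only needed when an instance can be assigned several labels at once.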

This guide will show you how to train and use multilabel SetFit models.
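To make the idea of "multiple labels per instance" concrete, multilabel targets are commonly represented as multi-hot vectors with one 0/1 entry per class. The sketch below uses only the standard library; the helper names (to_multi_hot, from_multi_hot) and the class names are hypothetical illustrations and are not part of the SetFit API.

```python
from typing import Iterable, List, Sequence


def to_multi_hot(labels: Iterable[str], classes: Sequence[str]) -> List[int]:
    """Encode a collection of label names as a 0/1 vector aligned with `classes`."""
    active = set(labels)
    unknown = active - set(classes)
    if unknown:
        raise ValueError(f"Unknown labels: {sorted(unknown)}")
    return [1 if name in active else 0 for name in classes]


def from_multi_hot(vector: Sequence[int], classes: Sequence[str]) -> List[str]:
    """Decode a 0/1 vector back into label names, in the order of `classes`."""
    return [name for name, flag in zip(classes, vector) if flag]


# Hypothetical class names, chosen only for illustration.
classes = ["sports", "politics", "technology"]

encoded = to_multi_hot(["technology", "sports"], classes)
decoded = from_multi_hot(encoded, classes)

print(encoded)  # [1, 0, 1]
print(decoded)  # ['sports', 'technology']
```

An instance with no labels encodes to an all-zero vector, which is a valid multilabel target but has no equivalent in single-label classification.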

Multilabel strategies

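SetFit will initialise a multilabel classification head from scikit-learn. The following options are available for multi_target_strategy: "one-vs-rest" (uses a OneVsRestClassifier head), "multi-output" (uses a MultiOutputClassifier head), and "classifier-chain" (uses a ClassifierChain head).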

See the scikit-learn documentation for multiclass and multioutput classification for more details.
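As a rough illustration of what these heads look like on the scikit-learn side, the sketch below wraps a LogisticRegression in each of the three multilabel wrappers and fits it on toy data. It assumes scikit-learn and NumPy are installed; the 2-dimensional features are a made-up stand-in for sentence embeddings, and the mapping from strategy names to wrappers mirrors the scikit-learn classes named above rather than reproducing SetFit's internal code.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.multioutput import ClassifierChain, MultiOutputClassifier

# Toy stand-in for sentence embeddings: 8 instances, 2 features (assumed data).
X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 0.0], [0.9, 0.1],
              [0.0, 1.0], [0.2, 0.9], [1.0, 1.0], [0.9, 0.8]])
# Multi-hot targets: one column per label, several labels may be active per row.
Y = np.array([[0, 0], [0, 0], [1, 0], [1, 0],
              [0, 1], [0, 1], [1, 1], [1, 1]])

# Each wrapper turns a single-output classifier into a multilabel one.
heads = {
    "one-vs-rest": OneVsRestClassifier(LogisticRegression()),
    "multi-output": MultiOutputClassifier(LogisticRegression()),
    "classifier-chain": ClassifierChain(LogisticRegression()),
}

predictions = {}
for name, head in heads.items():
    head.fit(X, Y)
    # Every head returns one 0/1 prediction per label for each instance.
    predictions[name] = head.predict(X)
    print(name, predictions[name].shape)
```

The first two wrappers fit one independent binary classifier per label, while the classifier chain feeds each label's prediction into the next classifier, which can help when labels are correlated.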

Initializing SetFit models with multilabel strategies

Using the default LogisticRegression head, we can apply multi target strategies like so:

from setfit import SetFitModel

model = SetFitModel.from_pretrained(
    model_id, # e.g. "BAAI/bge-small-en-v1.5"
    multi_target_strategy="multi-output",
)

With a differentiable head it looks like so:

from setfit import SetFitModel

model = SetFitModel.from_pretrained(
    model_id, # e.g. "BAAI/bge-small-en-v1.5"
    multi_target_strategy="one-vs-rest",
    use_differentiable_head=True,
    head_params={"out_features": num_classes},
)