PEFT documentation

PEFT integrations

You are viewing v0.12.0 version. A newer version v0.14.0 is available.

PEFT’s practical benefits extend to other Hugging Face libraries like Diffusers and Transformers. One of the main benefits of PEFT is that an adapter file generated by a PEFT method is much smaller than the original model, which makes it easy to manage and use multiple adapters. You can use one pretrained base model for multiple tasks by simply loading a new adapter finetuned for the task you’re solving. Or you can combine multiple adapters with a text-to-image diffusion model to create new effects.
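
To get a feel for why adapter files are so small, here is a rough back-of-the-envelope sketch for LoRA, one of the PEFT methods used later in this tutorial. LoRA keeps a dense weight matrix frozen and trains two low-rank factors instead, so the adapter only stores `r * (d_out + d_in)` values for that layer. The helper names and the layer size below are assumptions chosen for illustration, not values taken from a specific model.

```python
def lora_param_count(d_out: int, d_in: int, r: int) -> int:
    """Parameters in the two low-rank factors B (d_out x r) and A (r x d_in)."""
    return r * (d_out + d_in)


def full_param_count(d_out: int, d_in: int) -> int:
    """Parameters in the dense weight matrix that LoRA leaves frozen."""
    return d_out * d_in


# Illustrative layer size (an assumption, not taken from a specific model).
d_out, d_in = 4096, 4096

for r in (8, 64):
    lora = lora_param_count(d_out, d_in, r)
    full = full_param_count(d_out, d_in)
    print(f"r={r}: {lora:,} adapter params vs {full:,} frozen params ({lora / full:.2%})")
```

The real size of an adapter file also depends on how many layers are adapted and on the storage dtype, so treat this as an order-of-magnitude estimate only.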

This tutorial will show you how PEFT can help you manage adapters in Diffusers and Transformers.

Diffusers

Diffusers is a generative AI library for creating images and videos from text or images with diffusion models. LoRA is an especially popular training method for diffusion models because you can very quickly train and share diffusion models to generate images in new styles. To make it easier to use and try multiple LoRA models, Diffusers uses the PEFT library to help manage different adapters for inference.

For example, load a base model and then load the artificialguybr/3DRedmond-V1 adapter for inference with the load_lora_weights method. The adapter_name argument in the loading method is enabled by PEFT and allows you to set a name for the adapter so it is easier to reference.

import torch
from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipeline.load_lora_weights(
    "peft-internal-testing/artificialguybr__3DRedmond-V1", 
    weight_name="3DRedmond-3DRenderStyle-3DRenderAF.safetensors", 
    adapter_name="3d"
)
image = pipeline("sushi rolls shaped like kawaii cat faces").images[0]
image

Now let’s try another cool LoRA model, ostris/super-cereal-sdxl-lora. All you need to do is load and name this new adapter with adapter_name, and use the set_adapters method to set it as the currently active adapter.

pipeline.load_lora_weights(
    "ostris/super-cereal-sdxl-lora", 
    weight_name="cereal_box_sdxl_v1.safetensors", 
    adapter_name="cereal"
)
pipeline.set_adapters("cereal")
image = pipeline("sushi rolls shaped like kawaii cat faces").images[0]
image
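
Both adapters are now loaded, so you can also activate them together to blend their effects. The sketch below wraps the pipeline's `set_adapters` method, which accepts a list of adapter names plus an `adapter_weights` list. The helper name `blend_adapters` and the weights in the usage comment are hypothetical and only meant as a starting point.

```python
def blend_adapters(pipeline, strengths):
    """Activate several already-loaded LoRA adapters at once.

    `strengths` maps adapter name -> weight, e.g. {"3d": 0.7, "cereal": 0.5}.
    Using a dict keeps every name paired with exactly one weight.
    """
    names = list(strengths)
    weights = [float(strengths[name]) for name in names]
    pipeline.set_adapters(names, adapter_weights=weights)
    return names, weights


# Usage with the pipeline from this tutorial (the weights are illustrative):
# blend_adapters(pipeline, {"3d": 0.7, "cereal": 0.5})
# image = pipeline("sushi rolls shaped like kawaii cat faces").images[0]
```

Good weights are usually found by trial and error, since each adapter was trained separately.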

Finally, you can call the disable_lora method to restore the base model.

pipeline.disable_lora()

Learn more about how PEFT supports Diffusers in the Inference with PEFT tutorial.

Transformers

🤗 Transformers is a collection of pretrained models for all types of tasks in all modalities. You can load these models for training or inference. Many of the models are large language models (LLMs), so it makes sense to integrate PEFT with Transformers to manage and train adapters.

Load a base pretrained model to train.

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")

Next, add an adapter configuration to specify how to adapt the model parameters. Call the add_adapter() method to add the configuration to the base model.

from peft import LoraConfig

peft_config = LoraConfig(
    lora_alpha=16,
    lora_dropout=0.1,
    r=64,
    bias="none",
    task_type="CAUSAL_LM"
)
model.add_adapter(peft_config)
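
If you're wondering what `r` and `lora_alpha` control, the following pure-Python sketch mimics the LoRA update on a tiny layer: the frozen weight `W` is left untouched and a low-rank product `B @ A`, scaled by `lora_alpha / r`, is added on top. The helpers `matvec` and `lora_forward` and the toy sizes are assumptions for illustration, dropout is left out, and this is not how PEFT implements the layer internally.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, lora_alpha, r):
    """Compute W x + (lora_alpha / r) * B (A x); W stays frozen."""
    scaling = lora_alpha / r
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + scaling * u for b, u in zip(base, update)]


random.seed(0)
d, r, lora_alpha = 4, 2, 16  # toy sizes; the config above uses r=64
W = [[random.gauss(0, 1) for _ in range(d)] for _ in range(d)]  # frozen, d x d
A = [[random.gauss(0, 1) for _ in range(d)] for _ in range(r)]  # trainable, r x d
B = [[0.0] * r for _ in range(d)]                               # trainable, d x r, starts at zero
x = [1.0, 2.0, 3.0, 4.0]

# With B at zero, the adapted layer reproduces the base layer exactly.
assert lora_forward(W, A, B, x, lora_alpha, r) == matvec(W, x)
```

With the values from the configuration above, the scaling factor is `16 / 64 = 0.25`, so a larger `r` adds capacity while `lora_alpha` controls how strongly the update is weighted.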

Now you can train the model with the Transformers Trainer class or whichever training framework you prefer.

To use the newly trained model for inference, the AutoModel class uses PEFT on the backend to load the adapter weights and configuration file into a base pretrained model.

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("peft-internal-testing/opt-350m-lora")

Alternatively, you can use a Transformers pipeline to load the model and conveniently run inference:

from transformers import pipeline

model = pipeline("text-generation", "peft-internal-testing/opt-350m-lora")
print(model("Hello World"))

If you’re interested in comparing or using more than one adapter, you can call the add_adapter() method to add each adapter configuration to the base model. The only requirement is that the adapter types must be the same (you can’t mix a LoRA and a LoHa adapter).

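from transformers import AutoModelForCausalLM
from peft import LoraConfig

model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")
# Example configuration; the target modules are illustrative choices for OPT.
lora_config_1 = LoraConfig(target_modules=["q_proj", "k_proj"])
model.add_adapter(lora_config_1, adapter_name="adapter_1")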

Call add_adapter() again to attach a new adapter to the base model.

lora_config_2 = LoraConfig(target_modules=["v_proj", "out_proj"])  # illustrative values
model.add_adapter(lora_config_2, adapter_name="adapter_2")

Then you can use set_adapter() to set the currently active adapter.

# Assumes `tokenizer` and `inputs` already exist, e.g. inputs = tokenizer("Hello", return_tensors="pt").
model.set_adapter("adapter_1")
output = model.generate(**inputs)
print(tokenizer.decode(output[0], skip_special_tokens=True))

To disable the adapter, call the disable_adapters method.

model.disable_adapters()

Call the enable_adapters method to enable the adapters again.
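
The workflow above can be pictured with a small toy model: several named adapters sit next to one frozen base weight, one of them is active at a time, and disabling adapters falls back to the base weight. The class below is a hypothetical scalar stand-in written for this illustration. It borrows the method names used in this section but is not the Transformers implementation, and making the most recently added adapter active is an assumption of the sketch.

```python
class ToyAdapterLayer:
    """Scalar stand-in for a layer with one frozen weight and named adapter deltas."""

    def __init__(self, base_weight):
        self.base_weight = base_weight  # frozen, never modified
        self.deltas = {}                # adapter name -> additive delta
        self.active = None
        self.enabled = True

    def add_adapter(self, delta, adapter_name):
        if adapter_name in self.deltas:
            raise ValueError(f"Adapter {adapter_name!r} already exists")
        self.deltas[adapter_name] = delta
        self.active = adapter_name  # assumption of this sketch: newest adapter is active

    def set_adapter(self, adapter_name):
        if adapter_name not in self.deltas:
            raise ValueError(f"Unknown adapter {adapter_name!r}")
        self.active = adapter_name

    def disable_adapters(self):
        self.enabled = False

    def enable_adapters(self):
        self.enabled = True

    def forward(self, x):
        weight = self.base_weight
        if self.enabled and self.active is not None:
            weight = weight + self.deltas[self.active]
        return weight * x


layer = ToyAdapterLayer(base_weight=2.0)
layer.add_adapter(0.5, adapter_name="adapter_1")
layer.add_adapter(-1.0, adapter_name="adapter_2")

layer.set_adapter("adapter_1")
with_adapter_1 = layer.forward(4.0)  # uses 2.0 + 0.5

layer.disable_adapters()
base_only = layer.forward(4.0)       # uses the frozen 2.0 only

layer.enable_adapters()
layer.set_adapter("adapter_2")
with_adapter_2 = layer.forward(4.0)  # uses 2.0 - 1.0
```

Because the base weight is never modified, switching or disabling adapters is cheap and fully reversible, which is what makes comparing adapters on one base model practical.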

If you’re curious, check out the Load and train adapters with PEFT tutorial to learn more.
