PEFT documentation

Trainable Tokens


The Trainable Tokens method provides a way to fine-tune specific token embeddings without training the full embedding matrix or attaching an adapter to it. It is based on the initial implementation from here.

The method targets only the token indices you specify and trains just those embeddings. Consequently, the required RAM is lower, and the on-disk size is significantly smaller than storing a full fine-tuned embedding matrix.

Some preliminary benchmarks acquired with this script suggest that for gemma-2-2b (which has a rather large embedding matrix) you can save 4.8 GiB of VRAM with Trainable Tokens compared to fully fine-tuning the embedding matrix. While LoRA uses even less memory (6.3 GiB less than full fine-tuning in total), it might also change token embeddings you don’t want modified. For models with less extreme embedding matrices, the difference may be smaller.

Note that this method does not add tokens for you; you have to add tokens to the tokenizer yourself and resize the model’s embedding matrix accordingly. This method will only re-train the embeddings of the tokens you specify. It can also be used in conjunction with LoRA layers! See the LoRA developer guide.
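
For illustration, here is a minimal sketch of that workflow: add new tokens to the tokenizer, resize the embedding matrix, then mark only the new indices as trainable. The model name and the added tokens are placeholders chosen for this example.

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import TrainableTokensConfig, get_peft_model

# Placeholder model; any model with an embedding layer works similarly.
model = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b")
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b")

# Add the new tokens yourself and resize the embedding matrix accordingly;
# Trainable Tokens does not do this for you.
tokenizer.add_tokens(["<placeholder-token-1>", "<placeholder-token-2>"])
model.resize_token_embeddings(len(tokenizer))

# Train only the newly added rows of the embedding matrix.
indices = tokenizer.convert_tokens_to_ids(["<placeholder-token-1>", "<placeholder-token-2>"])
peft_config = TrainableTokensConfig(token_indices=indices)
peft_model = get_peft_model(model, peft_config)
peft_model.print_trainable_parameters()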

TrainableTokensConfig

class peft.TrainableTokensConfig

( task_type: typing.Union[str, peft.utils.peft_types.TaskType, NoneType] = None peft_type: typing.Union[str, peft.utils.peft_types.PeftType, NoneType] = None auto_mapping: typing.Optional[dict] = None base_model_name_or_path: typing.Optional[str] = None revision: typing.Optional[str] = None inference_mode: bool = False token_indices: list[int] = <factory> target_modules: Optional[Union[list[str], str]] = <factory> init_weights: bool = True )

Parameters

  • token_indices (list[int]) — List of integers signifying the indices of the tokens you want to be trainable. To find the index of a token with a tokenizer, you can tokenize the string and look at the returned input_ids (see the sketch at the end of this section). The closer the number of indices is to the total number of tokens, the less efficient this method becomes.
  • target_modules (Optional[Union[list[str], str]]) — List of module names or a regex expression of module names to replace with our TrainableTokensLayer. By default this is the embed_tokens layer, but it could also be multiple embedding-like layers, such as embedding, encoder.embeddings or decoder.embeddings.
  • init_weights (bool) — By default the new token weights are initialized to be the same as the respective token embeddings. This makes TrainableTokens a no-op when not trained. If set to False the weights will be random values. Do not change this setting unless you know exactly what you’re doing.

Configuration for the TrainableTokens method.

Allows for training new tokens (and re-training existing ones) without training the full embedding matrix. By marking a few select tokens (identified by their indices) as trainable and leaving the rest untouched, this method can be used to add new tokens or change the embeddings of existing tokens while saving memory. Both storage and working memory usage are reduced compared to training the full embedding matrix.

Note that training with FSDP/DeepSpeed might not yet be fully supported. Also note that models using weight tying are currently not supported and will raise an error.
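
To find the indices of existing tokens, as the token_indices description above mentions, tokenize a string and inspect the returned input_ids. A minimal sketch (the tokenizer name and the example string are placeholders):

from transformers import AutoTokenizer

# Use the tokenizer that belongs to your model.
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b")

# Tokenize without special tokens so only the tokens of the string itself remain.
token_indices = tokenizer("emotion", add_special_tokens=False)["input_ids"]
print(token_indices)  # these are the indices to pass as TrainableTokensConfig(token_indices=...)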

TrainableTokensModel

class peft.TrainableTokensModel

( model config adapter_name low_cpu_mem_usage: bool = False )

disable_adapter_layers

( )

Disable all adapters.

When disabling all adapters, the model output corresponds to the output of the base model.

enable_adapter_layers

( )

Enable all adapters.

Call this if you have previously disabled all adapters and want to re-enable them.
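
For example, assuming model is a PEFT model using Trainable Tokens, you can temporarily compare against the unmodified base embeddings:

model.disable_adapter_layers()   # outputs now match the base model
# ... run evaluation with the original embeddings ...
model.enable_adapter_layers()    # re-enable the trained token embeddings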

merge_and_unload

( progressbar: bool = False safe_merge: bool = False adapter_names: Optional[list[str]] = None )

Parameters

  • progressbar (bool) — Whether to show a progress bar indicating the unload and merge process.
  • safe_merge (bool) — Whether to activate the safe merging check, which verifies that there are no potential NaNs in the adapter weights.
  • adapter_names (list[str], optional) — The list of adapter names that should be merged. If None, all active adapters will be merged. Defaults to None.

This method merges the trained tokens into the targeted embedding layer(s) of the base model. This is needed if you want to use the base model as a standalone model.
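
A short sketch, assuming model was created with get_peft_model as above and has been trained; the output path is a placeholder:

# Write the trained token embeddings back into the embedding matrix and
# return the base model without adapter modules.
merged_model = model.merge_and_unload(safe_merge=True)
merged_model.save_pretrained("model-with-trained-tokens")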

set_adapter

( adapter_name: str | list[str] )

Parameters

  • adapter_name (str or list[str]) — Name of the adapter(s) to be activated.

Set the active adapter(s).

Additionally, this function will set the specified adapters to trainable (i.e., requires_grad=True). If this is not desired, use the following code.

>>> for name, param in model_peft.named_parameters():
...     if ...:  # some check on name (ex. if 'lora' in name)
...         param.requires_grad = False
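
For example, to activate the adapter named "default" (the name PEFT assigns when you don't specify one):

>>> model_peft.set_adapter("default")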

unload

( )

Gets back the base model by removing all the trainable tokens modules without merging.
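
A one-line sketch, assuming model is a PEFT model using Trainable Tokens:

# Remove the trainable tokens modules and restore the original embedding
# layer(s); the trained deltas are discarded rather than merged.
base_model = model.unload()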
