PEFT documentation

Polytropon


Polytropon is a multitask model with a number of different LoRA adapters in its “inventory”. The model learns the correct combination of adapters from the inventory with a routing function that chooses the best subset of modules for a specific task. PEFT also supports Multi-Head Adapter Routing (MHR) for Polytropon, which builds on and improves the routing function by combining the adapter heads more granularly: the adapter heads are separated into disjoint blocks and a different routing function is learned for each one, allowing for more expressivity.
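The routing idea can be sketched in plain Python. This is an illustrative toy, not PEFT's actual implementation: the logit layout `logits[task][split][skill]` is an assumption, and scalars stand in for the LoRA parameter blocks. Each task gets a softmax distribution over skills per split; with `n_splits == 1` this is plain Polytropon routing, and with `n_splits > 1` each split (MHR) learns its own mixture.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of logits
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route(logits, task_id):
    # logits[task][split][skill]: learned routing logits (hypothetical layout).
    # Returns one mixing distribution over skills per split.
    return [softmax(split_logits) for split_logits in logits[task_id]]

def combine(skill_blocks, mix):
    # skill_blocks[skill][split]: one block of a skill's LoRA parameters,
    # represented as a scalar here for brevity.
    n_splits = len(mix)
    n_skills = len(skill_blocks)
    return [
        sum(mix[s][k] * skill_blocks[k][s] for k in range(n_skills))
        for s in range(n_splits)
    ]

# 1 task, 2 splits (MHR), 3 skills; this task prefers skill 2 in
# split 0 and skill 0 in split 1
logits = [[[0.0, 1.0, 4.0], [4.0, 1.0, 0.0]]]
skill_blocks = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
mix = route(logits, task_id=0)
merged = combine(skill_blocks, mix)
```

Because each split has its own distribution, the merged parameters for split 0 end up close to skill 2's block, while split 1 leans toward skill 0's block.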

Combining Modular Skills in Multitask Learning
Multi-Head Adapter Routing for Cross-Task Generalization

The abstract from the Polytropon paper is:

A modular design encourages neural models to disentangle and recombine different facets of knowledge to generalise more systematically to new tasks. In this work, we assume that each task is associated with a subset of latent discrete skills from a (potentially small) inventory. In turn, skills correspond to parameter-efficient (sparse / low-rank) model parameterisations. By jointly learning these and a task-skill allocation matrix, the network for each task is instantiated as the average of the parameters of active skills. To favour non-trivial soft partitions of skills across tasks, we experiment with a series of inductive biases, such as an Indian Buffet Process prior and a two-speed learning rate. We evaluate our latent-skill model on two main settings: 1) multitask reinforcement learning for grounded instruction following on 8 levels of the BabyAI platform; and 2) few-shot adaptation of pre-trained text-to-text generative models on CrossFit, a benchmark comprising 160 NLP tasks. We find that the modular design of a network significantly increases sample efficiency in reinforcement learning and few-shot generalisation in supervised learning, compared to baselines with fully shared, task-specific, or conditionally generated parameters where knowledge is entangled across tasks. In addition, we show how discrete skills help interpretability, as they yield an explicit hierarchy of tasks.

PolyConfig

class peft.PolyConfig

< >

( peft_type: Union = None auto_mapping: Optional = None base_model_name_or_path: Optional = None revision: Optional = None task_type: Union = None inference_mode: bool = False r: int = 8 target_modules: Union = None modules_to_save: Optional = None init_weights: bool = True poly_type: Literal = 'poly' n_tasks: int = 1 n_skills: int = 4 n_splits: int = 1 )

Parameters

  • r (int) — Attention dimension of each LoRA in Poly.
  • target_modules (Union[List[str],str]) — The names of the modules to apply Poly to.
  • modules_to_save (List[str]) — List of modules apart from Poly layers to be set as trainable and saved in the final checkpoint.
  • init_weights (bool) — Whether to perform initialization of Poly weights.
  • poly_type (Literal["poly"]) — The variant of the Poly module to use. Currently, only “poly” is supported.
  • n_tasks (int) — The number of tasks in a multitasking scenario.
  • n_skills (int) — The number of skills (LoRA adapters) in each Poly layer.
  • n_splits (int) — The number of splits within each LoRA of a Poly layer. A value greater than 1 indicates the use of Multi-Head Routing (MHR).

This is the configuration class to store the configuration of a PolyModel.
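For example, a configuration for a sequence-to-sequence model with Multi-Head Routing enabled by setting n_splits greater than 1 might look like this (the hyperparameter values are illustrative):

```python
from peft import PolyConfig

config = PolyConfig(
    task_type="SEQ_2_SEQ_LM",
    poly_type="poly",   # currently the only supported variant
    r=8,                # rank of each LoRA skill
    n_tasks=8,          # number of tasks the router distinguishes
    n_skills=4,         # LoRA adapters per Poly layer
    n_splits=4,         # > 1 enables Multi-Head Routing (MHR)
)
```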

PolyModel

class peft.PolyModel

< >

( model config adapter_name )
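A minimal sketch of wrapping a base model with Poly adapters via get_peft_model (the model name is illustrative, and the forward-pass arguments are placeholders). Since routing is learned per task, the forward pass needs task ids alongside the usual inputs:

```python
from transformers import AutoModelForSeq2SeqLM
from peft import PolyConfig, get_peft_model

base_model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")
peft_config = PolyConfig(task_type="SEQ_2_SEQ_LM", n_tasks=8, n_skills=4)
model = get_peft_model(base_model, peft_config)
model.print_trainable_parameters()

# The router selects each example's mixture of skills based on its task,
# so a task_ids tensor is passed with the batch:
# outputs = model(input_ids=..., attention_mask=..., labels=..., task_ids=...)
```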