Optimum documentation

Add support for new model architectures


To contribute and add support for a model architecture that is not currently supported by the optimum.graphcore library, you will have to:

  1. Make sure the original model implementation inherits from transformers.PreTrainedModel. This is not strictly required, but it is highly recommended because it gives access to all the features.
  2. Create a “pipelined” version of the original class. To do that, define a class that inherits from both the original class and PipelineMixin, and override its parallelize method to describe how the model is placed on the IPUs.
  3. Register the pipelined version of the class. This enables the IPUTrainer class to automatically convert an instance of the original model to its pipelined counterpart.
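The registration step can be pictured as a lookup table from original classes to their pipelined subclasses. The sketch below is a simplified stand-in written for illustration only, not the optimum.graphcore implementation: register_pipelined, find_pipelined_class, TinyModel and PipelinedTinyModel are all hypothetical names.

```python
# Simplified stand-in for a class registry; NOT the optimum.graphcore implementation.
_PIPELINED_CLASSES = {}  # maps an original class to its pipelined subclass


def register_pipelined(original_cls):
    """Hypothetical decorator: record the decorated class as the pipelined version of original_cls."""

    def decorator(pipelined_cls):
        _PIPELINED_CLASSES[original_cls] = pipelined_cls
        return pipelined_cls

    return decorator


def find_pipelined_class(model):
    """Return the pipelined class registered for the exact type of model, or None."""
    return _PIPELINED_CLASSES.get(type(model))


class TinyModel:  # stands in for an original model class
    pass


@register_pipelined(TinyModel)
class PipelinedTinyModel(TinyModel):  # stands in for its pipelined counterpart
    pass


found = find_pipelined_class(TinyModel())
print(found.__name__)  # PipelinedTinyModel
```

A lookup of this kind is what would let a trainer start from an instance of the original class and pick the matching pipelined class without the user naming it explicitly.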

Example: transformers.ViTForImageClassification to PipelinedViTForImageClassification

import poptorch
import transformers
from optimum.utils import logging
from optimum.graphcore.modeling_utils import PipelineMixin, get_layer_ipu, recomputation_checkpoint, register


logger = logging.get_logger(__name__)

@register(transformers.ViTForImageClassification)
class PipelinedViTForImageClassification(transformers.ViTForImageClassification, PipelineMixin):
    def parallelize(self):
        super().parallelize()
        logger.info("---------- Device Allocation -----------")
        logger.info("Embedding  --> IPU 0")
        self.vit.embeddings = poptorch.BeginBlock(self.vit.embeddings, "Embedding", ipu_id=0)

        layer_ipu = get_layer_ipu(self.ipu_config.layers_per_ipu, self.vit.encoder.layer)
        for index, layer in enumerate(self.vit.encoder.layer):
            if self.ipu_config.recompute_checkpoint_every_layer:
                # Put checkpoints on every encoder layer
                h = recomputation_checkpoint(layer)
                self._hooks.append(h)
            ipu = layer_ipu[index]
            logger.info(f"Encoder {index:<2} --> IPU {ipu}")
            self.vit.encoder.layer[index] = poptorch.BeginBlock(layer, f"Encoder{index}", ipu_id=ipu)

        # The final layer norm and the classification head go on the last IPU of the replica
        last_ipu = self.ipu_config.ipus_per_replica - 1
        logger.info(f"Head       --> IPU {last_ipu}")
        logger.info("---------------------------------------")
        self.vit.layernorm = poptorch.BeginBlock(self.vit.layernorm, "LayerNorm", ipu_id=last_ipu)
        self.classifier = poptorch.BeginBlock(self.classifier, "Classifier", ipu_id=last_ipu)
        return self

As you can see, you specify where each part of the model is placed by wrapping it in poptorch.BeginBlock, which takes a layer, a block name, and an IPU ID as inputs. To decide which IPU ID to use, rely on the ipu_config.layers_per_ipu attribute; for more information, see the documentation for IPUConfig.
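To illustrate how a list of per-IPU layer counts can translate into one IPU ID per encoder layer, here is a minimal sketch. It assumes layers_per_ipu is a plain list of non-negative integers; expand_layers_per_ipu is a hypothetical helper, not the library's get_layer_ipu, which may validate its input or handle additional cases.

```python
def expand_layers_per_ipu(layers_per_ipu):
    """Hypothetical helper: turn per-IPU layer counts into one IPU ID per layer."""
    layer_ipu = []
    for ipu_id, num_layers in enumerate(layers_per_ipu):
        # The next num_layers layers are all assigned to this IPU
        layer_ipu.extend([ipu_id] * num_layers)
    return layer_ipu


# 12 encoder layers spread over 4 IPUs
layer_ipu = expand_layers_per_ipu([2, 3, 3, 4])
print(layer_ipu)  # [0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3]
```

With such a list, layer_ipu[index] gives the IPU ID to pass to poptorch.BeginBlock for the encoder layer at position index, which is how the example above uses the result of get_layer_ipu.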

PipelineMixin

class optimum.graphcore.modeling_utils.PipelineMixin

( )

parallelize

( )

Transforms the model to run in an IPU pipeline.

deparallelize

( )

Undoes the changes to the model done by parallelize. You should call this function before calling save_pretrained so that the model.state_dict dictionary is fully compatible with the original model.

from_transformers

( model: PreTrainedModel, ipu_config: IPUConfig )

Parameters

  • model (PreTrainedModel) — The model to convert to a pipelined model.
  • ipu_config (IPUConfig) — The IPUConfig instance of the pipelined model.

Creates a pipelined version of model from a PreTrainedModel instance.

from_pretrained_transformers

( model_name_or_path: str, ipu_config: IPUConfig, *model_args, **kwargs )

Parameters

  • model_name_or_path (str) — The model name or path.
  • ipu_config (IPUConfig) — The IPUConfig of the pipelined model.
  • model_args (Tuple[Any]) — The positional arguments to use when instantiating the model.
  • kwargs (Dict[str, Any]) — The keyword arguments to use when instantiating the model.

Creates a pipelined version of a model by using the from_pretrained function.

ipu_config

( )

Checks that the model has an IPUConfig attached, and returns it.