Configuration

The base class PretrainedConfig implements the common methods for loading/saving a configuration either from a local file or directory, or from a pretrained model configuration provided by the library (downloaded from HuggingFace’s AWS S3 repository).

PretrainedConfig

class transformers.PretrainedConfig(**kwargs)[source]

Base class for all configuration classes. Handles a few parameters common to all models’ configurations as well as methods for loading/downloading/saving configurations.

Note

A configuration file can be loaded and saved to disk. Loading the configuration file and using this file to initialize a model does not load the model weights. It only affects the model’s configuration.

Class attributes (overridden by derived classes):
  • pretrained_config_archive_map: a Python dict with shortcut names (string) as keys and URLs (string) of associated pretrained model configurations as values.

  • model_type: a string that identifies the model type, that we serialize into the JSON file, and that we use to recreate the correct object in AutoConfig.

Parameters
  • finetuning_task (string or None, optional, defaults to None) – Name of the task used to fine-tune the model. This can be used when converting from an original (TensorFlow or PyTorch) checkpoint.

  • num_labels (int, optional, defaults to 2) – Number of classes to use when the model is a classification model (sequences/tokens).

  • output_attentions (bool, optional, defaults to False) – Whether the model should return attention weights.

  • output_hidden_states (bool, optional, defaults to False) – Whether the model should return all hidden states.

  • torchscript (bool, optional, defaults to False) – Whether the model is used with TorchScript (for PyTorch models).
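
These common parameters can be passed to any derived configuration class alongside model-specific ones. A minimal sketch, using the derived class BertConfig since the base class is not meant to be instantiated directly:

```python
from transformers import BertConfig

# Common PretrainedConfig parameters are accepted by any derived
# configuration class; unspecified attributes keep their defaults.
config = BertConfig(num_labels=3, output_attentions=True)

print(config.num_labels)         # 3
print(config.output_attentions)  # True
```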

classmethod from_dict(config_dict: Dict, **kwargs) → transformers.configuration_utils.PretrainedConfig[source]

Constructs a Config from a Python dictionary of parameters.

Parameters
  • config_dict (Dict[str, any]) – Dictionary that will be used to instantiate the configuration object. Such a dictionary can be retrieved from a pre-trained checkpoint by leveraging the get_config_dict() method.

  • kwargs (Dict[str, any]) – Additional parameters from which to initialize the configuration object.

Returns

An instance of a configuration object

Return type

PretrainedConfig
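
For instance, a sketch using the derived class BertConfig (the keys below are ordinary BERT configuration attributes; anything not listed keeps its default value):

```python
from transformers import BertConfig

# Any configuration attribute can be supplied through the dictionary.
config_dict = {"hidden_size": 128, "num_attention_heads": 4}
config = BertConfig.from_dict(config_dict)

print(config.hidden_size)          # 128
print(config.num_attention_heads)  # 4
```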

classmethod from_json_file(json_file: str) → transformers.configuration_utils.PretrainedConfig[source]

Constructs a Config from the path to a JSON file of parameters.

Parameters

json_file (string) – Path to the JSON file containing the parameters.

Returns

An instance of a configuration object

Return type

PretrainedConfig
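
A sketch of the round trip, writing a minimal parameter file with the standard library and loading it through the derived class BertConfig:

```python
import json
import os
import tempfile

from transformers import BertConfig

# Write a minimal parameter file; attributes not listed keep their defaults.
path = os.path.join(tempfile.mkdtemp(), "my_config.json")
with open(path, "w") as f:
    json.dump({"hidden_size": 256}, f)

config = BertConfig.from_json_file(path)
print(config.hidden_size)  # 256
```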

classmethod from_pretrained(pretrained_model_name_or_path, **kwargs) → transformers.configuration_utils.PretrainedConfig[source]

Instantiate a PretrainedConfig (or a derived class) from a pre-trained model configuration.

Parameters
  • pretrained_model_name_or_path (string) –

    either:
    • a string with the shortcut name of a pre-trained model configuration to load from cache or download, e.g.: bert-base-uncased.

    • a string with the identifier name of a pre-trained model configuration that was user-uploaded to our S3, e.g.: dbmdz/bert-base-german-cased.

    • a path to a directory containing a configuration file saved using the save_pretrained() method, e.g.: ./my_model_directory/.

    • a path or url to a saved configuration JSON file, e.g.: ./my_model_directory/configuration.json.

  • cache_dir (string, optional) – Path to a directory in which a downloaded pre-trained model configuration should be cached if the standard cache should not be used.

  • kwargs (Dict[str, any], optional) – Values in kwargs whose keys are configuration attributes will override the loaded values. Behavior for key/value pairs whose keys are not configuration attributes is controlled by the return_unused_kwargs parameter.

  • force_download (bool, optional, defaults to False) – Whether to force (re-)downloading the model weights and configuration files, overriding the cached versions if they exist.

  • resume_download (bool, optional, defaults to False) – Do not delete an incompletely received file; attempt to resume the download if such a file exists.

  • proxies (Dict, optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g.: {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.

  • return_unused_kwargs (bool, optional, defaults to False) – If False, this function returns only the final configuration object. If True, it returns a tuple (config, unused_kwargs), where unused_kwargs is a dictionary of the key/value pairs whose keys are not configuration attributes, i.e. the part of kwargs that has not been used to update config and is otherwise ignored.

Returns

An instance of a configuration object

Return type

PretrainedConfig

Examples:

# We can't instantiate the base class `PretrainedConfig` directly, so the examples below use a
# derived class: BertConfig
config = BertConfig.from_pretrained('bert-base-uncased')    # Download configuration from S3 and cache.
config = BertConfig.from_pretrained('./test/saved_model/')  # E.g. config (or model) was saved using `save_pretrained('./test/saved_model/')`
config = BertConfig.from_pretrained('./test/saved_model/my_configuration.json')
config = BertConfig.from_pretrained('bert-base-uncased', output_attentions=True, foo=False)
assert config.output_attentions == True
config, unused_kwargs = BertConfig.from_pretrained('bert-base-uncased', output_attentions=True,
                                                   foo=False, return_unused_kwargs=True)
assert config.output_attentions == True
assert unused_kwargs == {'foo': False}

classmethod get_config_dict(pretrained_model_name_or_path: str, pretrained_config_archive_map: Optional[Dict] = None, **kwargs) → Tuple[Dict, Dict][source]

From a pretrained_model_name_or_path, resolve to a dictionary of parameters, to be used for instantiating a Config using from_dict.

Parameters
  • pretrained_model_name_or_path (string) – The identifier of the pre-trained checkpoint from which we want the dictionary of parameters.

  • pretrained_config_archive_map (Dict[str, str], optional) – A map of shortcut names to URLs. By default, the current class attribute is used.

Returns

The dictionary that will be used to instantiate the configuration object.

Return type

Tuple[Dict, Dict]
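
A sketch using a local directory as the identifier (so no download is needed), again on the derived class BertConfig:

```python
import tempfile

from transformers import BertConfig

# Save a configuration locally, then resolve it back to the raw
# parameter dictionary that from_dict() would consume.
directory = tempfile.mkdtemp()
BertConfig(hidden_size=96).save_pretrained(directory)

config_dict, unused_kwargs = BertConfig.get_config_dict(directory)
print(config_dict["hidden_size"])  # 96
```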

save_pretrained(save_directory)[source]

Save a configuration object to the directory save_directory, so that it can be re-loaded using the from_pretrained() class method.

Parameters

save_directory (string) – Directory where the configuration JSON file will be saved.
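
A sketch of the save/load round trip on the derived class BertConfig:

```python
import tempfile

from transformers import BertConfig

# Save to a directory, then reload with the from_pretrained() class method.
config = BertConfig(num_hidden_layers=2)
directory = tempfile.mkdtemp()
config.save_pretrained(directory)

reloaded = BertConfig.from_pretrained(directory)
print(reloaded.num_hidden_layers)  # 2
```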

to_dict()[source]

Serializes this instance to a Python dictionary.

Returns

Dictionary of all the attributes that make up this configuration instance.

Return type

Dict[str, any]
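
For instance, on the derived class BertConfig:

```python
from transformers import BertConfig

# The resulting dictionary holds every attribute of the configuration instance.
config_dict = BertConfig().to_dict()
print(config_dict["vocab_size"])  # 30522, the BERT default
```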

to_json_file(json_file_path)[source]

Save this instance to a JSON file.

Parameters

json_file_path (string) – Path to the JSON file in which this configuration instance’s parameters will be saved.
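
The saved file is plain JSON, so it can be read back with the standard library. A sketch on the derived class BertConfig:

```python
import json
import os
import tempfile

from transformers import BertConfig

path = os.path.join(tempfile.mkdtemp(), "config.json")
BertConfig(intermediate_size=512).to_json_file(path)

# Read the file back as ordinary JSON.
with open(path) as f:
    saved = json.load(f)
print(saved["intermediate_size"])  # 512
```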

to_json_string()[source]

Serializes this instance to a JSON string.

Returns

String containing all the attributes that make up this configuration instance in JSON format.

Return type

string
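
For instance, on the derived class BertConfig:

```python
import json

from transformers import BertConfig

json_string = BertConfig(hidden_size=64).to_json_string()

# The string is valid JSON and round-trips through the standard library.
parsed = json.loads(json_string)
print(parsed["hidden_size"])  # 64
```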