Configuration

The base class PretrainedConfig implements the common methods for loading/saving a configuration either from a local file or directory, or from a pretrained model configuration provided by the library (downloaded from HuggingFace’s AWS S3 repository).

PretrainedConfig

class transformers.PretrainedConfig(**kwargs)[source]

Base class for all configuration classes. Handles a few parameters common to all models’ configurations as well as methods for loading/downloading/saving configurations.

Note

A configuration file can be loaded and saved to disk. Loading the configuration file and using this file to initialize a model does not load the model weights. It only affects the model’s configuration.

Class attributes (overridden by derived classes):
  • pretrained_config_archive_map: a Python dict with shortcut names (string) as keys and URLs (string) of the associated pretrained model configurations as values.

  • model_type: a string that identifies the model type, that we serialize into the JSON file, and that we use to recreate the correct object in AutoConfig.

Parameters
  • finetuning_task (string or None, optional, defaults to None) – Name of the task used to fine-tune the model. This can be used when converting from an original (TensorFlow or PyTorch) checkpoint.

  • num_labels (int, optional, defaults to 2) – Number of classes to use when the model is a classification model (on sequences or tokens).

  • output_attentions (bool, optional, defaults to False) – Whether the model should return attention weights.

  • output_hidden_states (bool, optional, defaults to False) – Whether the model should return all hidden states.

  • torchscript (bool, optional, defaults to False) – Whether the model is to be used with TorchScript (for PyTorch models).
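The common parameters above can be illustrated with a minimal sketch (a toy class for illustration, not the library implementation): known parameters get defaults, and any extra keyword arguments simply become attributes of the configuration object.

```python
# Toy sketch (not the library code) of how a configuration class stores
# the common parameters described above plus arbitrary extra kwargs.
class ToyConfig:
    def __init__(self, **kwargs):
        # Defaults mirroring the common parameters listed above.
        self.finetuning_task = kwargs.pop("finetuning_task", None)
        self.num_labels = kwargs.pop("num_labels", 2)
        self.output_attentions = kwargs.pop("output_attentions", False)
        self.output_hidden_states = kwargs.pop("output_hidden_states", False)
        self.torchscript = kwargs.pop("torchscript", False)
        # Any remaining keyword arguments become attributes as well.
        for key, value in kwargs.items():
            setattr(self, key, value)

config = ToyConfig(num_labels=5, hidden_size=768)
print(config.num_labels)         # 5
print(config.output_attentions)  # False
```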

classmethod from_dict(config_dict: Dict, **kwargs) → transformers.configuration_utils.PretrainedConfig[source]

Constructs a Config from a Python dictionary of parameters.

Parameters
  • config_dict (Dict[str, any]) – Dictionary that will be used to instantiate the configuration object. Such a dictionary can be retrieved from a pre-trained checkpoint by leveraging the get_config_dict() method.

  • kwargs (Dict[str, any]) – Additional parameters from which to initialize the configuration object.

Returns

An instance of a configuration object

Return type

PretrainedConfig
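The override behavior of kwargs can be sketched as follows (a simplified stand-in, not the library implementation): the dictionary supplies the base values and keyword arguments take precedence.

```python
# Simplified sketch of from_dict semantics: build a config from a dict
# of parameters, letting extra kwargs override the dict's values.
class SimpleConfig:
    def __init__(self, **kwargs):
        self.num_labels = kwargs.pop("num_labels", 2)
        for key, value in kwargs.items():
            setattr(self, key, value)

    @classmethod
    def from_dict(cls, config_dict, **kwargs):
        merged = dict(config_dict)
        merged.update(kwargs)  # keyword arguments take precedence
        return cls(**merged)

config = SimpleConfig.from_dict({"num_labels": 3, "hidden_size": 768},
                                num_labels=5)
print(config.num_labels)   # 5: the keyword override wins
print(config.hidden_size)  # 768
```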

classmethod from_json_file(json_file: str) → transformers.configuration_utils.PretrainedConfig[source]

Constructs a Config from the path to a JSON file of parameters.

Parameters

json_file (string) – Path to the JSON file containing the parameters.

Returns

An instance of a configuration object

Return type

PretrainedConfig
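At its core this amounts to reading a JSON file into a dictionary of parameters, as the stdlib-only sketch below shows (illustrative, not the library code):

```python
import json
import os
import tempfile

# Write a small JSON parameter file, then read it back into a dict,
# as from_json_file does before building the configuration object.
path = os.path.join(tempfile.mkdtemp(), "config.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump({"num_labels": 4, "hidden_size": 256}, f)

with open(path, "r", encoding="utf-8") as f:
    params = json.load(f)
print(params["num_labels"])  # 4
```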

classmethod from_pretrained(pretrained_model_name_or_path, **kwargs) → transformers.configuration_utils.PretrainedConfig[source]

Instantiate a PretrainedConfig (or a derived class) from a pre-trained model configuration.

Parameters
  • pretrained_model_name_or_path (string) –

    either:
    • a string with the shortcut name of a pre-trained model configuration to load from cache or download, e.g.: bert-base-uncased.

    • a string with the identifier name of a pre-trained model configuration that was user-uploaded to our S3, e.g.: dbmdz/bert-base-german-cased.

    • a path to a directory containing a configuration file saved using the save_pretrained() method, e.g.: ./my_model_directory/.

    • a path or url to a saved configuration JSON file, e.g.: ./my_model_directory/configuration.json.

  • cache_dir (string, optional) – Path to a directory in which a downloaded pre-trained model configuration should be cached if the standard cache should not be used.

  • kwargs (Dict[str, any], optional) – The values in kwargs of any keys which are configuration attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are not configuration attributes is controlled by the return_unused_kwargs keyword parameter.

  • force_download (bool, optional, defaults to False) – Whether to force (re-)downloading the model weights and configuration files, overriding the cached versions if they exist.

  • resume_download (bool, optional, defaults to False) – Do not delete an incompletely received file; attempt to resume the download if such a file exists.

  • proxies (Dict, optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g.: {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.

  • return_unused_kwargs (bool, optional, defaults to False) – If False, this function returns only the final configuration object. If True, it returns a tuple (config, unused_kwargs), where unused_kwargs is a dictionary of the key/value pairs whose keys are not configuration attributes, i.e. the part of kwargs that has not been used to update config and is otherwise ignored.

Returns

An instance of a configuration object

Return type

PretrainedConfig

Examples:

# We can't instantiate directly the base class `PretrainedConfig` so let's show the examples on a
# derived class: BertConfig
config = BertConfig.from_pretrained('bert-base-uncased')    # Download configuration from S3 and cache.
config = BertConfig.from_pretrained('./test/saved_model/')  # E.g. config (or model) was saved using `save_pretrained('./test/saved_model/')`
config = BertConfig.from_pretrained('./test/saved_model/my_configuration.json')
config = BertConfig.from_pretrained('bert-base-uncased', output_attentions=True, foo=False)
assert config.output_attentions == True
config, unused_kwargs = BertConfig.from_pretrained('bert-base-uncased', output_attentions=True,
                                                   foo=False, return_unused_kwargs=True)
assert config.output_attentions == True
assert unused_kwargs == {'foo': False}

classmethod get_config_dict(pretrained_model_name_or_path: str, pretrained_config_archive_map: Optional[Dict] = None, **kwargs) → Tuple[Dict, Dict][source]

Resolves a pretrained_model_name_or_path to a dictionary of parameters, to be used to instantiate a Config via from_dict.

Parameters
  • pretrained_model_name_or_path (string) – The identifier of the pre-trained checkpoint from which we want the dictionary of parameters.

  • pretrained_config_archive_map (Dict[str, str], optional) – A map of shortcut names to URLs. By default, the current class attribute is used.

Returns

A tuple whose first element is the dictionary of parameters that will be used to instantiate the configuration object, and whose second element is the dictionary of remaining kwargs.

Return type

Tuple[Dict, Dict]
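The resolution step can be sketched with stdlib code only (the CONFIG_NAME constant and the directory handling here are illustrative assumptions, not the library's exact logic):

```python
import json
import os
import tempfile

CONFIG_NAME = "config.json"  # conventional file name, assumed here

# Sketch of get_config_dict: resolve a directory or file path to the
# parameter dictionary, returning unconsumed kwargs alongside it.
def get_config_dict(name_or_path, **kwargs):
    if os.path.isdir(name_or_path):
        config_file = os.path.join(name_or_path, CONFIG_NAME)
    else:
        config_file = name_or_path  # direct path to a JSON file
    with open(config_file, "r", encoding="utf-8") as f:
        return json.load(f), kwargs

model_dir = tempfile.mkdtemp()
with open(os.path.join(model_dir, CONFIG_NAME), "w", encoding="utf-8") as f:
    json.dump({"model_type": "toy"}, f)

config_dict, unused = get_config_dict(model_dir, foo=False)
print(config_dict["model_type"])  # toy
print(unused)                     # {'foo': False}
```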

save_pretrained(save_directory)[source]

Save a configuration object to the directory save_directory, so that it can be re-loaded using the from_pretrained() class method.

Parameters

save_directory (string) – Directory where the configuration JSON file will be saved.
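The save/load round trip can be sketched as follows (stdlib only; the file name config.json is an assumption for illustration, not a guarantee about the library's internals):

```python
import json
import os
import tempfile

CONFIG_NAME = "config.json"  # conventional file name, assumed here

# Sketch of save_pretrained: write the configuration as a JSON file
# into the given directory so it can be reloaded later.
def save_pretrained(config_dict, save_directory):
    os.makedirs(save_directory, exist_ok=True)
    path = os.path.join(save_directory, CONFIG_NAME)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(config_dict, f, indent=2)

save_dir = os.path.join(tempfile.mkdtemp(), "my_model_directory")
save_pretrained({"num_labels": 3}, save_dir)

with open(os.path.join(save_dir, CONFIG_NAME), encoding="utf-8") as f:
    reloaded = json.load(f)
print(reloaded["num_labels"])  # 3
```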

to_dict()[source]

Serializes this instance to a Python dictionary.

Returns

Dictionary of all the attributes that make up this configuration instance.

Return type

Dict[str, any]

to_diff_dict()[source]

Serializes this instance to a Python dictionary, removing all attributes whose values match the default config attributes, for better readability.

Returns

Dictionary of all the attributes that make up this configuration instance.

Return type

Dict[str, any]
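The diffing idea can be sketched in a few lines (the DEFAULTS dict here is a stand-in for the attributes of a default PretrainedConfig(), not the library's actual defaults):

```python
# Sketch of to_diff_dict: keep only attributes whose values differ
# from the defaults, so the serialized config stays short.
DEFAULTS = {"num_labels": 2, "output_attentions": False,
            "torchscript": False}

def to_diff_dict(config_dict):
    # Keys absent from DEFAULTS are always kept.
    return {key: value for key, value in config_dict.items()
            if DEFAULTS.get(key, object()) != value}

full = {"num_labels": 5, "output_attentions": False, "torchscript": False}
print(to_diff_dict(full))  # {'num_labels': 5}
```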

to_json_file(json_file_path, use_diff=True)[source]

Save this instance to a json file.

Parameters
  • json_file_path (string) – Path to the JSON file in which this configuration instance’s parameters will be saved.

  • use_diff (bool) – If set to True, only the difference between the config instance and the default PretrainedConfig() is serialized to JSON file.

to_json_string(use_diff=True)[source]

Serializes this instance to a JSON string.

Parameters

use_diff (bool) – If set to True, only the difference between the config instance and the default PretrainedConfig() is serialized to JSON string.

Returns

String containing all the attributes that make up this configuration instance in JSON format.

Return type

string

update(config_dict: Dict)[source]

Updates attributes of this class with attributes from config_dict.

Parameters

config_dict (Dict[str, any]) – Dictionary of attributes that shall be updated for this class.
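A minimal sketch of the behavior (a toy class for illustration, not the library code): each key/value pair becomes or overwrites an attribute on the instance.

```python
# Sketch of update(): copy key/value pairs from a dict onto the
# configuration object as attributes.
class BasicConfig:
    def __init__(self):
        self.num_labels = 2

    def update(self, config_dict):
        for key, value in config_dict.items():
            setattr(self, key, value)

config = BasicConfig()
config.update({"num_labels": 7, "hidden_size": 512})
print(config.num_labels)   # 7
print(config.hidden_size)  # 512
```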