Serialization
huggingface_hub
contains helpers to help ML libraries to serialize models weights in a standardized way. This part of the lib is still under development and will be improved in future releases. The goal is to harmonize how weights are serialized on the Hub, both to remove code duplication across libraries and to foster conventions on the Hub.
Split state dict into shards
At the moment, this module contains a single helper that takes a state dictionary (e.g. a mapping between layer names and related tensors) and split it into several shards, while creating a proper index in the process. This helper is available for torch
, tensorflow
and numpy
tensors and is designed to be easily extended to any other ML frameworks.
split_numpy_state_dict_into_shards
huggingface_hub.split_numpy_state_dict_into_shards
< source >( state_dict: Dict filename_pattern: str = 'model{suffix}.safetensors' max_shard_size: int = 5000000000 ) → StateDictSplit
Parameters
- state_dict (
Dict[str, np.ndarray]
) — The state dictionary to save. - filename_pattern (
str
, optional) — The pattern to generate the files names in which the model will be saved. Pattern must be a string that can be formatted withfilename_pattern.format(suffix=...)
and must contain the keywordsuffix
Defaults to"model{suffix}.safetensors"
. - max_shard_size (
int
orstr
, optional) — The maximum size of each shard, in bytes. Defaults to 5GB.
Returns
StateDictSplit
A StateDictSplit
object containing the shards and the index to retrieve them.
Split a model state dictionary in shards so that each shard is smaller than a given size.
The shards are determined by iterating through the state_dict
in the order of its keys. There is no optimization
made to make each shard as close as possible to the maximum size passed. For example, if the limit is 10GB and we
have tensors of sizes [6GB, 6GB, 2GB, 6GB, 2GB, 2GB] they will get sharded as [6GB], [6+2GB], [6+2+2GB] and not
[6+2+2GB], [6+2GB], [6GB].
If one of the model’s tensor is bigger than max_shard_size
, it will end up in its own shard which will have a
size greater than max_shard_size
.
split_tf_state_dict_into_shards
huggingface_hub.split_tf_state_dict_into_shards
< source >( state_dict: Dict filename_pattern: str = 'tf_model{suffix}.h5' max_shard_size: int = 5000000000 ) → StateDictSplit
Parameters
- state_dict (
Dict[str, Tensor]
) — The state dictionary to save. - filename_pattern (
str
, optional) — The pattern to generate the files names in which the model will be saved. Pattern must be a string that can be formatted withfilename_pattern.format(suffix=...)
and must contain the keywordsuffix
Defaults to"tf_model{suffix}.h5"
. - max_shard_size (
int
orstr
, optional) — The maximum size of each shard, in bytes. Defaults to 5GB.
Returns
StateDictSplit
A StateDictSplit
object containing the shards and the index to retrieve them.
Split a model state dictionary in shards so that each shard is smaller than a given size.
The shards are determined by iterating through the state_dict
in the order of its keys. There is no optimization
made to make each shard as close as possible to the maximum size passed. For example, if the limit is 10GB and we
have tensors of sizes [6GB, 6GB, 2GB, 6GB, 2GB, 2GB] they will get sharded as [6GB], [6+2GB], [6+2+2GB] and not
[6+2+2GB], [6+2GB], [6GB].
If one of the model’s tensor is bigger than max_shard_size
, it will end up in its own shard which will have a
size greater than max_shard_size
.
split_torch_state_dict_into_shards
huggingface_hub.split_torch_state_dict_into_shards
< source >( state_dict: Dict filename_pattern: str = 'model{suffix}.safetensors' max_shard_size: int = 5000000000 ) → StateDictSplit
Parameters
- state_dict (
Dict[str, torch.Tensor]
) — The state dictionary to save. - filename_pattern (
str
, optional) — The pattern to generate the files names in which the model will be saved. Pattern must be a string that can be formatted withfilename_pattern.format(suffix=...)
and must contain the keywordsuffix
Defaults to"model{suffix}.safetensors"
. - max_shard_size (
int
orstr
, optional) — The maximum size of each shard, in bytes. Defaults to 5GB.
Returns
StateDictSplit
A StateDictSplit
object containing the shards and the index to retrieve them.
Split a model state dictionary in shards so that each shard is smaller than a given size.
The shards are determined by iterating through the state_dict
in the order of its keys. There is no optimization
made to make each shard as close as possible to the maximum size passed. For example, if the limit is 10GB and we
have tensors of sizes [6GB, 6GB, 2GB, 6GB, 2GB, 2GB] they will get sharded as [6GB], [6+2GB], [6+2+2GB] and not
[6+2+2GB], [6+2GB], [6GB].
If one of the model’s tensor is bigger than max_shard_size
, it will end up in its own shard which will have a
size greater than max_shard_size
.
Example:
>>> import json
>>> import os
>>> from safetensors.torch import save_file as safe_save_file
>>> from huggingface_hub import split_torch_state_dict_into_shards
>>> def save_state_dict(state_dict: Dict[str, torch.Tensor], save_directory: str):
... state_dict_split = split_torch_state_dict_into_shards(state_dict)
... for filename, tensors in state_dict_split.filename_to_tensors.values():
... shard = {tensor: state_dict[tensor] for tensor in tensors}
... safe_save_file(
... shard,
... os.path.join(save_directory, filename),
... metadata={"format": "pt"},
... )
... if state_dict_split.is_sharded:
... index = {
... "metadata": state_dict_split.metadata,
... "weight_map": state_dict_split.tensor_to_filename,
... }
... with open(os.path.join(save_directory, "model.safetensors.index.json"), "w") as f:
... f.write(json.dumps(index, indent=2))
split_state_dict_into_shards_factory
This is the underlying factory from which each framework-specific helper is derived. In practice, you are not expected to use this factory directly except if you need to adapt it to a framework that is not yet supported. If that is the case, please let us know by opening a new issue on the huggingface_hub
repo.
huggingface_hub.split_state_dict_into_shards_factory
< source >( state_dict: Dict get_tensor_size: Callable get_storage_id: Callable = <function <lambda> at 0x7ff2da38d2d0> filename_pattern: str = 'model{suffix}.safetensors' max_shard_size: int = 5000000000 ) → StateDictSplit
Parameters
- state_dict (
Dict[str, Tensor]
) — The state dictionary to save. - get_tensor_size (
Callable[[Tensor], int]
) — A function that returns the size of a tensor in bytes. - get_storage_id (
Callable[[Tensor], Optional[Any]]
, optional) — A function that returns a unique identifier to a tensor storage. Multiple different tensors can share the same underlying storage. This identifier is guaranteed to be unique and constant for this tensor’s storage during its lifetime. Two tensor storages with non-overlapping lifetimes may have the same id. - filename_pattern (
str
, optional) — The pattern to generate the files names in which the model will be saved. Pattern must be a string that can be formatted withfilename_pattern.format(suffix=...)
and must contain the keywordsuffix
Defaults to"model{suffix}.safetensors"
. - max_shard_size (
int
orstr
, optional) — The maximum size of each shard, in bytes. Defaults to 5GB.
Returns
StateDictSplit
A StateDictSplit
object containing the shards and the index to retrieve them.
Split a model state dictionary in shards so that each shard is smaller than a given size.
The shards are determined by iterating through the state_dict
in the order of its keys. There is no optimization
made to make each shard as close as possible to the maximum size passed. For example, if the limit is 10GB and we
have tensors of sizes [6GB, 6GB, 2GB, 6GB, 2GB, 2GB] they will get sharded as [6GB], [6+2GB], [6+2+2GB] and not
[6+2+2GB], [6+2GB], [6GB].
If one of the model’s tensor is bigger than max_shard_size
, it will end up in its own shard which will have a
size greater than max_shard_size
.