File size: 5,499 Bytes

ee56c4e

# Metadata Parsing

Given the simplicity of the format, it's very simple and efficient to fetch and parse metadata about Safetensors weights – i.e. the list of tensors, their types, and their shapes or numbers of parameters – using small [(Range) HTTP requests](https://developer.mozilla.org/en-US/docs/Web/HTTP/Range_requests).

This parsing has been implemented in JS in [`huggingface.js`](https://huggingface.co/docs/huggingface.js/main/en/hub/modules#parsesafetensorsmetadata) (sample code follows below), but it would be similar in any language.

## Example use case

There can be many potential use cases. For instance, we use it on the HuggingFace Hub to display info about models which have safetensors weights:

<div class="flex justify-center">
    <img class="block dark:hidden" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/safetensors/model-page-light.png"/>
    <img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/safetensors/model-page-dark.png"/>
</div>

<div class="flex justify-center">
    <img class="block dark:hidden" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/safetensors/view-all-tensors-light.png"/>
    <img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/safetensors/view-all-tensors-dark.png"/>
</div>

## Usage

### JavaScript/TypeScript[[js]]

Using [`huggingface.js`](https://huggingface.co/docs/huggingface.js)

```ts
import { parseSafetensorsMetadata } from "@huggingface/hub";

const info = await parseSafetensorsMetadata({
	repo: { type: "model", name: "bigscience/bloom" },
});

console.log(info)
// {
//   sharded: true,
//   index: {
//     metadata: { total_size: 352494542848 },
//     weight_map: {
//       'h.0.input_layernorm.bias': 'model_00002-of-00072.safetensors',
//       ...
//     }
//   },
//   headers: {
//     __metadata__: {'format': 'pt'},
//     'h.2.attn.c_attn.weight': {'dtype': 'F32', 'shape': [768, 2304], 'data_offsets': [541012992, 548090880]},
//     ...
//   }
// }
```

Depending on whether the safetensors weights are sharded into multiple files or not, the output of the call above will be:

```ts
export type SafetensorsParseFromRepo =
| {
		sharded: false;
		header: SafetensorsFileHeader;
	}
| {
		sharded: true;
		index: SafetensorsIndexJson;
		headers: SafetensorsShardedHeaders;
	};
```

where the underlying `types` are the following:

```ts
type FileName = string;

type TensorName = string;
type Dtype = "F64" | "F32" | "F16" | "BF16" | "I64" | "I32" | "I16" | "I8" | "U8" | "BOOL";

interface TensorInfo {
	dtype: Dtype;
	shape: number[];
	data_offsets: [number, number];
}

type SafetensorsFileHeader = Record<TensorName, TensorInfo> & {
	__metadata__: Record<string, string>;
};

interface SafetensorsIndexJson {
	weight_map: Record<TensorName, FileName>;
}

export type SafetensorsShardedHeaders = Record<FileName, SafetensorsFileHeader>;

```

### Python

In this example python script, we are parsing metadata of [gpt2](https://huggingface.co/gpt2/blob/main/model.safetensors).

```python
import requests # pip install requests
import struct

def parse_single_file(url):
    # Fetch the first 8 bytes of the file
    headers = {'Range': 'bytes=0-7'}
    response = requests.get(url, headers=headers)
    # Interpret the bytes as a little-endian unsigned 64-bit integer
    length_of_header = struct.unpack('<Q', response.content)[0]
    # Fetch length_of_header bytes starting from the 9th byte
    headers = {'Range': f'bytes=8-{7 + length_of_header}'}
    response = requests.get(url, headers=headers)
    # Interpret the response as a JSON object
    header = response.json()
    return header

url = "https://huggingface.co/gpt2/resolve/main/model.safetensors"
header = parse_single_file(url)

print(header)
# {
#   "__metadata__": { "format": "pt" },
#   "h.10.ln_1.weight": {
#     "dtype": "F32",
#     "shape": [768],
#     "data_offsets": [223154176, 223157248]
#   },
#   ...
# }
```

## Example output

For instance, here are the number of params per dtype for a few models on the HuggingFace Hub. Also see [this issue](https://github.com/huggingface/safetensors/issues/44) for more examples of usage.

model | safetensors | params
--- | --- | ---
[gpt2](https://huggingface.co/gpt2?show_tensors=true) | single-file | { 'F32' => 137022720 }
[roberta-base](https://huggingface.co/roberta-base?show_tensors=true) | single-file | { 'F32' => 124697433, 'I64' => 514 }
[Jean-Baptiste/camembert-ner](https://huggingface.co/Jean-Baptiste/camembert-ner?show_tensors=true) | single-file | { 'F32' => 110035205, 'I64' => 514 }
[roberta-large](https://huggingface.co/roberta-large?show_tensors=true) | single-file | { 'F32' => 355412057, 'I64' => 514 }
[distilbert-base-german-cased](https://huggingface.co/distilbert-base-german-cased?show_tensors=true) | single-file | { 'F32' => 67431550 }
[EleutherAI/gpt-neox-20b](https://huggingface.co/EleutherAI/gpt-neox-20b?show_tensors=true) | sharded | { 'F16' => 20554568208, 'U8' => 184549376 }
[bigscience/bloom-560m](https://huggingface.co/bigscience/bloom-560m?show_tensors=true) | single-file | { 'F16' => 559214592 }
[bigscience/bloom](https://huggingface.co/bigscience/bloom?show_tensors=true) | sharded | { 'BF16' => 176247271424 }
[bigscience/bloom-3b](https://huggingface.co/bigscience/bloom-3b?show_tensors=true) | single-file | { 'F16' => 3002557440 }