salveamicus's picture
Upload sans models
ee56c4e
raw
history blame
No virus
5.5 kB
# Metadata Parsing
Given the simplicity of the format, it's very simple and efficient to fetch and parse metadata about Safetensors weights – i.e. the list of tensors, their types, and their shapes or numbers of parameters – using small [(Range) HTTP requests](https://developer.mozilla.org/en-US/docs/Web/HTTP/Range_requests).
This parsing has been implemented in JS in [`huggingface.js`](https://huggingface.co/docs/huggingface.js/main/en/hub/modules#parsesafetensorsmetadata) (sample code follows below), but it would be similar in any language.
## Example use case
There can be many potential use cases. For instance, we use it on the HuggingFace Hub to display info about models which have safetensors weights:
<div class="flex justify-center">
<img class="block dark:hidden" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/safetensors/model-page-light.png"/>
<img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/safetensors/model-page-dark.png"/>
</div>
<div class="flex justify-center">
<img class="block dark:hidden" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/safetensors/view-all-tensors-light.png"/>
<img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/safetensors/view-all-tensors-dark.png"/>
</div>
## Usage
### JavaScript/TypeScript[[js]]
Using [`huggingface.js`](https://huggingface.co/docs/huggingface.js)
```ts
import { parseSafetensorsMetadata } from "@huggingface/hub";
const info = await parseSafetensorsMetadata({
repo: { type: "model", name: "bigscience/bloom" },
});
console.log(info)
// {
// sharded: true,
// index: {
// metadata: { total_size: 352494542848 },
// weight_map: {
// 'h.0.input_layernorm.bias': 'model_00002-of-00072.safetensors',
// ...
// }
// },
// headers: {
// __metadata__: {'format': 'pt'},
// 'h.2.attn.c_attn.weight': {'dtype': 'F32', 'shape': [768, 2304], 'data_offsets': [541012992, 548090880]},
// ...
// }
// }
```
Depending on whether the safetensors weights are sharded into multiple files or not, the output of the call above will be:
```ts
export type SafetensorsParseFromRepo =
| {
sharded: false;
header: SafetensorsFileHeader;
}
| {
sharded: true;
index: SafetensorsIndexJson;
headers: SafetensorsShardedHeaders;
};
```
where the underlying `types` are the following:
```ts
type FileName = string;
type TensorName = string;
type Dtype = "F64" | "F32" | "F16" | "BF16" | "I64" | "I32" | "I16" | "I8" | "U8" | "BOOL";
interface TensorInfo {
dtype: Dtype;
shape: number[];
data_offsets: [number, number];
}
type SafetensorsFileHeader = Record<TensorName, TensorInfo> & {
__metadata__: Record<string, string>;
};
interface SafetensorsIndexJson {
weight_map: Record<TensorName, FileName>;
}
export type SafetensorsShardedHeaders = Record<FileName, SafetensorsFileHeader>;
```
### Python
In this example python script, we are parsing metadata of [gpt2](https://huggingface.co/gpt2/blob/main/model.safetensors).
```python
import requests # pip install requests
import struct
def parse_single_file(url):
# Fetch the first 8 bytes of the file
headers = {'Range': 'bytes=0-7'}
response = requests.get(url, headers=headers)
# Interpret the bytes as a little-endian unsigned 64-bit integer
length_of_header = struct.unpack('<Q', response.content)[0]
# Fetch length_of_header bytes starting from the 9th byte
headers = {'Range': f'bytes=8-{7 + length_of_header}'}
response = requests.get(url, headers=headers)
# Interpret the response as a JSON object
header = response.json()
return header
url = "https://huggingface.co/gpt2/resolve/main/model.safetensors"
header = parse_single_file(url)
print(header)
# {
# "__metadata__": { "format": "pt" },
# "h.10.ln_1.weight": {
# "dtype": "F32",
# "shape": [768],
# "data_offsets": [223154176, 223157248]
# },
# ...
# }
```
## Example output
For instance, here are the number of params per dtype for a few models on the HuggingFace Hub. Also see [this issue](https://github.com/huggingface/safetensors/issues/44) for more examples of usage.
model | safetensors | params
--- | --- | ---
[gpt2](https://huggingface.co/gpt2?show_tensors=true) | single-file | { 'F32' => 137022720 }
[roberta-base](https://huggingface.co/roberta-base?show_tensors=true) | single-file | { 'F32' => 124697433, 'I64' => 514 }
[Jean-Baptiste/camembert-ner](https://huggingface.co/Jean-Baptiste/camembert-ner?show_tensors=true) | single-file | { 'F32' => 110035205, 'I64' => 514 }
[roberta-large](https://huggingface.co/roberta-large?show_tensors=true) | single-file | { 'F32' => 355412057, 'I64' => 514 }
[distilbert-base-german-cased](https://huggingface.co/distilbert-base-german-cased?show_tensors=true) | single-file | { 'F32' => 67431550 }
[EleutherAI/gpt-neox-20b](https://huggingface.co/EleutherAI/gpt-neox-20b?show_tensors=true) | sharded | { 'F16' => 20554568208, 'U8' => 184549376 }
[bigscience/bloom-560m](https://huggingface.co/bigscience/bloom-560m?show_tensors=true) | single-file | { 'F16' => 559214592 }
[bigscience/bloom](https://huggingface.co/bigscience/bloom?show_tensors=true) | sharded | { 'BF16' => 176247271424 }
[bigscience/bloom-3b](https://huggingface.co/bigscience/bloom-3b?show_tensors=true) | single-file | { 'F16' => 3002557440 }