# Metadata Parsing

Because the format is simple, fetching and parsing metadata about Safetensors weights (i.e. the list of tensors, their types, and their shapes or numbers of parameters) is straightforward and efficient using small [(Range) HTTP requests](https://developer.mozilla.org/en-US/docs/Web/HTTP/Range_requests).

This parsing has been implemented in JS in [`huggingface.js`](https://huggingface.co/docs/huggingface.js/main/en/hub/modules#parsesafetensorsmetadata) (sample code follows below), but it would be similar in any language.

## Example use case

There are many potential use cases. For instance, we use it on the HuggingFace Hub to display info about models which have safetensors weights.
## Usage

### JavaScript/TypeScript[[js]]

Using [`huggingface.js`](https://huggingface.co/docs/huggingface.js)

```ts
import { parseSafetensorsMetadata } from "@huggingface/hub";

const info = await parseSafetensorsMetadata({
  repo: { type: "model", name: "bigscience/bloom" },
});

console.log(info)
// {
//   sharded: true,
//   index: {
//     metadata: { total_size: 352494542848 },
//     weight_map: {
//       'h.0.input_layernorm.bias': 'model_00002-of-00072.safetensors',
//       ...
//     }
//   },
//   headers: {
//     __metadata__: {'format': 'pt'},
//     'h.2.attn.c_attn.weight': {'dtype': 'F32', 'shape': [768, 2304], 'data_offsets': [541012992, 548090880]},
//     ...
//   }
// }
```

Depending on whether the safetensors weights are sharded into multiple files or not, the output of the call above will be:

```ts
export type SafetensorsParseFromRepo =
  | {
      sharded: false;
      header: SafetensorsFileHeader;
    }
  | {
      sharded: true;
      index: SafetensorsIndexJson;
      headers: SafetensorsShardedHeaders;
    };
```

where the underlying types are the following:

```ts
type FileName = string;

type TensorName = string;
type Dtype = "F64" | "F32" | "F16" | "BF16" | "I64" | "I32" | "I16" | "I8" | "U8" | "BOOL";

interface TensorInfo {
  dtype: Dtype;
  shape: number[];
  data_offsets: [number, number];
}

type SafetensorsFileHeader = Record<TensorName, TensorInfo> & {
  __metadata__: Record<string, string>;
};

interface SafetensorsIndexJson {
  weight_map: Record<TensorName, FileName>;
}

export type SafetensorsShardedHeaders = Record<FileName, SafetensorsFileHeader>;
```

### Python

In this example Python script, we parse the metadata of [gpt2](https://huggingface.co/gpt2/blob/main/model.safetensors).
```python
import requests # pip install requests
import struct

def parse_single_file(url):
    # Fetch the first 8 bytes of the file
    headers = {'Range': 'bytes=0-7'}
    response = requests.get(url, headers=headers)
    # Interpret the bytes as a little-endian unsigned 64-bit integer
    length_of_header = struct.unpack('<Q', response.content)[0]
    # Fetch length_of_header bytes starting right after the 8-byte length prefix
    headers = {'Range': f'bytes=8-{7 + length_of_header}'}
    response = requests.get(url, headers=headers)
    # The header is a JSON object mapping tensor names to dtype, shape and offsets
    header = response.json()
    return header

url = "https://huggingface.co/gpt2/resolve/main/model.safetensors"
header = parse_single_file(url)
print(header)
```

The table below lists the number of parameters per dtype for a few models on the HuggingFace Hub.

model | safetensors | params
--- | --- | ---
… | … | { … => 137022720 }
[roberta-base](https://huggingface.co/roberta-base?show_tensors=true) | single-file | { 'F32' => 124697433, 'I64' => 514 }
[Jean-Baptiste/camembert-ner](https://huggingface.co/Jean-Baptiste/camembert-ner?show_tensors=true) | single-file | { 'F32' => 110035205, 'I64' => 514 }
[roberta-large](https://huggingface.co/roberta-large?show_tensors=true) | single-file | { 'F32' => 355412057, 'I64' => 514 }
[distilbert-base-german-cased](https://huggingface.co/distilbert-base-german-cased?show_tensors=true) | single-file | { 'F32' => 67431550 }
[EleutherAI/gpt-neox-20b](https://huggingface.co/EleutherAI/gpt-neox-20b?show_tensors=true) | sharded | { 'F16' => 20554568208, 'U8' => 184549376 }
[bigscience/bloom-560m](https://huggingface.co/bigscience/bloom-560m?show_tensors=true) | single-file | { 'F16' => 559214592 }
[bigscience/bloom](https://huggingface.co/bigscience/bloom?show_tensors=true) | sharded | { 'BF16' => 176247271424 }
[bigscience/bloom-3b](https://huggingface.co/bigscience/bloom-3b?show_tensors=true) | single-file | { 'F16' => 3002557440 }
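
Per-dtype parameter counts like the ones listed above can be derived from a parsed header alone, since each tensor entry carries its `dtype` and `shape`. The following is a minimal sketch, assuming the header is a dict shaped like the one returned by `parse_single_file`. The function name `count_params_per_dtype` and the hand-written `sample_header` are hypothetical and only serve as illustration; they are not part of any library.

```python
from collections import Counter
from math import prod

def count_params_per_dtype(header):
    """Sum the number of tensor elements per dtype in a safetensors header dict."""
    counts = Counter()
    for name, info in header.items():
        if name == "__metadata__":
            # Free-form string metadata, not a tensor entry
            continue
        # A tensor's element count is the product of its dimensions;
        # math.prod returns 1 for an empty shape (a scalar)
        counts[info["dtype"]] += prod(info["shape"])
    return dict(counts)

# Hypothetical header written by hand for illustration (not from a real model)
sample_header = {
    "__metadata__": {"format": "pt"},
    "embed.weight": {"dtype": "F32", "shape": [10, 4], "data_offsets": [0, 160]},
    "embed.bias": {"dtype": "F32", "shape": [4], "data_offsets": [160, 176]},
    "position_ids": {"dtype": "I64", "shape": [1, 8], "data_offsets": [176, 240]},
}

print(count_params_per_dtype(sample_header))
# {'F32': 44, 'I64': 8}
```

For sharded weights, the same function could be applied to the header of each shard and the per-dtype counts summed.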