This repo contains serialized blobs of the up-projection layer of Llama-3-8B (oc=14336, ic=4096).
The linear layer has been quantized with GPTQ (W4, symmetric, group size 32) and sparsified to 50% sparsity.

```
β”œβ”€β”€ sparse_w4
β”‚   β”œβ”€β”€ linear_bitmap_int32.bin
β”‚   β”œβ”€β”€ linear_compressed_qweight_int32.bin
β”‚   β”œβ”€β”€ linear_nnz_int16.bin
β”‚   β”œβ”€β”€ linear_scales_float16.bin
β”‚   └── linear_zeros_int32.bin
```
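The blob dtypes can be read off the file names. A minimal sketch of loading them with NumPy is below; the flat layout and the `root` argument are assumptions for illustration, and `unpack_blobs.py` remains the authoritative reference:

```python
import os
import numpy as np

def load_blobs(root="sparse_w4"):
    """Load each serialized blob as a flat NumPy array (dtype from its file name)."""
    names = {
        "bitmap": ("linear_bitmap_int32.bin", np.int32),
        "qweight": ("linear_compressed_qweight_int32.bin", np.int32),
        "nnz": ("linear_nnz_int16.bin", np.int16),
        "scales": ("linear_scales_float16.bin", np.float16),
        "zeros": ("linear_zeros_int32.bin", np.int32),
    }
    return {key: np.fromfile(os.path.join(root, fname), dtype=dtype)
            for key, (fname, dtype) in names.items()}
```

The arrays come back one-dimensional; how they are reshaped per output channel depends on the serialization scheme in `unpack_blobs.py`.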

### Usage
The following script shows how to process the blobs in Python: unpacking the packed 4-bit weights, recovering the locations of the pruned (zero) weights, and dequantizing the weights.
```bash
python unpack_blobs.py
```
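The recovery steps can be sketched as follows. This is a hedged illustration under assumed conventions (little-endian nibble packing with 8 values per int32 word, row-major bitmap bits, one scale/zero per group of 32 columns); the actual layout is defined by `unpack_blobs.py`:

```python
import numpy as np

GROUP = 32  # GPTQ group size, from the quantization config above

def unpack_int4(packed):
    """Unpack 8 four-bit values from each int32 word (assumed low-nibble-first)."""
    shifts = np.arange(8, dtype=np.uint32) * 4
    return ((packed.astype(np.uint32)[:, None] >> shifts) & 0xF).reshape(-1)

def expand_bitmap(bitmap, n_cols):
    """Recover the boolean nonzero mask for one row from its bitmap words."""
    bits = (bitmap.astype(np.uint32)[:, None] >> np.arange(32, dtype=np.uint32)) & 1
    return bits.reshape(-1)[:n_cols].astype(bool)

def dequant_row(qvals, scales, zeros, mask, group=GROUP):
    """Scatter compressed quantized values into a dense row and dequantize.

    Pruned positions stay exactly 0; nonzero positions get
    (q - zero) * scale with per-group scales/zeros (assumed convention).
    """
    row = np.zeros(mask.size, dtype=np.float32)
    cols = np.nonzero(mask)[0]          # dense column index of each stored value
    g = cols // group                   # group index of each nonzero column
    row[cols] = (qvals.astype(np.int32) - zeros[g]) * scales[g].astype(np.float32)
    return row
```

The `nnz` blob (one int16 per row, presumably) tells you how many compressed values to consume for each row before moving to the next.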

> You can ignore `internal/`.