
# TinyLLama-4.6M-v0.0-F16.gguf - GGUF Internal File Dump

- Endian: LITTLE endian

## Key Value Metadata Store

The table below lists 40 entries: the 37 key-value pairs reported by GGUF.kv_count, plus the 3 GGUF header fields shown at positions 1-3.

| POS | TYPE | Count | Key | Value |
|----:|------|------:|-----|-------|
|   1 | UINT32 | 1 | GGUF.version | 3 |
|   2 | UINT64 | 1 | GGUF.tensor_count | 75 |
|   3 | UINT64 | 1 | GGUF.kv_count | 37 |
|   4 | STRING | 1 | general.architecture | llama |
|   5 | STRING | 1 | general.type | model |
|   6 | STRING | 1 | general.name | TinyLLama |
|   7 | STRING | 1 | general.author | Maykeye |
|   8 | STRING | 1 | general.version | v0.0 |
|   9 | STRING | 1 | general.description | This gguf is ported from a fir...M but using Llama architecture |
|  10 | STRING | 1 | general.quantized_by | Mofosyne |
|  11 | STRING | 1 | general.size_label | 4.6M |
|  12 | STRING | 1 | general.license | apache-2.0 |
|  13 | STRING | 1 | general.license.name | Apache License Version 2.0, January 2004 |
|  14 | STRING | 1 | general.license.link | https://huggingface.co/dataset...ob/main/markdown/apache-2.0.md |
|  15 | STRING | 1 | general.url | https://huggingface.co/mofosyne/TinyLLama-v0-llamafile |
|  16 | STRING | 1 | general.repo_url | https://huggingface.co/mofosyne/TinyLLama-v0-llamafile |
|  17 | STRING | 1 | general.source.url | https://huggingface.co/Maykeye/TinyLLama-v0 |
|  18 | STRING | 1 | general.source.repo_url | https://huggingface.co/Maykeye/TinyLLama-v0 |
|  19 | [STRING] | 5 | general.tags | [ text generation, transformer, llama, tiny, tiny model ] |
|  20 | [STRING] | 1 | general.languages | [ en ] |
|  21 | [STRING] | 2 | general.datasets | [ https://hugging...-GPT4-train.txt, https://hugging...-GPT4-valid.txt ] |
|  22 | UINT32 | 1 | llama.block_count | 8 |
|  23 | UINT32 | 1 | llama.context_length | 2048 |
|  24 | UINT32 | 1 | llama.embedding_length | 64 |
|  25 | UINT32 | 1 | llama.feed_forward_length | 256 |
|  26 | UINT32 | 1 | llama.attention.head_count | 16 |
|  27 | FLOAT32 | 1 | llama.attention.layer_norm_rms_epsilon | 1e-06 |
|  28 | UINT32 | 1 | general.file_type | 1 |
|  29 | UINT32 | 1 | llama.vocab_size | 32000 |
|  30 | UINT32 | 1 | llama.rope.dimension_count | 4 |
|  31 | STRING | 1 | tokenizer.ggml.model | llama |
|  32 | STRING | 1 | tokenizer.ggml.pre | default |
|  33 | [STRING] | 32000 | tokenizer.ggml.tokens | `[ <unk>, <s>, </s>, <0x00>, <0x01>, ... ]` |
|  34 | [FLOAT32] | 32000 | tokenizer.ggml.scores | [ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ... ] |
|  35 | [INT32] | 32000 | tokenizer.ggml.token_type | [ 2, 3, 3, 6, 6, 6, 6, ... ] |
|  36 | UINT32 | 1 | tokenizer.ggml.bos_token_id | 1 |
|  37 | UINT32 | 1 | tokenizer.ggml.eos_token_id | 2 |
|  38 | UINT32 | 1 | tokenizer.ggml.unknown_token_id | 0 |
|  39 | UINT32 | 1 | tokenizer.ggml.padding_token_id | 0 |
|  40 | UINT32 | 1 | general.quantization_version | 2 |
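
All of these values live in the GGUF key-value store at the front of the file. As a quick orientation, here is a minimal sketch of reading just the fixed GGUF v3 header with nothing but the standard library (file name assumed; an illustrative reader, not the llama.cpp implementation):

```python
import struct

# GGUF v3 header, little-endian (matching the Endian note above):
#   4-byte magic "GGUF", uint32 version, uint64 tensor_count, uint64 kv_count
with open("TinyLLama-4.6M-v0.0-F16.gguf", "rb") as f:
    magic = f.read(4)
    assert magic == b"GGUF", f"not a GGUF file: {magic!r}"
    version, tensor_count, kv_count = struct.unpack("<IQQ", f.read(20))

print(version, tensor_count, kv_count)  # expected for this file: 3 75 37
```

The `gguf` Python package bundled with llama.cpp (`gguf.GGUFReader`) parses the full key-value store and tensor index; this dump itself appears to be the markdown output of its `gguf_dump.py` tool.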

## Tensors Overview ~5M Elements

Total number of elements in all tensors: 4621376 Elements
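
This total is consistent with the per-group counts reported below:

```
  2 × 2,048,000   (output.weight + token_embd.weight)
+          64     (output_norm.weight)
+ 8 ×    65,664   (blocks 0-7)
=   4,621,376     elements
```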

### Tensor Data Offset

This table lists each tensor's data offset and data size, relative to the start of the file.

| T_ID | Tensor Layer Name | Data Offset (B) | Data Size (B) |
|-----:|-------------------|-----------------|---------------|
|    0 | output.weight | 0xba8e0 | 0x3e8000 |
|    1 | token_embd.weight | 0x4a28e0 | 0x3e8000 |
|    2 | blk.0.attn_norm.weight | 0x88a8e0 | 0x100 |
|    3 | blk.0.ffn_down.weight | 0x88a9e0 | 0x8000 |
|    4 | blk.0.ffn_gate.weight | 0x8929e0 | 0x8000 |
|    5 | blk.0.ffn_up.weight | 0x89a9e0 | 0x8000 |
|    6 | blk.0.ffn_norm.weight | 0x8a29e0 | 0x100 |
|    7 | blk.0.attn_k.weight | 0x8a2ae0 | 0x2000 |
|    8 | blk.0.attn_output.weight | 0x8a4ae0 | 0x2000 |
|    9 | blk.0.attn_q.weight | 0x8a6ae0 | 0x2000 |
|   10 | blk.0.attn_v.weight | 0x8a8ae0 | 0x2000 |
|   11 | blk.1.attn_norm.weight | 0x8aaae0 | 0x100 |
|   12 | blk.1.ffn_down.weight | 0x8aabe0 | 0x8000 |
|   13 | blk.1.ffn_gate.weight | 0x8b2be0 | 0x8000 |
|   14 | blk.1.ffn_up.weight | 0x8babe0 | 0x8000 |
|   15 | blk.1.ffn_norm.weight | 0x8c2be0 | 0x100 |
|   16 | blk.1.attn_k.weight | 0x8c2ce0 | 0x2000 |
|   17 | blk.1.attn_output.weight | 0x8c4ce0 | 0x2000 |
|   18 | blk.1.attn_q.weight | 0x8c6ce0 | 0x2000 |
|   19 | blk.1.attn_v.weight | 0x8c8ce0 | 0x2000 |
|   20 | blk.2.attn_norm.weight | 0x8cace0 | 0x100 |
|   21 | blk.2.ffn_down.weight | 0x8cade0 | 0x8000 |
|   22 | blk.2.ffn_gate.weight | 0x8d2de0 | 0x8000 |
|   23 | blk.2.ffn_up.weight | 0x8dade0 | 0x8000 |
|   24 | blk.2.ffn_norm.weight | 0x8e2de0 | 0x100 |
|   25 | blk.2.attn_k.weight | 0x8e2ee0 | 0x2000 |
|   26 | blk.2.attn_output.weight | 0x8e4ee0 | 0x2000 |
|   27 | blk.2.attn_q.weight | 0x8e6ee0 | 0x2000 |
|   28 | blk.2.attn_v.weight | 0x8e8ee0 | 0x2000 |
|   29 | blk.3.attn_norm.weight | 0x8eaee0 | 0x100 |
|   30 | blk.3.ffn_down.weight | 0x8eafe0 | 0x8000 |
|   31 | blk.3.ffn_gate.weight | 0x8f2fe0 | 0x8000 |
|   32 | blk.3.ffn_up.weight | 0x8fafe0 | 0x8000 |
|   33 | blk.3.ffn_norm.weight | 0x902fe0 | 0x100 |
|   34 | blk.3.attn_k.weight | 0x9030e0 | 0x2000 |
|   35 | blk.3.attn_output.weight | 0x9050e0 | 0x2000 |
|   36 | blk.3.attn_q.weight | 0x9070e0 | 0x2000 |
|   37 | blk.3.attn_v.weight | 0x9090e0 | 0x2000 |
|   38 | blk.4.attn_norm.weight | 0x90b0e0 | 0x100 |
|   39 | blk.4.ffn_down.weight | 0x90b1e0 | 0x8000 |
|   40 | blk.4.ffn_gate.weight | 0x9131e0 | 0x8000 |
|   41 | blk.4.ffn_up.weight | 0x91b1e0 | 0x8000 |
|   42 | blk.4.ffn_norm.weight | 0x9231e0 | 0x100 |
|   43 | blk.4.attn_k.weight | 0x9232e0 | 0x2000 |
|   44 | blk.4.attn_output.weight | 0x9252e0 | 0x2000 |
|   45 | blk.4.attn_q.weight | 0x9272e0 | 0x2000 |
|   46 | blk.4.attn_v.weight | 0x9292e0 | 0x2000 |
|   47 | blk.5.attn_norm.weight | 0x92b2e0 | 0x100 |
|   48 | blk.5.ffn_down.weight | 0x92b3e0 | 0x8000 |
|   49 | blk.5.ffn_gate.weight | 0x9333e0 | 0x8000 |
|   50 | blk.5.ffn_up.weight | 0x93b3e0 | 0x8000 |
|   51 | blk.5.ffn_norm.weight | 0x9433e0 | 0x100 |
|   52 | blk.5.attn_k.weight | 0x9434e0 | 0x2000 |
|   53 | blk.5.attn_output.weight | 0x9454e0 | 0x2000 |
|   54 | blk.5.attn_q.weight | 0x9474e0 | 0x2000 |
|   55 | blk.5.attn_v.weight | 0x9494e0 | 0x2000 |
|   56 | blk.6.attn_norm.weight | 0x94b4e0 | 0x100 |
|   57 | blk.6.ffn_down.weight | 0x94b5e0 | 0x8000 |
|   58 | blk.6.ffn_gate.weight | 0x9535e0 | 0x8000 |
|   59 | blk.6.ffn_up.weight | 0x95b5e0 | 0x8000 |
|   60 | blk.6.ffn_norm.weight | 0x9635e0 | 0x100 |
|   61 | blk.6.attn_k.weight | 0x9636e0 | 0x2000 |
|   62 | blk.6.attn_output.weight | 0x9656e0 | 0x2000 |
|   63 | blk.6.attn_q.weight | 0x9676e0 | 0x2000 |
|   64 | blk.6.attn_v.weight | 0x9696e0 | 0x2000 |
|   65 | blk.7.attn_norm.weight | 0x96b6e0 | 0x100 |
|   66 | blk.7.ffn_down.weight | 0x96b7e0 | 0x8000 |
|   67 | blk.7.ffn_gate.weight | 0x9737e0 | 0x8000 |
|   68 | blk.7.ffn_up.weight | 0x97b7e0 | 0x8000 |
|   69 | blk.7.ffn_norm.weight | 0x9837e0 | 0x100 |
|   70 | blk.7.attn_k.weight | 0x9838e0 | 0x2000 |
|   71 | blk.7.attn_output.weight | 0x9858e0 | 0x2000 |
|   72 | blk.7.attn_q.weight | 0x9878e0 | 0x2000 |
|   73 | blk.7.attn_v.weight | 0x9898e0 | 0x2000 |
|   74 | output_norm.weight | 0x98b8e0 | 0x100 |
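
Each entry's data size is simply its element count times the bytes per element (F16 = 2 B, F32 = 4 B), and the tensors are packed back to back, so each offset is the previous offset plus the previous size. A minimal sanity check in Python, with the values copied from the tables in this dump:

```python
# Bytes per element for the two tensor types present in this file.
TYPE_SIZE = {"F16": 2, "F32": 4}

# (tensor name, element count, type, data size from the offset table)
rows = [
    ("output.weight",          2_048_000, "F16", 0x3E8000),
    ("blk.0.attn_norm.weight",        64, "F32", 0x000100),
    ("blk.0.ffn_down.weight",     16_384, "F16", 0x008000),
    ("blk.0.attn_q.weight",        4_096, "F16", 0x002000),
]

for name, n_elems, dtype, size in rows:
    assert n_elems * TYPE_SIZE[dtype] == size, name

# Contiguity: output.weight at 0xba8e0 + 0x3e8000 bytes ends exactly where
# token_embd.weight begins (0x4a28e0).
assert 0xBA8E0 + 0x3E8000 == 0x4A28E0
```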

### Base Tensor Group : ~4M Elements

| T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
|-----:|-------------------|----------------------------------|----------|-------|------|
|    0 | output.weight | Output (W) | (~2M) 2048000 | 64 x 32000 x 1 x 1 | F16 |
|    1 | token_embd.weight | Token Embedding (W) | (~2M) 2048000 | 64 x 32000 x 1 x 1 | F16 |
|   74 | output_norm.weight | Output Normalization (W) | ( 64) 64 | 64 x 1 x 1 x 1 | F32 |

- Total elements in base: ( ~4M) 4096064
- Percentage of total elements: 88.63%
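
The two ~2M-element matrices are the embedding and unembedding tables; their shape follows directly from the metadata above: llama.embedding_length × llama.vocab_size = 64 × 32000 = 2,048,000 elements each. In a model this small those two tables dominate, which is why the base group accounts for 88.63% of all elements.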

### Block 0 Tensor Group : ~66K Elements

| T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
|-----:|-------------------|----------------------------------|----------|-------|------|
|    2 | blk.0.attn_norm.weight | Block 0 Attention Normalization (W) | ( 64) 64 | 64 x 1 x 1 x 1 | F32 |
|    3 | blk.0.ffn_down.weight | Block 0 Feed-Forward Network "Down" (W) | (~16K) 16384 | 256 x 64 x 1 x 1 | F16 |
|    4 | blk.0.ffn_gate.weight | Block 0 Feed-Forward Network "Gate" (W) | (~16K) 16384 | 64 x 256 x 1 x 1 | F16 |
|    5 | blk.0.ffn_up.weight | Block 0 Feed-Forward Network "Up" (W) | (~16K) 16384 | 64 x 256 x 1 x 1 | F16 |
|    6 | blk.0.ffn_norm.weight | Block 0 Feed-Forward Network Normalization (W) | ( 64) 64 | 64 x 1 x 1 x 1 | F32 |
|    7 | blk.0.attn_k.weight | Block 0 Attention Key (W) | ( ~4K) 4096 | 64 x 64 x 1 x 1 | F16 |
|    8 | blk.0.attn_output.weight | Block 0 Attention Output (W) | ( ~4K) 4096 | 64 x 64 x 1 x 1 | F16 |
|    9 | blk.0.attn_q.weight | Block 0 Attention Query (W) | ( ~4K) 4096 | 64 x 64 x 1 x 1 | F16 |
|   10 | blk.0.attn_v.weight | Block 0 Attention Value (W) | ( ~4K) 4096 | 64 x 64 x 1 x 1 | F16 |

- Total elements in blk.0: (~66K) 65664
- Percentage of total elements: 1.42%
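
All eight blocks share this exact layout, so the per-block element count decomposes identically (the same breakdown applies to blocks 1-7 below):

```
  3 × 16,384   (ffn_down, ffn_gate, ffn_up: 256 × 64 each)
+ 4 ×  4,096   (attn_q, attn_k, attn_v, attn_output: 64 × 64 each)
+ 2 ×     64   (attn_norm, ffn_norm)
=     65,664   elements per block
```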

### Block 1 Tensor Group : ~66K Elements

| T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
|-----:|-------------------|----------------------------------|----------|-------|------|
|   11 | blk.1.attn_norm.weight | Block 1 Attention Normalization (W) | ( 64) 64 | 64 x 1 x 1 x 1 | F32 |
|   12 | blk.1.ffn_down.weight | Block 1 Feed-Forward Network "Down" (W) | (~16K) 16384 | 256 x 64 x 1 x 1 | F16 |
|   13 | blk.1.ffn_gate.weight | Block 1 Feed-Forward Network "Gate" (W) | (~16K) 16384 | 64 x 256 x 1 x 1 | F16 |
|   14 | blk.1.ffn_up.weight | Block 1 Feed-Forward Network "Up" (W) | (~16K) 16384 | 64 x 256 x 1 x 1 | F16 |
|   15 | blk.1.ffn_norm.weight | Block 1 Feed-Forward Network Normalization (W) | ( 64) 64 | 64 x 1 x 1 x 1 | F32 |
|   16 | blk.1.attn_k.weight | Block 1 Attention Key (W) | ( ~4K) 4096 | 64 x 64 x 1 x 1 | F16 |
|   17 | blk.1.attn_output.weight | Block 1 Attention Output (W) | ( ~4K) 4096 | 64 x 64 x 1 x 1 | F16 |
|   18 | blk.1.attn_q.weight | Block 1 Attention Query (W) | ( ~4K) 4096 | 64 x 64 x 1 x 1 | F16 |
|   19 | blk.1.attn_v.weight | Block 1 Attention Value (W) | ( ~4K) 4096 | 64 x 64 x 1 x 1 | F16 |

- Total elements in blk.1: (~66K) 65664
- Percentage of total elements: 1.42%

### Block 2 Tensor Group : ~66K Elements

| T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
|-----:|-------------------|----------------------------------|----------|-------|------|
|   20 | blk.2.attn_norm.weight | Block 2 Attention Normalization (W) | ( 64) 64 | 64 x 1 x 1 x 1 | F32 |
|   21 | blk.2.ffn_down.weight | Block 2 Feed-Forward Network "Down" (W) | (~16K) 16384 | 256 x 64 x 1 x 1 | F16 |
|   22 | blk.2.ffn_gate.weight | Block 2 Feed-Forward Network "Gate" (W) | (~16K) 16384 | 64 x 256 x 1 x 1 | F16 |
|   23 | blk.2.ffn_up.weight | Block 2 Feed-Forward Network "Up" (W) | (~16K) 16384 | 64 x 256 x 1 x 1 | F16 |
|   24 | blk.2.ffn_norm.weight | Block 2 Feed-Forward Network Normalization (W) | ( 64) 64 | 64 x 1 x 1 x 1 | F32 |
|   25 | blk.2.attn_k.weight | Block 2 Attention Key (W) | ( ~4K) 4096 | 64 x 64 x 1 x 1 | F16 |
|   26 | blk.2.attn_output.weight | Block 2 Attention Output (W) | ( ~4K) 4096 | 64 x 64 x 1 x 1 | F16 |
|   27 | blk.2.attn_q.weight | Block 2 Attention Query (W) | ( ~4K) 4096 | 64 x 64 x 1 x 1 | F16 |
|   28 | blk.2.attn_v.weight | Block 2 Attention Value (W) | ( ~4K) 4096 | 64 x 64 x 1 x 1 | F16 |

- Total elements in blk.2: (~66K) 65664
- Percentage of total elements: 1.42%

### Block 3 Tensor Group : ~66K Elements

| T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
|-----:|-------------------|----------------------------------|----------|-------|------|
|   29 | blk.3.attn_norm.weight | Block 3 Attention Normalization (W) | ( 64) 64 | 64 x 1 x 1 x 1 | F32 |
|   30 | blk.3.ffn_down.weight | Block 3 Feed-Forward Network "Down" (W) | (~16K) 16384 | 256 x 64 x 1 x 1 | F16 |
|   31 | blk.3.ffn_gate.weight | Block 3 Feed-Forward Network "Gate" (W) | (~16K) 16384 | 64 x 256 x 1 x 1 | F16 |
|   32 | blk.3.ffn_up.weight | Block 3 Feed-Forward Network "Up" (W) | (~16K) 16384 | 64 x 256 x 1 x 1 | F16 |
|   33 | blk.3.ffn_norm.weight | Block 3 Feed-Forward Network Normalization (W) | ( 64) 64 | 64 x 1 x 1 x 1 | F32 |
|   34 | blk.3.attn_k.weight | Block 3 Attention Key (W) | ( ~4K) 4096 | 64 x 64 x 1 x 1 | F16 |
|   35 | blk.3.attn_output.weight | Block 3 Attention Output (W) | ( ~4K) 4096 | 64 x 64 x 1 x 1 | F16 |
|   36 | blk.3.attn_q.weight | Block 3 Attention Query (W) | ( ~4K) 4096 | 64 x 64 x 1 x 1 | F16 |
|   37 | blk.3.attn_v.weight | Block 3 Attention Value (W) | ( ~4K) 4096 | 64 x 64 x 1 x 1 | F16 |

- Total elements in blk.3: (~66K) 65664
- Percentage of total elements: 1.42%

### Block 4 Tensor Group : ~66K Elements

| T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
|-----:|-------------------|----------------------------------|----------|-------|------|
|   38 | blk.4.attn_norm.weight | Block 4 Attention Normalization (W) | ( 64) 64 | 64 x 1 x 1 x 1 | F32 |
|   39 | blk.4.ffn_down.weight | Block 4 Feed-Forward Network "Down" (W) | (~16K) 16384 | 256 x 64 x 1 x 1 | F16 |
|   40 | blk.4.ffn_gate.weight | Block 4 Feed-Forward Network "Gate" (W) | (~16K) 16384 | 64 x 256 x 1 x 1 | F16 |
|   41 | blk.4.ffn_up.weight | Block 4 Feed-Forward Network "Up" (W) | (~16K) 16384 | 64 x 256 x 1 x 1 | F16 |
|   42 | blk.4.ffn_norm.weight | Block 4 Feed-Forward Network Normalization (W) | ( 64) 64 | 64 x 1 x 1 x 1 | F32 |
|   43 | blk.4.attn_k.weight | Block 4 Attention Key (W) | ( ~4K) 4096 | 64 x 64 x 1 x 1 | F16 |
|   44 | blk.4.attn_output.weight | Block 4 Attention Output (W) | ( ~4K) 4096 | 64 x 64 x 1 x 1 | F16 |
|   45 | blk.4.attn_q.weight | Block 4 Attention Query (W) | ( ~4K) 4096 | 64 x 64 x 1 x 1 | F16 |
|   46 | blk.4.attn_v.weight | Block 4 Attention Value (W) | ( ~4K) 4096 | 64 x 64 x 1 x 1 | F16 |

- Total elements in blk.4: (~66K) 65664
- Percentage of total elements: 1.42%

### Block 5 Tensor Group : ~66K Elements

| T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
|-----:|-------------------|----------------------------------|----------|-------|------|
|   47 | blk.5.attn_norm.weight | Block 5 Attention Normalization (W) | ( 64) 64 | 64 x 1 x 1 x 1 | F32 |
|   48 | blk.5.ffn_down.weight | Block 5 Feed-Forward Network "Down" (W) | (~16K) 16384 | 256 x 64 x 1 x 1 | F16 |
|   49 | blk.5.ffn_gate.weight | Block 5 Feed-Forward Network "Gate" (W) | (~16K) 16384 | 64 x 256 x 1 x 1 | F16 |
|   50 | blk.5.ffn_up.weight | Block 5 Feed-Forward Network "Up" (W) | (~16K) 16384 | 64 x 256 x 1 x 1 | F16 |
|   51 | blk.5.ffn_norm.weight | Block 5 Feed-Forward Network Normalization (W) | ( 64) 64 | 64 x 1 x 1 x 1 | F32 |
|   52 | blk.5.attn_k.weight | Block 5 Attention Key (W) | ( ~4K) 4096 | 64 x 64 x 1 x 1 | F16 |
|   53 | blk.5.attn_output.weight | Block 5 Attention Output (W) | ( ~4K) 4096 | 64 x 64 x 1 x 1 | F16 |
|   54 | blk.5.attn_q.weight | Block 5 Attention Query (W) | ( ~4K) 4096 | 64 x 64 x 1 x 1 | F16 |
|   55 | blk.5.attn_v.weight | Block 5 Attention Value (W) | ( ~4K) 4096 | 64 x 64 x 1 x 1 | F16 |

- Total elements in blk.5: (~66K) 65664
- Percentage of total elements: 1.42%

### Block 6 Tensor Group : ~66K Elements

| T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
|-----:|-------------------|----------------------------------|----------|-------|------|
|   56 | blk.6.attn_norm.weight | Block 6 Attention Normalization (W) | ( 64) 64 | 64 x 1 x 1 x 1 | F32 |
|   57 | blk.6.ffn_down.weight | Block 6 Feed-Forward Network "Down" (W) | (~16K) 16384 | 256 x 64 x 1 x 1 | F16 |
|   58 | blk.6.ffn_gate.weight | Block 6 Feed-Forward Network "Gate" (W) | (~16K) 16384 | 64 x 256 x 1 x 1 | F16 |
|   59 | blk.6.ffn_up.weight | Block 6 Feed-Forward Network "Up" (W) | (~16K) 16384 | 64 x 256 x 1 x 1 | F16 |
|   60 | blk.6.ffn_norm.weight | Block 6 Feed-Forward Network Normalization (W) | ( 64) 64 | 64 x 1 x 1 x 1 | F32 |
|   61 | blk.6.attn_k.weight | Block 6 Attention Key (W) | ( ~4K) 4096 | 64 x 64 x 1 x 1 | F16 |
|   62 | blk.6.attn_output.weight | Block 6 Attention Output (W) | ( ~4K) 4096 | 64 x 64 x 1 x 1 | F16 |
|   63 | blk.6.attn_q.weight | Block 6 Attention Query (W) | ( ~4K) 4096 | 64 x 64 x 1 x 1 | F16 |
|   64 | blk.6.attn_v.weight | Block 6 Attention Value (W) | ( ~4K) 4096 | 64 x 64 x 1 x 1 | F16 |

- Total elements in blk.6: (~66K) 65664
- Percentage of total elements: 1.42%

### Block 7 Tensor Group : ~66K Elements

| T_ID | Tensor Layer Name | Human Friendly Tensor Layer Name | Elements | Shape | Type |
|-----:|-------------------|----------------------------------|----------|-------|------|
|   65 | blk.7.attn_norm.weight | Block 7 Attention Normalization (W) | ( 64) 64 | 64 x 1 x 1 x 1 | F32 |
|   66 | blk.7.ffn_down.weight | Block 7 Feed-Forward Network "Down" (W) | (~16K) 16384 | 256 x 64 x 1 x 1 | F16 |
|   67 | blk.7.ffn_gate.weight | Block 7 Feed-Forward Network "Gate" (W) | (~16K) 16384 | 64 x 256 x 1 x 1 | F16 |
|   68 | blk.7.ffn_up.weight | Block 7 Feed-Forward Network "Up" (W) | (~16K) 16384 | 64 x 256 x 1 x 1 | F16 |
|   69 | blk.7.ffn_norm.weight | Block 7 Feed-Forward Network Normalization (W) | ( 64) 64 | 64 x 1 x 1 x 1 | F32 |
|   70 | blk.7.attn_k.weight | Block 7 Attention Key (W) | ( ~4K) 4096 | 64 x 64 x 1 x 1 | F16 |
|   71 | blk.7.attn_output.weight | Block 7 Attention Output (W) | ( ~4K) 4096 | 64 x 64 x 1 x 1 | F16 |
|   72 | blk.7.attn_q.weight | Block 7 Attention Query (W) | ( ~4K) 4096 | 64 x 64 x 1 x 1 | F16 |
|   73 | blk.7.attn_v.weight | Block 7 Attention Value (W) | ( ~4K) 4096 | 64 x 64 x 1 x 1 | F16 |

- Total elements in blk.7: (~66K) 65664
- Percentage of total elements: 1.42%