acalatrava committed on
Commit
c0bb06f
1 Parent(s): 8587f49

initial commit

README.md ADDED
@@ -0,0 +1,84 @@
+ ---
+ license: apache-2.0
+ language:
+ - en
+ - zh
+ library_name: transformers
+ widget:
+ - text: "<s> [|User|] Hi 👋 </s>[|Assistant|]"
+ ---
+
+ *NOTE: This is the MLC-LLM version of the following model*
+
+ ## MiniChat-3B
+
+ 📑 [arXiv](https://arxiv.org/abs/2311.07052) | 👻 [GitHub](https://github.com/GeneZC/MiniMA) | 🤗 [HuggingFace-MiniMA](https://huggingface.co/GeneZC/MiniMA-3B) | 🤗 [HuggingFace-MiniChat](https://huggingface.co/GeneZC/MiniChat-3B) | 🤗 [HuggingFace-MiniChat-1.5](https://huggingface.co/GeneZC/MiniChat-1.5-3B) | 🤖 [ModelScope-MiniMA](https://modelscope.cn/models/GeneZC/MiniMA-3B) | 🤖 [ModelScope-MiniChat](https://modelscope.cn/models/GeneZC/MiniChat-3B)
+
+ 🆕 **Updates: MiniChat-1.5-3B**
+
+ ❗ Must comply with LICENSE of LLaMA2 since it is derived from LLaMA2.
+
+ A language model distilled and finetuned from an adapted version of LLaMA2-7B following "Towards the Law of Capacity Gap in Distilling Language Models".
+
+ Outperforming a wide range of 3B competitors in GPT4 evaluation and even competing with several 7B chat models.
+
+ <img src="./teaser_b.jpg" alt="teaser_b" width="687" />
+
+ The following is an example code snippet to use MiniChat-3B:
+
+ ```python
+ import torch
+
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ from conversation import get_default_conv_template
+
+ # MiniChat
+ tokenizer = AutoTokenizer.from_pretrained("GeneZC/MiniChat-3B", use_fast=False)
+ # GPU.
+ model = AutoModelForCausalLM.from_pretrained("GeneZC/MiniChat-3B", use_cache=True, device_map="auto", torch_dtype=torch.float16).eval()
+ # CPU.
+ # model = AutoModelForCausalLM.from_pretrained("GeneZC/MiniChat-3B", use_cache=True, device_map="cpu", torch_dtype=torch.float32).eval()
+ device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
+ conv = get_default_conv_template("minichat")
+
+ question = "Implement a program to find the common elements in two arrays without using any extra data structures."
+ conv.append_message(conv.roles[0], question)
+ conv.append_message(conv.roles[1], None)
+ prompt = conv.get_prompt()
+ input_ids = tokenizer([prompt]).input_ids
+ output_ids = model.generate(
+     torch.as_tensor(input_ids).to(device),
+     do_sample=True,
+     temperature=0.7,
+     max_new_tokens=1024,
+ )
+ output_ids = output_ids[0][len(input_ids[0]):]
+ output = tokenizer.decode(output_ids, skip_special_tokens=True).strip()
+ # output: "def common_elements(arr1, arr2):\n if len(arr1) == 0:\n return []\n if len(arr2) == 0:\n return arr1\n\n common_elements = []\n for element in arr1:\n if element in arr2:\n common_elements.append(element)\n\n return common_elements"
+ # Multiturn conversation could be realized by continuously appending questions to `conv`.
+ ```
+
+ ## Bibtex
+
+ ```bibtex
+ @article{zhang2023law,
+   title={Towards the Law of Capacity Gap in Distilling Language Models},
+   author={Zhang, Chen and Song, Dawei and Ye, Zheyu and Gao, Yan},
+   year={2023},
+   url={https://arxiv.org/abs/2311.07052}
+ }
+ ```
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_GeneZC__MiniChat-3B)
+
+ | Metric | Value |
+ |-----------------------|---------------------------|
+ | Avg. | 42.94 |
+ | ARC (25-shot) | 44.03 |
+ | HellaSwag (10-shot) | 67.19 |
+ | MMLU (5-shot) | 39.17 |
+ | TruthfulQA (0-shot) | 45.67 |
+ | Winogrande (5-shot) | 65.27 |
+ | GSM8K (5-shot) | 10.54 |
+ | DROP (3-shot) | 28.73 |
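As a quick sanity check on the leaderboard table added in README.md, the reported `Avg.` row can be recomputed from the seven benchmark rows. The sketch below is illustrative only: the `scores` dictionary is a hypothetical name, its values are copied from the table, and it assumes the average is a plain unweighted mean.

```python
# Benchmark scores copied from the README table (names are illustrative).
scores = {
    "ARC (25-shot)": 44.03,
    "HellaSwag (10-shot)": 67.19,
    "MMLU (5-shot)": 39.17,
    "TruthfulQA (0-shot)": 45.67,
    "Winogrande (5-shot)": 65.27,
    "GSM8K (5-shot)": 10.54,
    "DROP (3-shot)": 28.73,
}

# Assumption: "Avg." is the unweighted mean of the seven rows.
average = sum(scores.values()) / len(scores)
print(f"Avg. = {average:.2f}")  # prints "Avg. = 42.94"
```

Under that assumption the recomputed value agrees with the 42.94 reported in the table.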
mlc-chat-config.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "model_lib": "MiniChat-3B-q4f16_1",
+   "local_id": "MiniChat-3B-q4f16_1",
+   "estimated_vram_req": 2200000000,
+   "conv_config": {
+     "seps": ["</s>"],
+     "stop_tokens": [ 2 ],
+     "offset": 0,
+     "separator_style": 0,
+     "messages": [
+     ],
+     "stop_str": "</s>",
+     "roles": [ "[|User|]", "[|Assistant|]" ],
+     "role_msg_sep": ": ",
+     "role_empty_sep": "\n",
+     "system": "",
+     "add_bos": true,
+     "bos": "<s>",
+     "eos": "</s>",
+     "prefix_tokens": [],
+     "name": "chatml2"
+   },
+   "temperature": 0.7,
+   "repetition_penalty": 1.0,
+   "top_p": 0.95,
+   "mean_gen_len": 128,
+   "max_gen_len": 512,
+   "max_window_size": 2048,
+   "num_shards": 1,
+   "shift_fill_factor": 0.3,
+   "tokenizer_files": [
+     "tokenizer.model"
+   ],
+   "model_category": "llama",
+   "model_name": "MiniChat-3B",
+   "vocab_size": 49216
+ }
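To make a few of the numbers in `mlc-chat-config.json` easier to read, the sketch below parses a subset of the fields shown in the diff and derives two quantities. It rests on assumptions: that `estimated_vram_req` is expressed in bytes, and that `max_gen_len` is meant to fit inside `max_window_size`. The variable names are illustrative and are not part of MLC-LLM.

```python
import json

# Subset of the fields from mlc-chat-config.json, copied from the diff.
config_text = """
{
  "estimated_vram_req": 2200000000,
  "mean_gen_len": 128,
  "max_gen_len": 512,
  "max_window_size": 2048,
  "vocab_size": 49216
}
"""
config = json.loads(config_text)

# Assumption: estimated_vram_req is a byte count; convert it to GiB.
vram_gib = config["estimated_vram_req"] / 2**30

# Assumption: a single generation should fit within the context window.
fits_window = config["max_gen_len"] <= config["max_window_size"]

print(f"estimated VRAM: {vram_gib:.2f} GiB")
print(f"generation fits in window: {fits_window}")
```

Read this way, the 4-bit build is estimated at roughly 2 GiB of memory, and the 512-token generation limit sits well inside the 2048-token window.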
ndarray-cache.json ADDED
@@ -0,0 +1,3032 @@
+ {
+   "metadata": {
+     "ParamSize": 247
+   },
+   "records": [
+     {
+       "dataPath": "params_shard_0.bin",
+       "format": "raw-shard",
+       "nbytes": 75595776,
+       "records": [
+         {
+           "name": "param_0",
+           "shape": [
+             49216,
+             384
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 75595776,
+           "byteOffset": 0
+         }
+       ]
+     },
+     {
+       "dataPath": "params_shard_1.bin",
+       "format": "raw-shard",
+       "nbytes": 25165824,
+       "records": [
+         {
+           "name": "param_6",
+           "shape": [
+             16384,
+             384
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 25165824,
+           "byteOffset": 0
+         }
+       ]
+     },
+     {
+       "dataPath": "params_shard_2.bin",
+       "format": "raw-shard",
+       "nbytes": 30683136,
+       "records": [
+         {
+           "name": "param_1",
+           "shape": [
+             49216,
+             96
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 9449472,
+           "byteOffset": 0
+         },
+         {
+           "name": "param_2",
+           "shape": [
+             9216,
+             384
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 14155776,
+           "byteOffset": 9449472
+         },
+         {
+           "name": "param_3",
+           "shape": [
+             9216,
+             96
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 1769472,
+           "byteOffset": 23605248
+         },
+         {
+           "name": "param_4",
+           "shape": [
+             3072,
+             384
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 4718592,
+           "byteOffset": 25374720
+         },
+         {
+           "name": "param_5",
+           "shape": [
+             3072,
+             96
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 589824,
+           "byteOffset": 30093312
+         }
+       ]
+     },
+     {
+       "dataPath": "params_shard_3.bin",
+       "format": "raw-shard",
+       "nbytes": 33239040,
+       "records": [
+         {
+           "name": "param_7",
+           "shape": [
+             16384,
+             96
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 3145728,
+           "byteOffset": 0
+         },
+         {
+           "name": "param_8",
+           "shape": [
+             3072,
+             1024
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 12582912,
+           "byteOffset": 3145728
+         },
+         {
+           "name": "param_9",
+           "shape": [
+             3072,
+             256
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 1572864,
+           "byteOffset": 15728640
+         },
+         {
+           "name": "param_10",
+           "shape": [
+             3072
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 6144,
+           "byteOffset": 17301504
+         },
+         {
+           "name": "param_11",
+           "shape": [
+             3072
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 6144,
+           "byteOffset": 17307648
+         },
+         {
+           "name": "param_12",
+           "shape": [
+             9216,
+             384
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 14155776,
+           "byteOffset": 17313792
+         },
+         {
+           "name": "param_13",
+           "shape": [
+             9216,
+             96
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 1769472,
+           "byteOffset": 31469568
+         }
+       ]
+     },
+     {
+       "dataPath": "params_shard_4.bin",
+       "format": "raw-shard",
+       "nbytes": 30474240,
+       "records": [
+         {
+           "name": "param_14",
+           "shape": [
+             3072,
+             384
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 4718592,
+           "byteOffset": 0
+         },
+         {
+           "name": "param_15",
+           "shape": [
+             3072,
+             96
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 589824,
+           "byteOffset": 4718592
+         },
+         {
+           "name": "param_16",
+           "shape": [
+             16384,
+             384
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 25165824,
+           "byteOffset": 5308416
+         }
+       ]
+     },
+     {
+       "dataPath": "params_shard_5.bin",
+       "format": "raw-shard",
+       "nbytes": 33239040,
+       "records": [
+         {
+           "name": "param_17",
+           "shape": [
+             16384,
+             96
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 3145728,
+           "byteOffset": 0
+         },
+         {
+           "name": "param_18",
+           "shape": [
+             3072,
+             1024
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 12582912,
+           "byteOffset": 3145728
+         },
+         {
+           "name": "param_19",
+           "shape": [
+             3072,
+             256
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 1572864,
+           "byteOffset": 15728640
+         },
+         {
+           "name": "param_20",
+           "shape": [
+             3072
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 6144,
+           "byteOffset": 17301504
+         },
+         {
+           "name": "param_21",
+           "shape": [
+             3072
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 6144,
+           "byteOffset": 17307648
+         },
+         {
+           "name": "param_22",
+           "shape": [
+             9216,
+             384
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 14155776,
+           "byteOffset": 17313792
+         },
+         {
+           "name": "param_23",
+           "shape": [
+             9216,
+             96
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 1769472,
+           "byteOffset": 31469568
+         }
+       ]
+     },
+     {
+       "dataPath": "params_shard_6.bin",
+       "format": "raw-shard",
+       "nbytes": 30474240,
+       "records": [
+         {
+           "name": "param_24",
+           "shape": [
+             3072,
+             384
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 4718592,
+           "byteOffset": 0
+         },
+         {
+           "name": "param_25",
+           "shape": [
+             3072,
+             96
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 589824,
+           "byteOffset": 4718592
+         },
+         {
+           "name": "param_26",
+           "shape": [
+             16384,
+             384
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 25165824,
+           "byteOffset": 5308416
+         }
+       ]
+     },
+     {
+       "dataPath": "params_shard_7.bin",
+       "format": "raw-shard",
+       "nbytes": 33239040,
+       "records": [
+         {
+           "name": "param_27",
+           "shape": [
+             16384,
+             96
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 3145728,
+           "byteOffset": 0
+         },
+         {
+           "name": "param_28",
+           "shape": [
+             3072,
+             1024
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 12582912,
+           "byteOffset": 3145728
+         },
+         {
+           "name": "param_29",
+           "shape": [
+             3072,
+             256
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 1572864,
+           "byteOffset": 15728640
+         },
+         {
+           "name": "param_30",
+           "shape": [
+             3072
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 6144,
+           "byteOffset": 17301504
+         },
+         {
+           "name": "param_31",
+           "shape": [
+             3072
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 6144,
+           "byteOffset": 17307648
+         },
+         {
+           "name": "param_32",
+           "shape": [
+             9216,
+             384
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 14155776,
+           "byteOffset": 17313792
+         },
+         {
+           "name": "param_33",
+           "shape": [
+             9216,
+             96
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 1769472,
+           "byteOffset": 31469568
+         }
+       ]
+     },
+     {
+       "dataPath": "params_shard_8.bin",
+       "format": "raw-shard",
+       "nbytes": 30474240,
+       "records": [
+         {
+           "name": "param_34",
+           "shape": [
+             3072,
+             384
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 4718592,
+           "byteOffset": 0
+         },
+         {
+           "name": "param_35",
+           "shape": [
+             3072,
+             96
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 589824,
+           "byteOffset": 4718592
+         },
+         {
+           "name": "param_36",
+           "shape": [
+             16384,
+             384
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 25165824,
+           "byteOffset": 5308416
+         }
+       ]
+     },
+     {
+       "dataPath": "params_shard_9.bin",
+       "format": "raw-shard",
+       "nbytes": 33239040,
+       "records": [
+         {
+           "name": "param_37",
+           "shape": [
+             16384,
+             96
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 3145728,
+           "byteOffset": 0
+         },
+         {
+           "name": "param_38",
+           "shape": [
+             3072,
+             1024
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 12582912,
+           "byteOffset": 3145728
+         },
+         {
+           "name": "param_39",
+           "shape": [
+             3072,
+             256
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 1572864,
+           "byteOffset": 15728640
+         },
+         {
+           "name": "param_40",
+           "shape": [
+             3072
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 6144,
+           "byteOffset": 17301504
+         },
+         {
+           "name": "param_41",
+           "shape": [
+             3072
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 6144,
+           "byteOffset": 17307648
+         },
+         {
+           "name": "param_42",
+           "shape": [
+             9216,
+             384
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 14155776,
+           "byteOffset": 17313792
+         },
+         {
+           "name": "param_43",
+           "shape": [
+             9216,
+             96
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 1769472,
+           "byteOffset": 31469568
+         }
+       ]
+     },
+     {
+       "dataPath": "params_shard_10.bin",
+       "format": "raw-shard",
+       "nbytes": 30474240,
+       "records": [
+         {
+           "name": "param_44",
+           "shape": [
+             3072,
+             384
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 4718592,
+           "byteOffset": 0
+         },
+         {
+           "name": "param_45",
+           "shape": [
+             3072,
+             96
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 589824,
+           "byteOffset": 4718592
+         },
+         {
+           "name": "param_46",
+           "shape": [
+             16384,
+             384
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 25165824,
+           "byteOffset": 5308416
+         }
+       ]
+     },
+     {
+       "dataPath": "params_shard_11.bin",
+       "format": "raw-shard",
+       "nbytes": 33239040,
+       "records": [
+         {
+           "name": "param_47",
+           "shape": [
+             16384,
+             96
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 3145728,
+           "byteOffset": 0
+         },
+         {
+           "name": "param_48",
+           "shape": [
+             3072,
+             1024
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 12582912,
+           "byteOffset": 3145728
+         },
+         {
+           "name": "param_49",
+           "shape": [
+             3072,
+             256
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 1572864,
+           "byteOffset": 15728640
+         },
+         {
+           "name": "param_50",
+           "shape": [
+             3072
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 6144,
+           "byteOffset": 17301504
+         },
+         {
+           "name": "param_51",
+           "shape": [
+             3072
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 6144,
+           "byteOffset": 17307648
+         },
+         {
+           "name": "param_52",
+           "shape": [
+             9216,
+             384
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 14155776,
+           "byteOffset": 17313792
+         },
+         {
+           "name": "param_53",
+           "shape": [
+             9216,
+             96
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 1769472,
+           "byteOffset": 31469568
+         }
+       ]
+     },
+     {
+       "dataPath": "params_shard_12.bin",
+       "format": "raw-shard",
+       "nbytes": 30474240,
+       "records": [
+         {
+           "name": "param_54",
+           "shape": [
+             3072,
+             384
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 4718592,
+           "byteOffset": 0
+         },
+         {
+           "name": "param_55",
+           "shape": [
+             3072,
+             96
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 589824,
+           "byteOffset": 4718592
+         },
+         {
+           "name": "param_56",
+           "shape": [
+             16384,
+             384
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 25165824,
+           "byteOffset": 5308416
+         }
+       ]
+     },
+     {
+       "dataPath": "params_shard_13.bin",
+       "format": "raw-shard",
+       "nbytes": 33239040,
+       "records": [
+         {
+           "name": "param_57",
+           "shape": [
+             16384,
+             96
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 3145728,
+           "byteOffset": 0
+         },
+         {
+           "name": "param_58",
+           "shape": [
+             3072,
+             1024
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 12582912,
+           "byteOffset": 3145728
+         },
+         {
+           "name": "param_59",
+           "shape": [
+             3072,
+             256
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 1572864,
+           "byteOffset": 15728640
+         },
+         {
+           "name": "param_60",
+           "shape": [
+             3072
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 6144,
+           "byteOffset": 17301504
+         },
+         {
+           "name": "param_61",
+           "shape": [
+             3072
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 6144,
+           "byteOffset": 17307648
+         },
+         {
+           "name": "param_62",
+           "shape": [
+             9216,
+             384
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 14155776,
+           "byteOffset": 17313792
+         },
+         {
+           "name": "param_63",
+           "shape": [
+             9216,
+             96
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 1769472,
+           "byteOffset": 31469568
+         }
+       ]
+     },
+     {
+       "dataPath": "params_shard_14.bin",
+       "format": "raw-shard",
+       "nbytes": 30474240,
+       "records": [
+         {
+           "name": "param_64",
+           "shape": [
+             3072,
+             384
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 4718592,
+           "byteOffset": 0
+         },
+         {
+           "name": "param_65",
+           "shape": [
+             3072,
+             96
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 589824,
+           "byteOffset": 4718592
+         },
+         {
+           "name": "param_66",
+           "shape": [
+             16384,
+             384
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 25165824,
+           "byteOffset": 5308416
+         }
+       ]
+     },
+     {
+       "dataPath": "params_shard_15.bin",
+       "format": "raw-shard",
+       "nbytes": 33239040,
+       "records": [
+         {
+           "name": "param_67",
+           "shape": [
+             16384,
+             96
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 3145728,
+           "byteOffset": 0
+         },
+         {
+           "name": "param_68",
+           "shape": [
+             3072,
+             1024
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 12582912,
+           "byteOffset": 3145728
+         },
+         {
+           "name": "param_69",
+           "shape": [
+             3072,
+             256
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 1572864,
+           "byteOffset": 15728640
+         },
+         {
+           "name": "param_70",
+           "shape": [
+             3072
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 6144,
+           "byteOffset": 17301504
+         },
+         {
+           "name": "param_71",
+           "shape": [
+             3072
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 6144,
+           "byteOffset": 17307648
+         },
+         {
+           "name": "param_72",
+           "shape": [
+             9216,
+             384
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 14155776,
+           "byteOffset": 17313792
+         },
+         {
+           "name": "param_73",
+           "shape": [
+             9216,
+             96
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 1769472,
+           "byteOffset": 31469568
+         }
+       ]
+     },
+     {
+       "dataPath": "params_shard_16.bin",
+       "format": "raw-shard",
+       "nbytes": 30474240,
+       "records": [
+         {
+           "name": "param_74",
+           "shape": [
+             3072,
+             384
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 4718592,
+           "byteOffset": 0
+         },
+         {
+           "name": "param_75",
+           "shape": [
+             3072,
+             96
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 589824,
+           "byteOffset": 4718592
+         },
+         {
+           "name": "param_76",
+           "shape": [
+             16384,
+             384
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 25165824,
+           "byteOffset": 5308416
+         }
+       ]
+     },
+     {
+       "dataPath": "params_shard_17.bin",
+       "format": "raw-shard",
+       "nbytes": 33239040,
+       "records": [
+         {
+           "name": "param_77",
+           "shape": [
+             16384,
+             96
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 3145728,
+           "byteOffset": 0
+         },
+         {
+           "name": "param_78",
+           "shape": [
+             3072,
+             1024
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 12582912,
+           "byteOffset": 3145728
+         },
+         {
+           "name": "param_79",
+           "shape": [
+             3072,
+             256
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 1572864,
+           "byteOffset": 15728640
+         },
+         {
+           "name": "param_80",
+           "shape": [
+             3072
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 6144,
+           "byteOffset": 17301504
+         },
+         {
+           "name": "param_81",
+           "shape": [
+             3072
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 6144,
+           "byteOffset": 17307648
+         },
+         {
+           "name": "param_82",
+           "shape": [
+             9216,
+             384
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 14155776,
+           "byteOffset": 17313792
+         },
+         {
+           "name": "param_83",
+           "shape": [
+             9216,
+             96
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 1769472,
+           "byteOffset": 31469568
+         }
+       ]
+     },
+     {
+       "dataPath": "params_shard_18.bin",
+       "format": "raw-shard",
+       "nbytes": 30474240,
+       "records": [
+         {
+           "name": "param_84",
+           "shape": [
+             3072,
+             384
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 4718592,
+           "byteOffset": 0
+         },
+         {
+           "name": "param_85",
+           "shape": [
+             3072,
+             96
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 589824,
+           "byteOffset": 4718592
+         },
+         {
+           "name": "param_86",
+           "shape": [
+             16384,
+             384
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 25165824,
+           "byteOffset": 5308416
+         }
+       ]
+     },
+     {
+       "dataPath": "params_shard_19.bin",
+       "format": "raw-shard",
+       "nbytes": 33239040,
+       "records": [
+         {
+           "name": "param_87",
+           "shape": [
+             16384,
+             96
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 3145728,
+           "byteOffset": 0
+         },
+         {
+           "name": "param_88",
+           "shape": [
+             3072,
+             1024
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 12582912,
+           "byteOffset": 3145728
+         },
+         {
+           "name": "param_89",
+           "shape": [
+             3072,
+             256
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 1572864,
+           "byteOffset": 15728640
+         },
+         {
+           "name": "param_90",
+           "shape": [
+             3072
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 6144,
+           "byteOffset": 17301504
+         },
+         {
+           "name": "param_91",
+           "shape": [
+             3072
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 6144,
+           "byteOffset": 17307648
+         },
+         {
+           "name": "param_92",
+           "shape": [
+             9216,
+             384
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 14155776,
+           "byteOffset": 17313792
+         },
+         {
+           "name": "param_93",
+           "shape": [
+             9216,
+             96
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 1769472,
+           "byteOffset": 31469568
+         }
+       ]
+     },
+     {
+       "dataPath": "params_shard_20.bin",
+       "format": "raw-shard",
+       "nbytes": 30474240,
+       "records": [
+         {
+           "name": "param_94",
+           "shape": [
+             3072,
+             384
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 4718592,
+           "byteOffset": 0
+         },
+         {
+           "name": "param_95",
+           "shape": [
+             3072,
+             96
+           ],
+           "dtype": "float16",
+           "format": "raw",
+           "nbytes": 589824,
+           "byteOffset": 4718592
+         },
+         {
+           "name": "param_96",
+           "shape": [
+             16384,
+             384
+           ],
+           "dtype": "uint32",
+           "format": "raw",
+           "nbytes": 25165824,
+           "byteOffset": 5308416
+         }
+       ]
+     },
1202
+ {
1203
+ "dataPath": "params_shard_21.bin",
1204
+ "format": "raw-shard",
1205
+ "nbytes": 33239040,
1206
+ "records": [
1207
+ {
1208
+ "name": "param_97",
1209
+ "shape": [
1210
+ 16384,
1211
+ 96
1212
+ ],
1213
+ "dtype": "float16",
1214
+ "format": "raw",
1215
+ "nbytes": 3145728,
1216
+ "byteOffset": 0
1217
+ },
1218
+ {
1219
+ "name": "param_98",
1220
+ "shape": [
1221
+ 3072,
1222
+ 1024
1223
+ ],
1224
+ "dtype": "uint32",
1225
+ "format": "raw",
1226
+ "nbytes": 12582912,
1227
+ "byteOffset": 3145728
1228
+ },
1229
+ {
1230
+ "name": "param_99",
1231
+ "shape": [
1232
+ 3072,
1233
+ 256
1234
+ ],
1235
+ "dtype": "float16",
1236
+ "format": "raw",
1237
+ "nbytes": 1572864,
1238
+ "byteOffset": 15728640
1239
+ },
1240
+ {
1241
+ "name": "param_100",
1242
+ "shape": [
1243
+ 3072
1244
+ ],
1245
+ "dtype": "float16",
1246
+ "format": "raw",
1247
+ "nbytes": 6144,
1248
+ "byteOffset": 17301504
1249
+ },
1250
+ {
1251
+ "name": "param_101",
1252
+ "shape": [
1253
+ 3072
1254
+ ],
1255
+ "dtype": "float16",
1256
+ "format": "raw",
1257
+ "nbytes": 6144,
1258
+ "byteOffset": 17307648
1259
+ },
1260
+ {
1261
+ "name": "param_102",
1262
+ "shape": [
1263
+ 9216,
1264
+ 384
1265
+ ],
1266
+ "dtype": "uint32",
1267
+ "format": "raw",
1268
+ "nbytes": 14155776,
1269
+ "byteOffset": 17313792
1270
+ },
1271
+ {
1272
+ "name": "param_103",
1273
+ "shape": [
1274
+ 9216,
1275
+ 96
1276
+ ],
1277
+ "dtype": "float16",
1278
+ "format": "raw",
1279
+ "nbytes": 1769472,
1280
+ "byteOffset": 31469568
1281
+ }
1282
+ ]
1283
+ },
1284
+ {
1285
+ "dataPath": "params_shard_22.bin",
1286
+ "format": "raw-shard",
1287
+ "nbytes": 30474240,
1288
+ "records": [
1289
+ {
1290
+ "name": "param_104",
1291
+ "shape": [
1292
+ 3072,
1293
+ 384
1294
+ ],
1295
+ "dtype": "uint32",
1296
+ "format": "raw",
1297
+ "nbytes": 4718592,
1298
+ "byteOffset": 0
1299
+ },
1300
+ {
1301
+ "name": "param_105",
1302
+ "shape": [
1303
+ 3072,
1304
+ 96
1305
+ ],
1306
+ "dtype": "float16",
1307
+ "format": "raw",
1308
+ "nbytes": 589824,
1309
+ "byteOffset": 4718592
1310
+ },
1311
+ {
1312
+ "name": "param_106",
1313
+ "shape": [
1314
+ 16384,
1315
+ 384
1316
+ ],
1317
+ "dtype": "uint32",
1318
+ "format": "raw",
1319
+ "nbytes": 25165824,
1320
+ "byteOffset": 5308416
1321
+ }
1322
+ ]
1323
+ },
1324
+ {
1325
+ "dataPath": "params_shard_23.bin",
1326
+ "format": "raw-shard",
1327
+ "nbytes": 33239040,
1328
+ "records": [
1329
+ {
1330
+ "name": "param_107",
1331
+ "shape": [
1332
+ 16384,
1333
+ 96
1334
+ ],
1335
+ "dtype": "float16",
1336
+ "format": "raw",
1337
+ "nbytes": 3145728,
1338
+ "byteOffset": 0
1339
+ },
1340
+ {
1341
+ "name": "param_108",
1342
+ "shape": [
1343
+ 3072,
1344
+ 1024
1345
+ ],
1346
+ "dtype": "uint32",
1347
+ "format": "raw",
1348
+ "nbytes": 12582912,
1349
+ "byteOffset": 3145728
1350
+ },
1351
+ {
1352
+ "name": "param_109",
1353
+ "shape": [
1354
+ 3072,
1355
+ 256
1356
+ ],
1357
+ "dtype": "float16",
1358
+ "format": "raw",
1359
+ "nbytes": 1572864,
1360
+ "byteOffset": 15728640
1361
+ },
1362
+ {
1363
+ "name": "param_110",
1364
+ "shape": [
1365
+ 3072
1366
+ ],
1367
+ "dtype": "float16",
1368
+ "format": "raw",
1369
+ "nbytes": 6144,
1370
+ "byteOffset": 17301504
1371
+ },
1372
+ {
1373
+ "name": "param_111",
1374
+ "shape": [
1375
+ 3072
1376
+ ],
1377
+ "dtype": "float16",
1378
+ "format": "raw",
1379
+ "nbytes": 6144,
1380
+ "byteOffset": 17307648
1381
+ },
1382
+ {
1383
+ "name": "param_112",
1384
+ "shape": [
1385
+ 9216,
1386
+ 384
1387
+ ],
1388
+ "dtype": "uint32",
1389
+ "format": "raw",
1390
+ "nbytes": 14155776,
1391
+ "byteOffset": 17313792
1392
+ },
1393
+ {
1394
+ "name": "param_113",
1395
+ "shape": [
1396
+ 9216,
1397
+ 96
1398
+ ],
1399
+ "dtype": "float16",
1400
+ "format": "raw",
1401
+ "nbytes": 1769472,
1402
+ "byteOffset": 31469568
1403
+ }
1404
+ ]
1405
+ },
1406
+ {
1407
+ "dataPath": "params_shard_24.bin",
1408
+ "format": "raw-shard",
1409
+ "nbytes": 30474240,
1410
+ "records": [
1411
+ {
1412
+ "name": "param_114",
1413
+ "shape": [
1414
+ 3072,
1415
+ 384
1416
+ ],
1417
+ "dtype": "uint32",
1418
+ "format": "raw",
1419
+ "nbytes": 4718592,
1420
+ "byteOffset": 0
1421
+ },
1422
+ {
1423
+ "name": "param_115",
1424
+ "shape": [
1425
+ 3072,
1426
+ 96
1427
+ ],
1428
+ "dtype": "float16",
1429
+ "format": "raw",
1430
+ "nbytes": 589824,
1431
+ "byteOffset": 4718592
1432
+ },
1433
+ {
1434
+ "name": "param_116",
1435
+ "shape": [
1436
+ 16384,
1437
+ 384
1438
+ ],
1439
+ "dtype": "uint32",
1440
+ "format": "raw",
1441
+ "nbytes": 25165824,
1442
+ "byteOffset": 5308416
1443
+ }
1444
+ ]
1445
+ },
1446
+ {
1447
+ "dataPath": "params_shard_25.bin",
1448
+ "format": "raw-shard",
1449
+ "nbytes": 33239040,
1450
+ "records": [
1451
+ {
1452
+ "name": "param_117",
1453
+ "shape": [
1454
+ 16384,
1455
+ 96
1456
+ ],
1457
+ "dtype": "float16",
1458
+ "format": "raw",
1459
+ "nbytes": 3145728,
1460
+ "byteOffset": 0
1461
+ },
1462
+ {
1463
+ "name": "param_118",
1464
+ "shape": [
1465
+ 3072,
1466
+ 1024
1467
+ ],
1468
+ "dtype": "uint32",
1469
+ "format": "raw",
1470
+ "nbytes": 12582912,
1471
+ "byteOffset": 3145728
1472
+ },
1473
+ {
1474
+ "name": "param_119",
1475
+ "shape": [
1476
+ 3072,
1477
+ 256
1478
+ ],
1479
+ "dtype": "float16",
1480
+ "format": "raw",
1481
+ "nbytes": 1572864,
1482
+ "byteOffset": 15728640
1483
+ },
1484
+ {
1485
+ "name": "param_120",
1486
+ "shape": [
1487
+ 3072
1488
+ ],
1489
+ "dtype": "float16",
1490
+ "format": "raw",
1491
+ "nbytes": 6144,
1492
+ "byteOffset": 17301504
1493
+ },
1494
+ {
1495
+ "name": "param_121",
1496
+ "shape": [
1497
+ 3072
1498
+ ],
1499
+ "dtype": "float16",
1500
+ "format": "raw",
1501
+ "nbytes": 6144,
1502
+ "byteOffset": 17307648
1503
+ },
1504
+ {
1505
+ "name": "param_122",
1506
+ "shape": [
1507
+ 9216,
1508
+ 384
1509
+ ],
1510
+ "dtype": "uint32",
1511
+ "format": "raw",
1512
+ "nbytes": 14155776,
1513
+ "byteOffset": 17313792
1514
+ },
1515
+ {
1516
+ "name": "param_123",
1517
+ "shape": [
1518
+ 9216,
1519
+ 96
1520
+ ],
1521
+ "dtype": "float16",
1522
+ "format": "raw",
1523
+ "nbytes": 1769472,
1524
+ "byteOffset": 31469568
1525
+ }
1526
+ ]
1527
+ },
1528
+ {
1529
+ "dataPath": "params_shard_26.bin",
1530
+ "format": "raw-shard",
1531
+ "nbytes": 30474240,
1532
+ "records": [
1533
+ {
1534
+ "name": "param_124",
1535
+ "shape": [
1536
+ 3072,
1537
+ 384
1538
+ ],
1539
+ "dtype": "uint32",
1540
+ "format": "raw",
1541
+ "nbytes": 4718592,
1542
+ "byteOffset": 0
1543
+ },
1544
+ {
1545
+ "name": "param_125",
1546
+ "shape": [
1547
+ 3072,
1548
+ 96
1549
+ ],
1550
+ "dtype": "float16",
1551
+ "format": "raw",
1552
+ "nbytes": 589824,
1553
+ "byteOffset": 4718592
1554
+ },
1555
+ {
1556
+ "name": "param_126",
1557
+ "shape": [
1558
+ 16384,
1559
+ 384
1560
+ ],
1561
+ "dtype": "uint32",
1562
+ "format": "raw",
1563
+ "nbytes": 25165824,
1564
+ "byteOffset": 5308416
1565
+ }
1566
+ ]
1567
+ },
1568
+ {
1569
+ "dataPath": "params_shard_27.bin",
1570
+ "format": "raw-shard",
1571
+ "nbytes": 33239040,
1572
+ "records": [
1573
+ {
1574
+ "name": "param_127",
1575
+ "shape": [
1576
+ 16384,
1577
+ 96
1578
+ ],
1579
+ "dtype": "float16",
1580
+ "format": "raw",
1581
+ "nbytes": 3145728,
1582
+ "byteOffset": 0
1583
+ },
1584
+ {
1585
+ "name": "param_128",
1586
+ "shape": [
1587
+ 3072,
1588
+ 1024
1589
+ ],
1590
+ "dtype": "uint32",
1591
+ "format": "raw",
1592
+ "nbytes": 12582912,
1593
+ "byteOffset": 3145728
1594
+ },
1595
+ {
1596
+ "name": "param_129",
1597
+ "shape": [
1598
+ 3072,
1599
+ 256
1600
+ ],
1601
+ "dtype": "float16",
1602
+ "format": "raw",
1603
+ "nbytes": 1572864,
1604
+ "byteOffset": 15728640
1605
+ },
1606
+ {
1607
+ "name": "param_130",
1608
+ "shape": [
1609
+ 3072
1610
+ ],
1611
+ "dtype": "float16",
1612
+ "format": "raw",
1613
+ "nbytes": 6144,
1614
+ "byteOffset": 17301504
1615
+ },
1616
+ {
1617
+ "name": "param_131",
1618
+ "shape": [
1619
+ 3072
1620
+ ],
1621
+ "dtype": "float16",
1622
+ "format": "raw",
1623
+ "nbytes": 6144,
1624
+ "byteOffset": 17307648
1625
+ },
1626
+ {
1627
+ "name": "param_132",
1628
+ "shape": [
1629
+ 9216,
1630
+ 384
1631
+ ],
1632
+ "dtype": "uint32",
1633
+ "format": "raw",
1634
+ "nbytes": 14155776,
1635
+ "byteOffset": 17313792
1636
+ },
1637
+ {
1638
+ "name": "param_133",
1639
+ "shape": [
1640
+ 9216,
1641
+ 96
1642
+ ],
1643
+ "dtype": "float16",
1644
+ "format": "raw",
1645
+ "nbytes": 1769472,
1646
+ "byteOffset": 31469568
1647
+ }
1648
+ ]
1649
+ },
1650
+ {
1651
+ "dataPath": "params_shard_28.bin",
1652
+ "format": "raw-shard",
1653
+ "nbytes": 30474240,
1654
+ "records": [
1655
+ {
1656
+ "name": "param_134",
1657
+ "shape": [
1658
+ 3072,
1659
+ 384
1660
+ ],
1661
+ "dtype": "uint32",
1662
+ "format": "raw",
1663
+ "nbytes": 4718592,
1664
+ "byteOffset": 0
1665
+ },
1666
+ {
1667
+ "name": "param_135",
1668
+ "shape": [
1669
+ 3072,
1670
+ 96
1671
+ ],
1672
+ "dtype": "float16",
1673
+ "format": "raw",
1674
+ "nbytes": 589824,
1675
+ "byteOffset": 4718592
1676
+ },
1677
+ {
1678
+ "name": "param_136",
1679
+ "shape": [
1680
+ 16384,
1681
+ 384
1682
+ ],
1683
+ "dtype": "uint32",
1684
+ "format": "raw",
1685
+ "nbytes": 25165824,
1686
+ "byteOffset": 5308416
1687
+ }
1688
+ ]
1689
+ },
1690
+ {
1691
+ "dataPath": "params_shard_29.bin",
1692
+ "format": "raw-shard",
1693
+ "nbytes": 33239040,
1694
+ "records": [
1695
+ {
1696
+ "name": "param_137",
1697
+ "shape": [
1698
+ 16384,
1699
+ 96
1700
+ ],
1701
+ "dtype": "float16",
1702
+ "format": "raw",
1703
+ "nbytes": 3145728,
1704
+ "byteOffset": 0
1705
+ },
1706
+ {
1707
+ "name": "param_138",
1708
+ "shape": [
1709
+ 3072,
1710
+ 1024
1711
+ ],
1712
+ "dtype": "uint32",
1713
+ "format": "raw",
1714
+ "nbytes": 12582912,
1715
+ "byteOffset": 3145728
1716
+ },
1717
+ {
1718
+ "name": "param_139",
1719
+ "shape": [
1720
+ 3072,
1721
+ 256
1722
+ ],
1723
+ "dtype": "float16",
1724
+ "format": "raw",
1725
+ "nbytes": 1572864,
1726
+ "byteOffset": 15728640
1727
+ },
1728
+ {
1729
+ "name": "param_140",
1730
+ "shape": [
1731
+ 3072
1732
+ ],
1733
+ "dtype": "float16",
1734
+ "format": "raw",
1735
+ "nbytes": 6144,
1736
+ "byteOffset": 17301504
1737
+ },
1738
+ {
1739
+ "name": "param_141",
1740
+ "shape": [
1741
+ 3072
1742
+ ],
1743
+ "dtype": "float16",
1744
+ "format": "raw",
1745
+ "nbytes": 6144,
1746
+ "byteOffset": 17307648
1747
+ },
1748
+ {
1749
+ "name": "param_142",
1750
+ "shape": [
1751
+ 9216,
1752
+ 384
1753
+ ],
1754
+ "dtype": "uint32",
1755
+ "format": "raw",
1756
+ "nbytes": 14155776,
1757
+ "byteOffset": 17313792
1758
+ },
1759
+ {
1760
+ "name": "param_143",
1761
+ "shape": [
1762
+ 9216,
1763
+ 96
1764
+ ],
1765
+ "dtype": "float16",
1766
+ "format": "raw",
1767
+ "nbytes": 1769472,
1768
+ "byteOffset": 31469568
1769
+ }
1770
+ ]
1771
+ },
1772
+ {
1773
+ "dataPath": "params_shard_30.bin",
1774
+ "format": "raw-shard",
1775
+ "nbytes": 30474240,
1776
+ "records": [
1777
+ {
1778
+ "name": "param_144",
1779
+ "shape": [
1780
+ 3072,
1781
+ 384
1782
+ ],
1783
+ "dtype": "uint32",
1784
+ "format": "raw",
1785
+ "nbytes": 4718592,
1786
+ "byteOffset": 0
1787
+ },
1788
+ {
1789
+ "name": "param_145",
1790
+ "shape": [
1791
+ 3072,
1792
+ 96
1793
+ ],
1794
+ "dtype": "float16",
1795
+ "format": "raw",
1796
+ "nbytes": 589824,
1797
+ "byteOffset": 4718592
1798
+ },
1799
+ {
1800
+ "name": "param_146",
1801
+ "shape": [
1802
+ 16384,
1803
+ 384
1804
+ ],
1805
+ "dtype": "uint32",
1806
+ "format": "raw",
1807
+ "nbytes": 25165824,
1808
+ "byteOffset": 5308416
1809
+ }
1810
+ ]
1811
+ },
1812
+ {
1813
+ "dataPath": "params_shard_31.bin",
1814
+ "format": "raw-shard",
1815
+ "nbytes": 33239040,
1816
+ "records": [
1817
+ {
1818
+ "name": "param_147",
1819
+ "shape": [
1820
+ 16384,
1821
+ 96
1822
+ ],
1823
+ "dtype": "float16",
1824
+ "format": "raw",
1825
+ "nbytes": 3145728,
1826
+ "byteOffset": 0
1827
+ },
1828
+ {
1829
+ "name": "param_148",
1830
+ "shape": [
1831
+ 3072,
1832
+ 1024
1833
+ ],
1834
+ "dtype": "uint32",
1835
+ "format": "raw",
1836
+ "nbytes": 12582912,
1837
+ "byteOffset": 3145728
1838
+ },
1839
+ {
1840
+ "name": "param_149",
1841
+ "shape": [
1842
+ 3072,
1843
+ 256
1844
+ ],
1845
+ "dtype": "float16",
1846
+ "format": "raw",
1847
+ "nbytes": 1572864,
1848
+ "byteOffset": 15728640
1849
+ },
1850
+ {
1851
+ "name": "param_150",
1852
+ "shape": [
1853
+ 3072
1854
+ ],
1855
+ "dtype": "float16",
1856
+ "format": "raw",
1857
+ "nbytes": 6144,
1858
+ "byteOffset": 17301504
1859
+ },
1860
+ {
1861
+ "name": "param_151",
1862
+ "shape": [
1863
+ 3072
1864
+ ],
1865
+ "dtype": "float16",
1866
+ "format": "raw",
1867
+ "nbytes": 6144,
1868
+ "byteOffset": 17307648
1869
+ },
1870
+ {
1871
+ "name": "param_152",
1872
+ "shape": [
1873
+ 9216,
1874
+ 384
1875
+ ],
1876
+ "dtype": "uint32",
1877
+ "format": "raw",
1878
+ "nbytes": 14155776,
1879
+ "byteOffset": 17313792
1880
+ },
1881
+ {
1882
+ "name": "param_153",
1883
+ "shape": [
1884
+ 9216,
1885
+ 96
1886
+ ],
1887
+ "dtype": "float16",
1888
+ "format": "raw",
1889
+ "nbytes": 1769472,
1890
+ "byteOffset": 31469568
1891
+ }
1892
+ ]
1893
+ },
1894
+ {
1895
+ "dataPath": "params_shard_32.bin",
1896
+ "format": "raw-shard",
1897
+ "nbytes": 30474240,
1898
+ "records": [
1899
+ {
1900
+ "name": "param_154",
1901
+ "shape": [
1902
+ 3072,
1903
+ 384
1904
+ ],
1905
+ "dtype": "uint32",
1906
+ "format": "raw",
1907
+ "nbytes": 4718592,
1908
+ "byteOffset": 0
1909
+ },
1910
+ {
1911
+ "name": "param_155",
1912
+ "shape": [
1913
+ 3072,
1914
+ 96
1915
+ ],
1916
+ "dtype": "float16",
1917
+ "format": "raw",
1918
+ "nbytes": 589824,
1919
+ "byteOffset": 4718592
1920
+ },
1921
+ {
1922
+ "name": "param_156",
1923
+ "shape": [
1924
+ 16384,
1925
+ 384
1926
+ ],
1927
+ "dtype": "uint32",
1928
+ "format": "raw",
1929
+ "nbytes": 25165824,
1930
+ "byteOffset": 5308416
1931
+ }
1932
+ ]
1933
+ },
1934
+ {
1935
+ "dataPath": "params_shard_33.bin",
1936
+ "format": "raw-shard",
1937
+ "nbytes": 33239040,
1938
+ "records": [
1939
+ {
1940
+ "name": "param_157",
1941
+ "shape": [
1942
+ 16384,
1943
+ 96
1944
+ ],
1945
+ "dtype": "float16",
1946
+ "format": "raw",
1947
+ "nbytes": 3145728,
1948
+ "byteOffset": 0
1949
+ },
1950
+ {
1951
+ "name": "param_158",
1952
+ "shape": [
1953
+ 3072,
1954
+ 1024
1955
+ ],
1956
+ "dtype": "uint32",
1957
+ "format": "raw",
1958
+ "nbytes": 12582912,
1959
+ "byteOffset": 3145728
1960
+ },
1961
+ {
1962
+ "name": "param_159",
1963
+ "shape": [
1964
+ 3072,
1965
+ 256
1966
+ ],
1967
+ "dtype": "float16",
1968
+ "format": "raw",
1969
+ "nbytes": 1572864,
1970
+ "byteOffset": 15728640
1971
+ },
1972
+ {
1973
+ "name": "param_160",
1974
+ "shape": [
1975
+ 3072
1976
+ ],
1977
+ "dtype": "float16",
1978
+ "format": "raw",
1979
+ "nbytes": 6144,
1980
+ "byteOffset": 17301504
1981
+ },
1982
+ {
1983
+ "name": "param_161",
1984
+ "shape": [
1985
+ 3072
1986
+ ],
1987
+ "dtype": "float16",
1988
+ "format": "raw",
1989
+ "nbytes": 6144,
1990
+ "byteOffset": 17307648
1991
+ },
1992
+ {
1993
+ "name": "param_162",
1994
+ "shape": [
1995
+ 9216,
1996
+ 384
1997
+ ],
1998
+ "dtype": "uint32",
1999
+ "format": "raw",
2000
+ "nbytes": 14155776,
2001
+ "byteOffset": 17313792
2002
+ },
2003
+ {
2004
+ "name": "param_163",
2005
+ "shape": [
2006
+ 9216,
2007
+ 96
2008
+ ],
2009
+ "dtype": "float16",
2010
+ "format": "raw",
2011
+ "nbytes": 1769472,
2012
+ "byteOffset": 31469568
2013
+ }
2014
+ ]
2015
+ },
2016
+ {
2017
+ "dataPath": "params_shard_34.bin",
2018
+ "format": "raw-shard",
2019
+ "nbytes": 30474240,
2020
+ "records": [
2021
+ {
2022
+ "name": "param_164",
2023
+ "shape": [
2024
+ 3072,
2025
+ 384
2026
+ ],
2027
+ "dtype": "uint32",
2028
+ "format": "raw",
2029
+ "nbytes": 4718592,
2030
+ "byteOffset": 0
2031
+ },
2032
+ {
2033
+ "name": "param_165",
2034
+ "shape": [
2035
+ 3072,
2036
+ 96
2037
+ ],
2038
+ "dtype": "float16",
2039
+ "format": "raw",
2040
+ "nbytes": 589824,
2041
+ "byteOffset": 4718592
2042
+ },
2043
+ {
2044
+ "name": "param_166",
2045
+ "shape": [
2046
+ 16384,
2047
+ 384
2048
+ ],
2049
+ "dtype": "uint32",
2050
+ "format": "raw",
2051
+ "nbytes": 25165824,
2052
+ "byteOffset": 5308416
2053
+ }
2054
+ ]
2055
+ },
2056
+ {
2057
+ "dataPath": "params_shard_35.bin",
2058
+ "format": "raw-shard",
2059
+ "nbytes": 33239040,
2060
+ "records": [
2061
+ {
2062
+ "name": "param_167",
2063
+ "shape": [
2064
+ 16384,
2065
+ 96
2066
+ ],
2067
+ "dtype": "float16",
2068
+ "format": "raw",
2069
+ "nbytes": 3145728,
2070
+ "byteOffset": 0
2071
+ },
2072
+ {
2073
+ "name": "param_168",
2074
+ "shape": [
2075
+ 3072,
2076
+ 1024
2077
+ ],
2078
+ "dtype": "uint32",
2079
+ "format": "raw",
2080
+ "nbytes": 12582912,
2081
+ "byteOffset": 3145728
2082
+ },
2083
+ {
2084
+ "name": "param_169",
2085
+ "shape": [
2086
+ 3072,
2087
+ 256
2088
+ ],
2089
+ "dtype": "float16",
2090
+ "format": "raw",
2091
+ "nbytes": 1572864,
2092
+ "byteOffset": 15728640
2093
+ },
2094
+ {
2095
+ "name": "param_170",
2096
+ "shape": [
2097
+ 3072
2098
+ ],
2099
+ "dtype": "float16",
2100
+ "format": "raw",
2101
+ "nbytes": 6144,
2102
+ "byteOffset": 17301504
2103
+ },
2104
+ {
2105
+ "name": "param_171",
2106
+ "shape": [
2107
+ 3072
2108
+ ],
2109
+ "dtype": "float16",
2110
+ "format": "raw",
2111
+ "nbytes": 6144,
2112
+ "byteOffset": 17307648
2113
+ },
2114
+ {
2115
+ "name": "param_172",
2116
+ "shape": [
2117
+ 9216,
2118
+ 384
2119
+ ],
2120
+ "dtype": "uint32",
2121
+ "format": "raw",
2122
+ "nbytes": 14155776,
2123
+ "byteOffset": 17313792
2124
+ },
2125
+ {
2126
+ "name": "param_173",
2127
+ "shape": [
2128
+ 9216,
2129
+ 96
2130
+ ],
2131
+ "dtype": "float16",
2132
+ "format": "raw",
2133
+ "nbytes": 1769472,
2134
+ "byteOffset": 31469568
2135
+ }
2136
+ ]
2137
+ },
2138
+ {
2139
+ "dataPath": "params_shard_36.bin",
2140
+ "format": "raw-shard",
2141
+ "nbytes": 30474240,
2142
+ "records": [
2143
+ {
2144
+ "name": "param_174",
2145
+ "shape": [
2146
+ 3072,
2147
+ 384
2148
+ ],
2149
+ "dtype": "uint32",
2150
+ "format": "raw",
2151
+ "nbytes": 4718592,
2152
+ "byteOffset": 0
2153
+ },
2154
+ {
2155
+ "name": "param_175",
2156
+ "shape": [
2157
+ 3072,
2158
+ 96
2159
+ ],
2160
+ "dtype": "float16",
2161
+ "format": "raw",
2162
+ "nbytes": 589824,
2163
+ "byteOffset": 4718592
2164
+ },
2165
+ {
2166
+ "name": "param_176",
2167
+ "shape": [
2168
+ 16384,
2169
+ 384
2170
+ ],
2171
+ "dtype": "uint32",
2172
+ "format": "raw",
2173
+ "nbytes": 25165824,
2174
+ "byteOffset": 5308416
2175
+ }
2176
+ ]
2177
+ },
2178
+ {
2179
+ "dataPath": "params_shard_37.bin",
2180
+ "format": "raw-shard",
2181
+ "nbytes": 33239040,
2182
+ "records": [
2183
+ {
2184
+ "name": "param_177",
2185
+ "shape": [
2186
+ 16384,
2187
+ 96
2188
+ ],
2189
+ "dtype": "float16",
2190
+ "format": "raw",
2191
+ "nbytes": 3145728,
2192
+ "byteOffset": 0
2193
+ },
2194
+ {
2195
+ "name": "param_178",
2196
+ "shape": [
2197
+ 3072,
2198
+ 1024
2199
+ ],
2200
+ "dtype": "uint32",
2201
+ "format": "raw",
2202
+ "nbytes": 12582912,
2203
+ "byteOffset": 3145728
2204
+ },
2205
+ {
2206
+ "name": "param_179",
2207
+ "shape": [
2208
+ 3072,
2209
+ 256
2210
+ ],
2211
+ "dtype": "float16",
2212
+ "format": "raw",
2213
+ "nbytes": 1572864,
2214
+ "byteOffset": 15728640
2215
+ },
2216
+ {
2217
+ "name": "param_180",
2218
+ "shape": [
2219
+ 3072
2220
+ ],
2221
+ "dtype": "float16",
2222
+ "format": "raw",
2223
+ "nbytes": 6144,
2224
+ "byteOffset": 17301504
2225
+ },
2226
+ {
2227
+ "name": "param_181",
2228
+ "shape": [
2229
+ 3072
2230
+ ],
2231
+ "dtype": "float16",
2232
+ "format": "raw",
2233
+ "nbytes": 6144,
2234
+ "byteOffset": 17307648
2235
+ },
2236
+ {
2237
+ "name": "param_182",
2238
+ "shape": [
2239
+ 9216,
2240
+ 384
2241
+ ],
2242
+ "dtype": "uint32",
2243
+ "format": "raw",
2244
+ "nbytes": 14155776,
2245
+ "byteOffset": 17313792
2246
+ },
2247
+ {
2248
+ "name": "param_183",
2249
+ "shape": [
2250
+ 9216,
2251
+ 96
2252
+ ],
2253
+ "dtype": "float16",
2254
+ "format": "raw",
2255
+ "nbytes": 1769472,
2256
+ "byteOffset": 31469568
2257
+ }
2258
+ ]
2259
+ },
2260
+ {
2261
+ "dataPath": "params_shard_38.bin",
2262
+ "format": "raw-shard",
2263
+ "nbytes": 30474240,
2264
+ "records": [
2265
+ {
2266
+ "name": "param_184",
2267
+ "shape": [
2268
+ 3072,
2269
+ 384
2270
+ ],
2271
+ "dtype": "uint32",
2272
+ "format": "raw",
2273
+ "nbytes": 4718592,
2274
+ "byteOffset": 0
2275
+ },
2276
+ {
2277
+ "name": "param_185",
2278
+ "shape": [
2279
+ 3072,
2280
+ 96
2281
+ ],
2282
+ "dtype": "float16",
2283
+ "format": "raw",
2284
+ "nbytes": 589824,
2285
+ "byteOffset": 4718592
2286
+ },
2287
+ {
2288
+ "name": "param_186",
2289
+ "shape": [
2290
+ 16384,
2291
+ 384
2292
+ ],
2293
+ "dtype": "uint32",
2294
+ "format": "raw",
2295
+ "nbytes": 25165824,
2296
+ "byteOffset": 5308416
2297
+ }
2298
+ ]
2299
+ },
2300
+ {
2301
+ "dataPath": "params_shard_39.bin",
2302
+ "format": "raw-shard",
2303
+ "nbytes": 33239040,
2304
+ "records": [
2305
+ {
2306
+ "name": "param_187",
2307
+ "shape": [
2308
+ 16384,
2309
+ 96
2310
+ ],
2311
+ "dtype": "float16",
2312
+ "format": "raw",
2313
+ "nbytes": 3145728,
2314
+ "byteOffset": 0
2315
+ },
2316
+ {
2317
+ "name": "param_188",
2318
+ "shape": [
2319
+ 3072,
2320
+ 1024
2321
+ ],
2322
+ "dtype": "uint32",
2323
+ "format": "raw",
2324
+ "nbytes": 12582912,
2325
+ "byteOffset": 3145728
2326
+ },
2327
+ {
2328
+ "name": "param_189",
2329
+ "shape": [
2330
+ 3072,
2331
+ 256
2332
+ ],
2333
+ "dtype": "float16",
2334
+ "format": "raw",
2335
+ "nbytes": 1572864,
2336
+ "byteOffset": 15728640
2337
+ },
2338
+ {
2339
+ "name": "param_190",
2340
+ "shape": [
2341
+ 3072
2342
+ ],
2343
+ "dtype": "float16",
2344
+ "format": "raw",
2345
+ "nbytes": 6144,
2346
+ "byteOffset": 17301504
2347
+ },
2348
+ {
2349
+ "name": "param_191",
2350
+ "shape": [
2351
+ 3072
2352
+ ],
2353
+ "dtype": "float16",
2354
+ "format": "raw",
2355
+ "nbytes": 6144,
2356
+ "byteOffset": 17307648
2357
+ },
2358
+ {
2359
+ "name": "param_192",
2360
+ "shape": [
2361
+ 9216,
2362
+ 384
2363
+ ],
2364
+ "dtype": "uint32",
2365
+ "format": "raw",
2366
+ "nbytes": 14155776,
2367
+ "byteOffset": 17313792
2368
+ },
2369
+ {
2370
+ "name": "param_193",
2371
+ "shape": [
2372
+ 9216,
2373
+ 96
2374
+ ],
2375
+ "dtype": "float16",
2376
+ "format": "raw",
2377
+ "nbytes": 1769472,
2378
+ "byteOffset": 31469568
2379
+ }
2380
+ ]
2381
+ },
2382
+ {
2383
+ "dataPath": "params_shard_40.bin",
2384
+ "format": "raw-shard",
2385
+ "nbytes": 30474240,
2386
+ "records": [
2387
+ {
2388
+ "name": "param_194",
2389
+ "shape": [
2390
+ 3072,
2391
+ 384
2392
+ ],
2393
+ "dtype": "uint32",
2394
+ "format": "raw",
2395
+ "nbytes": 4718592,
2396
+ "byteOffset": 0
2397
+ },
2398
+ {
2399
+ "name": "param_195",
2400
+ "shape": [
2401
+ 3072,
2402
+ 96
2403
+ ],
2404
+ "dtype": "float16",
2405
+ "format": "raw",
2406
+ "nbytes": 589824,
2407
+ "byteOffset": 4718592
2408
+ },
2409
+ {
2410
+ "name": "param_196",
2411
+ "shape": [
2412
+ 16384,
2413
+ 384
2414
+ ],
2415
+ "dtype": "uint32",
2416
+ "format": "raw",
2417
+ "nbytes": 25165824,
2418
+ "byteOffset": 5308416
2419
+ }
2420
+ ]
2421
+ },
2422
+ {
2423
+ "dataPath": "params_shard_41.bin",
2424
+ "format": "raw-shard",
2425
+ "nbytes": 33239040,
2426
+ "records": [
2427
+ {
2428
+ "name": "param_197",
2429
+ "shape": [
2430
+ 16384,
2431
+ 96
2432
+ ],
2433
+ "dtype": "float16",
2434
+ "format": "raw",
2435
+ "nbytes": 3145728,
2436
+ "byteOffset": 0
2437
+ },
2438
+ {
2439
+ "name": "param_198",
2440
+ "shape": [
2441
+ 3072,
2442
+ 1024
2443
+ ],
2444
+ "dtype": "uint32",
2445
+ "format": "raw",
2446
+ "nbytes": 12582912,
2447
+ "byteOffset": 3145728
2448
+ },
2449
+ {
2450
+ "name": "param_199",
2451
+ "shape": [
2452
+ 3072,
2453
+ 256
2454
+ ],
2455
+ "dtype": "float16",
2456
+ "format": "raw",
2457
+ "nbytes": 1572864,
2458
+ "byteOffset": 15728640
2459
+ },
2460
+ {
2461
+ "name": "param_200",
2462
+ "shape": [
2463
+ 3072
2464
+ ],
2465
+ "dtype": "float16",
2466
+ "format": "raw",
2467
+ "nbytes": 6144,
2468
+ "byteOffset": 17301504
2469
+ },
2470
+ {
2471
+ "name": "param_201",
2472
+ "shape": [
2473
+ 3072
2474
+ ],
2475
+ "dtype": "float16",
2476
+ "format": "raw",
2477
+ "nbytes": 6144,
2478
+ "byteOffset": 17307648
2479
+ },
2480
+ {
2481
+ "name": "param_202",
2482
+ "shape": [
2483
+ 9216,
2484
+ 384
2485
+ ],
2486
+ "dtype": "uint32",
2487
+ "format": "raw",
2488
+ "nbytes": 14155776,
2489
+ "byteOffset": 17313792
2490
+ },
2491
+ {
2492
+ "name": "param_203",
2493
+ "shape": [
2494
+ 9216,
2495
+ 96
2496
+ ],
2497
+ "dtype": "float16",
2498
+ "format": "raw",
2499
+ "nbytes": 1769472,
2500
+ "byteOffset": 31469568
2501
+ }
2502
+ ]
2503
+ },
2504
+ {
2505
+ "dataPath": "params_shard_42.bin",
2506
+ "format": "raw-shard",
2507
+ "nbytes": 30474240,
2508
+ "records": [
2509
+ {
2510
+ "name": "param_204",
2511
+ "shape": [
2512
+ 3072,
2513
+ 384
2514
+ ],
2515
+ "dtype": "uint32",
2516
+ "format": "raw",
2517
+ "nbytes": 4718592,
2518
+ "byteOffset": 0
2519
+ },
2520
+ {
2521
+ "name": "param_205",
2522
+ "shape": [
2523
+ 3072,
2524
+ 96
2525
+ ],
2526
+ "dtype": "float16",
2527
+ "format": "raw",
2528
+ "nbytes": 589824,
2529
+ "byteOffset": 4718592
2530
+ },
2531
+ {
2532
+ "name": "param_206",
2533
+ "shape": [
2534
+ 16384,
2535
+ 384
2536
+ ],
2537
+ "dtype": "uint32",
2538
+ "format": "raw",
2539
+ "nbytes": 25165824,
2540
+ "byteOffset": 5308416
2541
+ }
2542
+ ]
2543
+ },
2544
+ {
2545
+ "dataPath": "params_shard_43.bin",
2546
+ "format": "raw-shard",
2547
+ "nbytes": 33239040,
2548
+ "records": [
2549
+ {
2550
+ "name": "param_207",
2551
+ "shape": [
2552
+ 16384,
2553
+ 96
2554
+ ],
2555
+ "dtype": "float16",
2556
+ "format": "raw",
2557
+ "nbytes": 3145728,
2558
+ "byteOffset": 0
2559
+ },
2560
+ {
2561
+ "name": "param_208",
2562
+ "shape": [
2563
+ 3072,
2564
+ 1024
2565
+ ],
2566
+ "dtype": "uint32",
2567
+ "format": "raw",
2568
+ "nbytes": 12582912,
2569
+ "byteOffset": 3145728
2570
+ },
2571
+ {
2572
+ "name": "param_209",
2573
+ "shape": [
2574
+ 3072,
2575
+ 256
2576
+ ],
2577
+ "dtype": "float16",
2578
+ "format": "raw",
2579
+ "nbytes": 1572864,
2580
+ "byteOffset": 15728640
2581
+ },
2582
+ {
2583
+ "name": "param_210",
2584
+ "shape": [
2585
+ 3072
2586
+ ],
2587
+ "dtype": "float16",
2588
+ "format": "raw",
2589
+ "nbytes": 6144,
2590
+ "byteOffset": 17301504
2591
+ },
2592
+ {
2593
+ "name": "param_211",
2594
+ "shape": [
2595
+ 3072
2596
+ ],
2597
+ "dtype": "float16",
2598
+ "format": "raw",
2599
+ "nbytes": 6144,
2600
+ "byteOffset": 17307648
2601
+ },
2602
+ {
2603
+ "name": "param_212",
2604
+ "shape": [
2605
+ 9216,
2606
+ 384
2607
+ ],
2608
+ "dtype": "uint32",
2609
+ "format": "raw",
2610
+ "nbytes": 14155776,
2611
+ "byteOffset": 17313792
2612
+ },
2613
+ {
2614
+ "name": "param_213",
2615
+ "shape": [
2616
+ 9216,
2617
+ 96
2618
+ ],
2619
+ "dtype": "float16",
2620
+ "format": "raw",
2621
+ "nbytes": 1769472,
2622
+ "byteOffset": 31469568
2623
+ }
2624
+ ]
2625
+ },
2626
+ {
2627
+ "dataPath": "params_shard_44.bin",
2628
+ "format": "raw-shard",
2629
+ "nbytes": 30474240,
2630
+ "records": [
2631
+ {
2632
+ "name": "param_214",
2633
+ "shape": [
2634
+ 3072,
2635
+ 384
2636
+ ],
2637
+ "dtype": "uint32",
2638
+ "format": "raw",
2639
+ "nbytes": 4718592,
2640
+ "byteOffset": 0
2641
+ },
2642
+ {
2643
+ "name": "param_215",
2644
+ "shape": [
2645
+ 3072,
2646
+ 96
2647
+ ],
2648
+ "dtype": "float16",
2649
+ "format": "raw",
2650
+ "nbytes": 589824,
2651
+ "byteOffset": 4718592
2652
+ },
2653
+ {
2654
+ "name": "param_216",
2655
+ "shape": [
2656
+ 16384,
2657
+ 384
2658
+ ],
2659
+ "dtype": "uint32",
2660
+ "format": "raw",
2661
+ "nbytes": 25165824,
2662
+ "byteOffset": 5308416
2663
+ }
2664
+ ]
2665
+ },
2666
+ {
2667
+ "dataPath": "params_shard_45.bin",
2668
+ "format": "raw-shard",
2669
+ "nbytes": 33239040,
2670
+ "records": [
2671
+ {
2672
+ "name": "param_217",
2673
+ "shape": [
2674
+ 16384,
2675
+ 96
2676
+ ],
2677
+ "dtype": "float16",
2678
+ "format": "raw",
2679
+ "nbytes": 3145728,
2680
+ "byteOffset": 0
2681
+ },
2682
+ {
2683
+ "name": "param_218",
2684
+ "shape": [
2685
+ 3072,
2686
+ 1024
2687
+ ],
2688
+ "dtype": "uint32",
2689
+ "format": "raw",
2690
+ "nbytes": 12582912,
2691
+ "byteOffset": 3145728
2692
+ },
2693
+ {
2694
+ "name": "param_219",
2695
+ "shape": [
2696
+ 3072,
2697
+ 256
2698
+ ],
2699
+ "dtype": "float16",
2700
+ "format": "raw",
2701
+ "nbytes": 1572864,
2702
+ "byteOffset": 15728640
2703
+ },
2704
+ {
2705
+ "name": "param_220",
2706
+ "shape": [
2707
+ 3072
2708
+ ],
2709
+ "dtype": "float16",
2710
+ "format": "raw",
2711
+ "nbytes": 6144,
2712
+ "byteOffset": 17301504
2713
+ },
2714
+ {
2715
+ "name": "param_221",
2716
+ "shape": [
2717
+ 3072
2718
+ ],
2719
+ "dtype": "float16",
2720
+ "format": "raw",
2721
+ "nbytes": 6144,
2722
+ "byteOffset": 17307648
2723
+ },
2724
+ {
2725
+ "name": "param_222",
2726
+ "shape": [
2727
+ 9216,
2728
+ 384
2729
+ ],
2730
+ "dtype": "uint32",
2731
+ "format": "raw",
2732
+ "nbytes": 14155776,
2733
+ "byteOffset": 17313792
2734
+ },
2735
+ {
2736
+ "name": "param_223",
2737
+ "shape": [
2738
+ 9216,
2739
+ 96
2740
+ ],
2741
+ "dtype": "float16",
2742
+ "format": "raw",
2743
+ "nbytes": 1769472,
2744
+ "byteOffset": 31469568
2745
+ }
2746
+ ]
2747
+ },
2748
+ {
2749
+ "dataPath": "params_shard_46.bin",
2750
+ "format": "raw-shard",
2751
+ "nbytes": 30474240,
2752
+ "records": [
2753
+ {
2754
+ "name": "param_224",
2755
+ "shape": [
2756
+ 3072,
2757
+ 384
2758
+ ],
2759
+ "dtype": "uint32",
2760
+ "format": "raw",
2761
+ "nbytes": 4718592,
2762
+ "byteOffset": 0
2763
+ },
2764
+ {
2765
+ "name": "param_225",
2766
+ "shape": [
2767
+ 3072,
2768
+ 96
2769
+ ],
2770
+ "dtype": "float16",
2771
+ "format": "raw",
2772
+ "nbytes": 589824,
2773
+ "byteOffset": 4718592
2774
+ },
2775
+ {
2776
+ "name": "param_226",
2777
+ "shape": [
2778
+ 16384,
2779
+ 384
2780
+ ],
2781
+ "dtype": "uint32",
2782
+ "format": "raw",
2783
+ "nbytes": 25165824,
2784
+ "byteOffset": 5308416
2785
+ }
2786
+ ]
2787
+ },
2788
+ {
2789
+ "dataPath": "params_shard_47.bin",
2790
+ "format": "raw-shard",
2791
+ "nbytes": 33239040,
2792
+ "records": [
2793
+ {
2794
+ "name": "param_227",
2795
+ "shape": [
2796
+ 16384,
2797
+ 96
2798
+ ],
2799
+ "dtype": "float16",
2800
+ "format": "raw",
2801
+ "nbytes": 3145728,
2802
+ "byteOffset": 0
2803
+ },
2804
+ {
2805
+ "name": "param_228",
2806
+ "shape": [
2807
+ 3072,
2808
+ 1024
2809
+ ],
2810
+ "dtype": "uint32",
2811
+ "format": "raw",
2812
+ "nbytes": 12582912,
2813
+ "byteOffset": 3145728
2814
+ },
2815
+ {
2816
+ "name": "param_229",
2817
+ "shape": [
2818
+ 3072,
2819
+ 256
2820
+ ],
2821
+ "dtype": "float16",
2822
+ "format": "raw",
2823
+ "nbytes": 1572864,
2824
+ "byteOffset": 15728640
2825
+ },
2826
+ {
2827
+ "name": "param_230",
2828
+ "shape": [
2829
+ 3072
2830
+ ],
2831
+ "dtype": "float16",
2832
+ "format": "raw",
2833
+ "nbytes": 6144,
2834
+ "byteOffset": 17301504
2835
+ },
2836
+ {
2837
+ "name": "param_231",
2838
+ "shape": [
2839
+ 3072
2840
+ ],
2841
+ "dtype": "float16",
2842
+ "format": "raw",
2843
+ "nbytes": 6144,
2844
+ "byteOffset": 17307648
2845
+ },
2846
+ {
2847
+ "name": "param_232",
2848
+ "shape": [
2849
+ 9216,
2850
+ 384
2851
+ ],
2852
+ "dtype": "uint32",
2853
+ "format": "raw",
2854
+ "nbytes": 14155776,
2855
+ "byteOffset": 17313792
2856
+ },
2857
+ {
2858
+ "name": "param_233",
2859
+ "shape": [
2860
+ 9216,
2861
+ 96
2862
+ ],
2863
+ "dtype": "float16",
2864
+ "format": "raw",
2865
+ "nbytes": 1769472,
2866
+ "byteOffset": 31469568
2867
+ }
2868
+ ]
2869
+ },
2870
+ {
2871
+ "dataPath": "params_shard_48.bin",
2872
+ "format": "raw-shard",
2873
+ "nbytes": 30474240,
2874
+ "records": [
2875
+ {
2876
+ "name": "param_234",
2877
+ "shape": [
2878
+ 3072,
2879
+ 384
2880
+ ],
2881
+ "dtype": "uint32",
2882
+ "format": "raw",
2883
+ "nbytes": 4718592,
2884
+ "byteOffset": 0
2885
+ },
2886
+ {
2887
+ "name": "param_235",
2888
+ "shape": [
2889
+ 3072,
2890
+ 96
2891
+ ],
2892
+ "dtype": "float16",
2893
+ "format": "raw",
2894
+ "nbytes": 589824,
2895
+ "byteOffset": 4718592
2896
+ },
2897
+ {
2898
+ "name": "param_236",
2899
+ "shape": [
2900
+ 16384,
2901
+ 384
2902
+ ],
2903
+ "dtype": "uint32",
2904
+ "format": "raw",
2905
+ "nbytes": 25165824,
2906
+ "byteOffset": 5308416
2907
+ }
2908
+ ]
2909
+ },
2910
+ {
2911
+ "dataPath": "params_shard_49.bin",
2912
+ "format": "raw-shard",
2913
+ "nbytes": 75595776,
2914
+ "records": [
2915
+ {
2916
+ "name": "param_243",
2917
+ "shape": [
2918
+ 49216,
2919
+ 384
2920
+ ],
2921
+ "dtype": "uint32",
2922
+ "format": "raw",
2923
+ "nbytes": 75595776,
2924
+ "byteOffset": 0
2925
+ }
2926
+ ]
2927
+ },
2928
+ {
2929
+ "dataPath": "params_shard_50.bin",
2930
+ "format": "raw-shard",
2931
+ "nbytes": 27817984,
2932
+ "records": [
2933
+ {
2934
+ "name": "param_237",
2935
+ "shape": [
2936
+ 16384,
2937
+ 96
2938
+ ],
2939
+ "dtype": "float16",
2940
+ "format": "raw",
2941
+ "nbytes": 3145728,
2942
+ "byteOffset": 0
2943
+ },
2944
+ {
2945
+ "name": "param_238",
2946
+ "shape": [
2947
+ 3072,
2948
+ 1024
2949
+ ],
2950
+ "dtype": "uint32",
2951
+ "format": "raw",
2952
+ "nbytes": 12582912,
2953
+ "byteOffset": 3145728
2954
+ },
2955
+ {
2956
+ "name": "param_239",
2957
+ "shape": [
2958
+ 3072,
2959
+ 256
2960
+ ],
2961
+ "dtype": "float16",
2962
+ "format": "raw",
2963
+ "nbytes": 1572864,
2964
+ "byteOffset": 15728640
2965
+ },
2966
+ {
2967
+ "name": "param_240",
2968
+ "shape": [
2969
+ 3072
2970
+ ],
2971
+ "dtype": "float16",
2972
+ "format": "raw",
2973
+ "nbytes": 6144,
2974
+ "byteOffset": 17301504
2975
+ },
2976
+ {
2977
+ "name": "param_241",
2978
+ "shape": [
2979
+ 3072
2980
+ ],
2981
+ "dtype": "float16",
2982
+ "format": "raw",
2983
+ "nbytes": 6144,
2984
+ "byteOffset": 17307648
2985
+ },
2986
+ {
2987
+ "name": "param_242",
2988
+ "shape": [
2989
+ 3072
2990
+ ],
2991
+ "dtype": "float16",
2992
+ "format": "raw",
2993
+ "nbytes": 6144,
2994
+ "byteOffset": 17313792
2995
+ },
2996
+ {
2997
+ "name": "param_244",
2998
+ "shape": [
2999
+ 49216,
3000
+ 96
3001
+ ],
3002
+ "dtype": "float16",
3003
+ "format": "raw",
3004
+ "nbytes": 9449472,
3005
+ "byteOffset": 17319936
3006
+ },
3007
+ {
3008
+ "name": "param_245",
3009
+ "shape": [
3010
+ 2048,
3011
+ 128
3012
+ ],
3013
+ "dtype": "float16",
3014
+ "format": "raw",
3015
+ "nbytes": 524288,
3016
+ "byteOffset": 26769408
3017
+ },
3018
+ {
3019
+ "name": "param_246",
3020
+ "shape": [
3021
+ 2048,
3022
+ 128
3023
+ ],
3024
+ "dtype": "float16",
3025
+ "format": "raw",
3026
+ "nbytes": 524288,
3027
+ "byteOffset": 27293696
3028
+ }
3029
+ ]
3030
+ }
3031
+ ]
3032
+ }
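The records in this manifest can be checked by hand: each `nbytes` appears to equal the product of `shape` times the element size of `dtype`, and each `byteOffset` appears to start where the previous record ends. The sketch below runs that check on the `params_shard_48.bin` entry copied from the manifest above. The `check_shard` helper and the `DTYPE_BYTES` table are hypothetical names introduced for illustration, and the "tightly packed, no padding" layout is an assumption inferred from the numbers shown here, not a documented guarantee of the format.

```python
from math import prod

# Bytes per element for the two dtypes that appear in the records above.
DTYPE_BYTES = {"uint32": 4, "float16": 2}


def check_shard(shard):
    """Return a list of inconsistencies found in one shard entry.

    Assumes records are packed back to back with no padding (an
    assumption inferred from the manifest, not a documented rule).
    """
    problems = []
    running_offset = 0
    for rec in shard["records"]:
        expected_nbytes = prod(rec["shape"]) * DTYPE_BYTES[rec["dtype"]]
        if rec["nbytes"] != expected_nbytes:
            problems.append(
                f"{rec['name']}: nbytes {rec['nbytes']} != {expected_nbytes}"
            )
        if rec["byteOffset"] != running_offset:
            problems.append(
                f"{rec['name']}: byteOffset {rec['byteOffset']} != {running_offset}"
            )
        running_offset = rec["byteOffset"] + rec["nbytes"]
    if shard["nbytes"] != running_offset:
        problems.append(
            f"{shard['dataPath']}: nbytes {shard['nbytes']} != {running_offset}"
        )
    return problems


# The params_shard_48.bin entry, copied from the manifest above.
shard_48 = {
    "dataPath": "params_shard_48.bin",
    "format": "raw-shard",
    "nbytes": 30474240,
    "records": [
        {"name": "param_234", "shape": [3072, 384], "dtype": "uint32",
         "format": "raw", "nbytes": 4718592, "byteOffset": 0},
        {"name": "param_235", "shape": [3072, 96], "dtype": "float16",
         "format": "raw", "nbytes": 589824, "byteOffset": 4718592},
        {"name": "param_236", "shape": [16384, 384], "dtype": "uint32",
         "format": "raw", "nbytes": 25165824, "byteOffset": 5308416},
    ],
}

print(check_shard(shard_48))  # an empty list means the entry is consistent
```

The same function could be looped over every shard entry after loading the full JSON file with `json.load`; a non-empty result would point at a truncated or hand-edited manifest.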
params_shard_0.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:0701c128cec44f0207a585a805da1c5cf623f8003b70c5b21a9e320319fdd120
3
+ size 75595776
params_shard_1.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:81f252294770ca14ec996ae0aad37c1964e4f56d0472a625f0a6c1d0a815a407
3
+ size 25165824
params_shard_10.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:3904df4aa374132c3e9b10f9adfb06d1dbdeb7b936a40bc80156c5278db51b5a
3
+ size 30474240
params_shard_11.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d150bd3a77cf4e69611744175d7e9da4316dbf98e8fdd7bf897fed0fe709c5ca
3
+ size 33239040
params_shard_12.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5c54a45b42c749422ae0231be47b95c5645a78857c2a2ffb3a9fe6b0449d7bce
3
+ size 30474240
params_shard_13.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d9ca253a3c02fd21238fedb4eec490532b8ad174c34b29db2586dc32b14f61dd
3
+ size 33239040
params_shard_14.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:20435e311f6d7a878d9e1f18ee66e74fb2aaf5a6d16fa522c38decf086ed9630
3
+ size 30474240
params_shard_15.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:4825361a81f312aea7c559ab5c638881bc144e062f4e2752bf24d023ccd9775a
3
+ size 33239040
params_shard_16.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:7debd72cff0b8187a352cffe07bb4f3f595c6b848c6bf95eba64664ad6378d71
3
+ size 30474240
params_shard_17.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:84d5728dc172f250f4309ba7c025a5fcb7f29ab960316855a7d8f4e1e89a67b3
3
+ size 33239040
params_shard_18.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a7fa76e97074378f089d95c38af8b8ae11e2371e5f3f2a7f9d4964b3805584ee
3
+ size 30474240
params_shard_19.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:cde3909c0cfd53bd59753c2fb93f2069991319b7699ff0f4657e38b9f87696df
3
+ size 33239040
params_shard_2.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f7e20ee52a2a5bbe71297799249f74058c97e9177e59fa5c97fba1227f859b39
3
+ size 30683136
params_shard_20.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:913df00abee1ac6dec3f1e28b0a7c2616d610e526e201dd86a01e693683f610e
3
+ size 30474240
params_shard_21.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e62535914901510015dd62c08da5db44f001f1e31a026f48a50968a5ed1ddbe0
3
+ size 33239040
params_shard_22.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:bfb161edf69bbb25a9c96acda8e62a6db97556414c2a8a0e7ce789efbff04c93
3
+ size 30474240
params_shard_23.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:cdf8b8690753322780e377dc8e73d6eb73f3f1e43e2c35bf63e1fe89759a1514
3
+ size 33239040
params_shard_24.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e3a5d545ffd3a5e877ce61730dcbbcf0e28484e7ab696c3dbed6a90956571852
3
+ size 30474240
params_shard_25.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ca1949c3382e4f4ff55ad2c4cd6dc0afe31b8b28e710306fc432d7d6d77f63d5
3
+ size 33239040
params_shard_26.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:63818e9d3290bd9fb27f84b042aff90d4109546d081cc8a0f3b00b8102db46a7
3
+ size 30474240
params_shard_27.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:76cfd6212155bb3835775cf4c3b9c8132b02bb7ef37e227a05aeea61b6a1a6f3
3
+ size 33239040
params_shard_28.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:aeb3401b9d034133ec6090888fd26089482a05243320703d64dd9519e1cb7df2
3
+ size 30474240
params_shard_29.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:17ce177ce0a28dde35384e413bc0368ba48901e826ecb330f65f9cd96af2783a
3
+ size 33239040
params_shard_3.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:7ba5a0e873bab96c52256c0008d4fe3dcc35bdb680d85a103ca195a60e93a018
3
+ size 33239040
params_shard_30.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:b5b41b14f6bbd8a6347a1259af27084da9f705fcde6cac4254668d57ec35e744
3
+ size 30474240
params_shard_31.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:2fe60b40d54fdfb6e43bbec5acb76874f100f7a79fd43aa7171f8d5f7673f76b
3
+ size 33239040
params_shard_32.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ac1eed8f73e3105b38cefcf50c5729932dc6693a6587451c2561c5212717b12a
3
+ size 30474240
params_shard_33.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:4974c696564f01734d1b7db8fa959ee8ee25e2c23fa10181461de449214db451
3
+ size 33239040
params_shard_34.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:7827821cacb6124302433768ffcdb967d43d53e11201fde74e3fdeb46e2c5c32
3
+ size 30474240
params_shard_35.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f446b332a3db650ae1f0319c34f40d22b17df061815de8f8a63dcf3e84f24770
3
+ size 33239040
params_shard_36.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:94a4b5993dd24d9731d60470ca6860d149de21cf544337ac8c8c80b290675f6a
3
+ size 30474240
params_shard_37.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:45f4fd83e14e314e1966a9f518832fcf6c1022bb57026a32a4effacbbee3e6aa
3
+ size 33239040
params_shard_38.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:128f84718b1a03062178e963783efdbf2bfcbb1ac9d0ec1cb297f4d8f3956b71
3
+ size 30474240
params_shard_39.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:fd9d0a4bdc8abe33b9ec54c0fc06864eb22261638240dfa8651b44cc20921d19
3
+ size 33239040
params_shard_4.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:72488f871ce244e4082f7c6d65c57653eb9326f6fe4059ad25ee0d27ff969366
3
+ size 30474240
params_shard_40.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c1df3962727f92ef621a0480b433f13bfce8e91946b4febb8c6bc71e3ac05e65
3
+ size 30474240
params_shard_41.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d3646bf0ca3017a54cfe11a2f2df8e676abb27f99ea82789b878937bc4f8b71c
3
+ size 33239040
params_shard_42.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:202f8fb79464b5aa82f9f71ad8003bc1a38c4138dc061faa681d1651e2223f04
3
+ size 30474240
params_shard_43.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:b24de508f85710630ba6025d6c17dceab52c2a52dfa31f786f01f91133304d18
3
+ size 33239040
params_shard_44.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:44188936a4635804fc5011d804584e23fe4a65d8697dba5a324d4cce03b11aa2
3
+ size 30474240
params_shard_45.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:02e8c89aca450a2aa50f7704129f0c661634a92c25b492110b987edbe3a1f82b
3
+ size 33239040
params_shard_46.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:37f0a9fa16b244b1b7db58c75eda5f1acd457c224bb6ff8a0d0ee51b1eedb7a5
3
+ size 30474240
params_shard_47.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:481c94e132991dded5b20dec8d4f51a396872527d72d2d1e88c31897efa9962b
3
+ size 33239040
params_shard_48.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:1c930c0dffc1d1a3afc7d330ef511be663cef7d3edc9cc0e37e489620d9a69fe
3
+ size 30474240
params_shard_49.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:64fe947deae9a24485fe51658e2b89333bae3477eddf35f6e1dd21ddac2f095f
3
+ size 75595776
params_shard_5.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:1c85c07cd4ddfcb64f0bd29391998f1391dad87ee2eeb1ceab624b1dead9e87d
3
+ size 33239040
params_shard_50.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:1cfc582187270d56d8623f242631b14546df6f81f10d2f262723504203e27cca
3
+ size 27817984
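Each `params_shard_N.bin` entry above is a Git LFS pointer, not the weights themselves: three lines giving the spec version, a SHA-256 object ID, and the size in bytes. A minimal sketch of how a downloaded shard might be checked against its pointer follows. The helper names `parse_lfs_pointer`, `chunks_match`, and `read_chunks` are hypothetical, and the local file path in the usage note is an assumption.

```python
import hashlib
from pathlib import Path

# Pointer for params_shard_50.bin, copied from the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:1cfc582187270d56d8623f242631b14546df6f81f10d2f262723504203e27cca
size 27817984
"""


def parse_lfs_pointer(text):
    """Split a pointer file into its version, hash algorithm, digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def chunks_match(chunks, pointer):
    """Hash an iterable of byte chunks and compare size and digest to the pointer."""
    hasher = hashlib.new(pointer["algo"])
    total = 0
    for chunk in chunks:
        hasher.update(chunk)
        total += len(chunk)
    return total == pointer["size"] and hasher.hexdigest() == pointer["digest"]


def read_chunks(path, chunk_size=1 << 20):
    """Yield a file's contents in 1 MiB chunks so large shards are not held in memory."""
    with Path(path).open("rb") as fh:
        while chunk := fh.read(chunk_size):
            yield chunk


pointer = parse_lfs_pointer(POINTER_TEXT)
# The pointer size agrees with the nbytes listed for params_shard_50.bin in the manifest.
print(pointer["size"] == 27817984)
```

With the real shard downloaded next to the script, `chunks_match(read_chunks("params_shard_50.bin"), pointer)` would be expected to return `True`; a `False` result suggests an incomplete download or a file that is still an unresolved LFS pointer.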