tachyphylaxis committed on
Commit
5e4e733
1 Parent(s): b086baf

Upload folder using huggingface_hub

Files changed (43)
  1. README.md +113 -0
  2. config.json +24 -0
  3. model-00001-of-00036.safetensors +3 -0
  4. model-00002-of-00036.safetensors +3 -0
  5. model-00003-of-00036.safetensors +3 -0
  6. model-00004-of-00036.safetensors +3 -0
  7. model-00005-of-00036.safetensors +3 -0
  8. model-00006-of-00036.safetensors +3 -0
  9. model-00007-of-00036.safetensors +3 -0
  10. model-00008-of-00036.safetensors +3 -0
  11. model-00009-of-00036.safetensors +3 -0
  12. model-00010-of-00036.safetensors +3 -0
  13. model-00011-of-00036.safetensors +3 -0
  14. model-00012-of-00036.safetensors +3 -0
  15. model-00013-of-00036.safetensors +3 -0
  16. model-00014-of-00036.safetensors +3 -0
  17. model-00015-of-00036.safetensors +3 -0
  18. model-00016-of-00036.safetensors +3 -0
  19. model-00017-of-00036.safetensors +3 -0
  20. model-00018-of-00036.safetensors +3 -0
  21. model-00019-of-00036.safetensors +3 -0
  22. model-00020-of-00036.safetensors +3 -0
  23. model-00021-of-00036.safetensors +3 -0
  24. model-00022-of-00036.safetensors +3 -0
  25. model-00023-of-00036.safetensors +3 -0
  26. model-00024-of-00036.safetensors +3 -0
  27. model-00025-of-00036.safetensors +3 -0
  28. model-00026-of-00036.safetensors +3 -0
  29. model-00027-of-00036.safetensors +3 -0
  30. model-00028-of-00036.safetensors +3 -0
  31. model-00029-of-00036.safetensors +3 -0
  32. model-00030-of-00036.safetensors +3 -0
  33. model-00031-of-00036.safetensors +3 -0
  34. model-00032-of-00036.safetensors +3 -0
  35. model-00033-of-00036.safetensors +3 -0
  36. model-00034-of-00036.safetensors +3 -0
  37. model-00035-of-00036.safetensors +3 -0
  38. model-00036-of-00036.safetensors +3 -0
  39. model.safetensors.index.json +1 -0
  40. special_tokens_map.json +30 -0
  41. tokenizer.json +0 -0
  42. tokenizer.model +3 -0
  43. tokenizer_config.json +43 -0
README.md ADDED
@@ -0,0 +1,113 @@
+ ---
+ license: mit
+ datasets:
+ - lemonilia/LimaRP
+ - PygmalionAI/PIPPA
+ language:
+ - en
+ pipeline_tag: text-generation
+ tags:
+ - roleplay
+ - not-for-all-audiences
+ ---
+
+ This is TriadParty/deepsex-34b with tensors renamed to conform to the standard Llama model and the standard Llama tokenizer, thanks to chargoddard/Yi-34B-Llama.
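+
+ Because the tensors and tokenizer follow the stock Llama layout, the model should load with the standard `transformers` classes. A minimal sketch (the repo id below is a placeholder for wherever this upload lives):
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ repo = "tachyphylaxis/deepsex-34b"  # placeholder repo id
+ tokenizer = AutoTokenizer.from_pretrained(repo)
+ # Standard Llama tensor names, so no trust_remote_code is needed.
+ model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype="auto", device_map="auto")
+ ```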
15
+
16
+ Original model card:
17
+
18
+ **Deepsex-34b**
19
+
20
+ tks [TheBloke](https://huggingface.co/TheBloke) making quantized version!
21
+ gguf:https://huggingface.co/TheBloke/deepsex-34b-GGUF
22
+ exl2:https://huggingface.co/waldie/deepsex-34b-4bpw-h6-exl2
23
+ awq:https://huggingface.co/TheBloke/deepsex-34b-AWQ
24
+ 6b base version:https://huggingface.co/TriadParty/deepsex-6b-base
25
+ 6b chat version:https://huggingface.co/TriadParty/deepsex-6b-chat
26
+
27
+ In fact, I plan to make a model of the "Seven Deadly Sins" series. Of course, the pre-training data used in these models are all human-produced data. I think the big model is like a mirror, reflecting the human itself. Examine yourself may become a crucial step in realizing agi.
28
+ So, It is 'lust'.
29
+ The 6b corresponding to the model is being produced, and the corresponding llama version is also being produced. The classification data of the other six deadly sins is being collected. Welcome to provide inspiration!
+
+ Here are the steps used to make this model:
+ 1. I first collected roughly 4GB of assorted light novels and used BERT to run two rounds of similarity deduplication over novels with similar plots in the dataset. A portion of NSFW novels was also mixed in to improve the model's NSFW capability.
+ 2. Using Yi-34B-base as the base model, I fine-tuned for 3 epochs with QLoRA at r=64, alpha=128 as continued pre-training.
+ 3. I prepared the LimaRP + PIPPA datasets, cleaned them into alpaca format, and used [goliath-120b](https://huggingface.co/alpindale/goliath-120b), which is good at role-play, to score each question-answer pair, keeping roughly 30k high-quality examples.
+ 4. I ran SFT on the base model from step 2 using the data from step 3 for 6 epochs, fine-tuning at r=16, alpha=32 (a sketch of both LoRA setups follows below).
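+
+ For illustration only, a minimal sketch of the two LoRA configurations described above using the `peft` library. Only r, alpha, and epoch counts come from the card; the target modules and dropout below are assumptions.
+
+ ```python
+ from peft import LoraConfig
+
+ # Step 2: QLoRA continued pre-training, r=64, alpha=128 (3 epochs).
+ # target_modules and dropout are assumed; the card does not state them.
+ pretrain_lora = LoraConfig(
+     r=64,
+     lora_alpha=128,
+     lora_dropout=0.05,
+     target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
+     task_type="CAUSAL_LM",
+ )
+
+ # Step 4: SFT on the continued-pretraining checkpoint, r=16, alpha=32 (6 epochs).
+ sft_lora = LoraConfig(
+     r=16,
+     lora_alpha=32,
+     lora_dropout=0.05,
+     target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
+     task_type="CAUSAL_LM",
+ )
+ ```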
+
+ *Format*
+
+ alpaca
+
+ ```
+ [
+   {
+     "instruction": "user instruction (required)",
+     "input": "user input (optional)",
+     "output": "model response (required)",
+     "history": [
+       ["user instruction in the first round (optional)", "model response in the first round (optional)"],
+       ["user instruction in the second round (optional)", "model response in the second round (optional)"]
+     ]
+   }
+ ]
+ ```
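+
+ The exact prompt template used during training is not stated in the card; as a hypothetical illustration, records in this format are typically flattened into an alpaca-style prompt like so:
+
+ ```python
+ # Hypothetical flattening of one record in the format above; the
+ # actual template used for training is not documented here.
+ def build_prompt(sample: dict) -> str:
+     turns = []
+     for user_msg, model_msg in sample.get("history", []):
+         turns.append(f"### Instruction:\n{user_msg}\n\n### Response:\n{model_msg}")
+     instruction = sample["instruction"]
+     if sample.get("input"):
+         instruction += "\n" + sample["input"]
+     turns.append(f"### Instruction:\n{instruction}\n\n### Response:\n")
+     return "\n\n".join(turns)
+ ```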
+
+ *Effect*:
+ Proficient in role-play and highly permissive with NSFW content, yet pure-love lines still appear from time to time. For example:
+ ```
+ #3
+ Sweaty old man
+ December 5, 2023 2:03 PM
+ Fxxk, you are such a xxx!
+
+ #4
+ 27.3s
+ Mirai
+ December 5, 2023 2:03 PM
+ "Of course I do! I can't break promises, Sweaty old man. We have been together since we were kids. We are both best friends and lovers to end all iteration." I smiled with affection. It was clear that I meant everything I said. "We both know that you like taking command of us like this. Am I not your squirting toy, Sweaty old man?" I asked with a cute pout. "We should meet up in front of the shop after classes. I'll see you there. See you, Sweaty old man!"
+ ```
+ It feels like it's still worth a try~
+
+ Support me [here](https://ko-fi.com/mikolisa) :)
+
config.json ADDED
@@ -0,0 +1,24 @@
+ {
+ "architectures": [
+ "LlamaForCausalLM"
+ ],
+ "bos_token_id": 1,
+ "eos_token_id": 2,
+ "hidden_act": "silu",
+ "hidden_size": 7168,
+ "initializer_range": 0.02,
+ "intermediate_size": 20480,
+ "max_position_embeddings": 4096,
+ "model_type": "llama",
+ "num_attention_heads": 56,
+ "num_hidden_layers": 60,
+ "num_key_value_heads": 8,
+ "pad_token_id": 0,
+ "rms_norm_eps": 1e-05,
+ "rope_theta": 5000000.0,
+ "tie_word_embeddings": false,
+ "torch_dtype": "bfloat16",
+ "transformers_version": "4.34.0",
+ "use_cache": true,
+ "vocab_size": 64000
+ }
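As a sanity check, these hyperparameters reproduce the checkpoint size recorded in model.safetensors.index.json below (68,777,834,496 bytes, i.e. 2 bytes per bf16 parameter). A small script to verify, assuming the standard Llama parameterization with untied embeddings:

```python
# Llama parameter count from the config.json values above.
hidden, inter, layers, vocab, heads, kv_heads = 7168, 20480, 60, 64000, 56, 8
head_dim = hidden // heads                          # 128
kv_dim = kv_heads * head_dim                        # 1024 (grouped-query attention)
attn = 2 * hidden * hidden + 2 * hidden * kv_dim    # q/o plus k/v projections
mlp = 3 * hidden * inter                            # gate, up, down projections
per_layer = attn + mlp + 2 * hidden                 # plus two RMSNorm weights
total = layers * per_layer + 2 * vocab * hidden + hidden  # embed + lm_head + final norm
print(total * 2)  # 68777834496 bytes in bf16
```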
model-00001-of-00036.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3d5055918493c5d4d1494af90923d9bbf3511bf98eaac2bce1cbf52efdb4bbc6
+ size 1739588416
model-00002-of-00036.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1c4ad38f8ffb1b535f4af2bd5e56cdeba39fd19519f19f63912b5522a416bb1d
+ size 1937827760
model-00003-of-00036.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:491799959f11f995a29728384c3f2f478eee4e721a2774db8055be59f19d6129
+ size 1937827760
model-00004-of-00036.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d9e27b1fcb26cdb4d6ef8cefc5777cada99c114972857fbbd30562572e10e413
+ size 1996547664
model-00005-of-00036.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:39e6c9e947fbc956a5a4aa4f87e4c8fce749155a6b6d6b523fd1ebc5ec2fa775
+ size 1937798856
model-00006-of-00036.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:449b1aadfe6f7ba33b0880f9fea407090a0cd21016ff5ea33cec14268fab26db
+ size 1937827760
model-00007-of-00036.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a3d5b79ecec115ca8a239823d927f63d31d448e142ad97340951527be5a02b3e
+ size 1937827768
model-00008-of-00036.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e75982e6989bc7b5084d4f4c2f3d68d70f3d29e616ab4f29fa10a2296d31fff9
+ size 1996547680
model-00009-of-00036.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:15508b0118519b5604078b1c3e8e03f7cd9301891af2435fec5886739690a186
+ size 1937827776
model-00010-of-00036.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4febaf994abd7dc44faab650732d458693f7e9ed9fd946cc3e211614ddadb031
+ size 1937827776
model-00011-of-00036.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e025e79c7c8f609c822e5cfb56544cff1a5edf265f9f1d70fe50b3b8a26dc49e
+ size 1937827776
model-00012-of-00036.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7f62cc690273d75664149268c9a1214d869e7448e2409e2c8a4d2f2f51233d3d
+ size 1996547680
model-00013-of-00036.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:61231f56ac3b5c69c2c13dc096f9c5ad1eef52da5ebd2dfb77d22d127926145d
+ size 1937798872
model-00014-of-00036.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:68f3137cd9d67cce714bf5faa427327b5595db10012879d17e2554d27f8bd936
+ size 1937827776
model-00015-of-00036.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:939d938613df2c500b3abfa7ae4d0691a3d2864670131cfa8796b49b7358a0be
+ size 1937827776
model-00016-of-00036.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:62afec142b9d2910e56d12b3edd6cbb1afbf4a9a940adc4ecfffe8fdbc7aed2a
+ size 1996547680
model-00017-of-00036.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:db1e6f3f433fe8abd67dcf410d201186e9d436c59a6c4c50e5de8bcb00bd0dd7
+ size 1937798872
model-00018-of-00036.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c251c2c54a2b085a6415a1b3a5893c940046aa352edc8b0250d75e7e919154c4
+ size 1937827776
model-00019-of-00036.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d6154ea3a94816071b6c4b6d4dd205f731954b0017723b03908e836b55ebd592
+ size 1937827776
model-00020-of-00036.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fab507f5bc4df992c3d035176026b9f91a9f94d5f8de5b498445673c4d261727
+ size 1996547680
model-00021-of-00036.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:44a7234c0b720068113f7b4aa04f75724ddfcb51df02e1466cf3c03321fd9ffd
+ size 1937798872
model-00022-of-00036.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f9d36367fa600bca96643d0717b79072cd8ec50e1e7fff3a437156e59709f564
+ size 1937827776
model-00023-of-00036.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ff0a4bf82e4015dda2ccfa24fb9dbf89acd4898a2aac2adf18db66236f87111e
+ size 1937827776
model-00024-of-00036.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1cf83ac0b8899c5fb4de1f1371352bff3b5954b217dfd7024e427b228992f50d
+ size 1996547680
model-00025-of-00036.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b8578d1b3e89f14a63f49e7636b4ded996da42f5f6360780c0c078670dcaf013
+ size 1937798872
model-00026-of-00036.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fe90d33aa35dbcf4697b9b5cf6028dd410440514a0850b7c392a300c030da4fd
+ size 1937827776
model-00027-of-00036.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e6856ceaf7b066a60fa6a27d1dd99735b7d6c75947a479dae706d468ea5fd1dc
+ size 1937827776
model-00028-of-00036.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:06c21e848a183a8853ba89d5ef197bb899e9e197856be07939f10c7b134d4bfa
+ size 1996547680
model-00029-of-00036.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8239edf29039d5b1be0b22b846f7e472a2e83e92469742533d853e84ce27d40b
+ size 1937798872
model-00030-of-00036.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7357657e4ccef5f97b919def72a568f8023c3b5a4808ed444a975fcf9c2c4940
+ size 1937827776
model-00031-of-00036.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a9286b2c224fbb13613e0649bd22bbc4a76ec11441221db7acf5714b0e1a9c17
+ size 1937827776
model-00032-of-00036.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c0faa11c44f81101788f3373128de5aeeed0f0c2ff04f271ac5bd39b0cb8adb4
+ size 1996547680
model-00033-of-00036.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2b078c8e9ce48da04785ea66f55d9ccc7f5847d6ed2dc36aea7a841b123bcab4
+ size 1937798872
model-00034-of-00036.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3aeec6bb405ea0c3778270e6abc83946b6a4f03560ecaee904239f0c4acd0e7a
+ size 1937827776
model-00035-of-00036.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4fbb686f7cd15d372333478313b3efc53d251cc5e04e86b96c510e9c8a4969ac
+ size 1702960712
model-00036-of-00036.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:00605ef7805b07c6b8d5a1071336b6c35bfb5848ea2e55e7d34dbe78a133e59b
+ size 917504128
model.safetensors.index.json ADDED
@@ -0,0 +1 @@
+ {"metadata": {"total_size": 68777834496}, "weight_map": {"lm_head.weight": "model-00036-of-00036.safetensors", "model.embed_tokens.weight": "model-00001-of-00036.safetensors", "model.layers.0.input_layernorm.weight": "model-00002-of-00036.safetensors", "model.layers.0.post_attention_layernorm.weight": "model-00002-of-00036.safetensors", "model.layers.0.mlp.down_proj.weight": "model-00001-of-00036.safetensors", "model.layers.0.mlp.gate_proj.weight": "model-00001-of-00036.safetensors", "model.layers.0.mlp.up_proj.weight": "model-00002-of-00036.safetensors", "model.layers.0.self_attn.k_proj.weight": "model-00001-of-00036.safetensors", "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00036.safetensors", "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00036.safetensors", "model.layers.0.self_attn.v_proj.weight": "model-00001-of-00036.safetensors", "model.layers.1.input_layernorm.weight": "model-00002-of-00036.safetensors", "model.layers.1.post_attention_layernorm.weight": "model-00002-of-00036.safetensors", "model.layers.1.mlp.down_proj.weight": "model-00002-of-00036.safetensors", "model.layers.1.mlp.gate_proj.weight": "model-00002-of-00036.safetensors", "model.layers.1.mlp.up_proj.weight": "model-00002-of-00036.safetensors", "model.layers.1.self_attn.k_proj.weight": "model-00002-of-00036.safetensors", "model.layers.1.self_attn.o_proj.weight": "model-00002-of-00036.safetensors", "model.layers.1.self_attn.q_proj.weight": "model-00002-of-00036.safetensors", "model.layers.1.self_attn.v_proj.weight": "model-00002-of-00036.safetensors", "model.layers.10.input_layernorm.weight": "model-00007-of-00036.safetensors", "model.layers.10.post_attention_layernorm.weight": "model-00007-of-00036.safetensors", "model.layers.10.mlp.down_proj.weight": "model-00007-of-00036.safetensors", "model.layers.10.mlp.gate_proj.weight": "model-00007-of-00036.safetensors", "model.layers.10.mlp.up_proj.weight": "model-00007-of-00036.safetensors", "model.layers.10.self_attn.k_proj.weight": "model-00007-of-00036.safetensors", "model.layers.10.self_attn.o_proj.weight": "model-00007-of-00036.safetensors", "model.layers.10.self_attn.q_proj.weight": "model-00007-of-00036.safetensors", "model.layers.10.self_attn.v_proj.weight": "model-00007-of-00036.safetensors", "model.layers.11.input_layernorm.weight": "model-00008-of-00036.safetensors", "model.layers.11.post_attention_layernorm.weight": "model-00008-of-00036.safetensors", "model.layers.11.mlp.down_proj.weight": "model-00008-of-00036.safetensors", "model.layers.11.mlp.gate_proj.weight": "model-00008-of-00036.safetensors", "model.layers.11.mlp.up_proj.weight": "model-00008-of-00036.safetensors", "model.layers.11.self_attn.k_proj.weight": "model-00007-of-00036.safetensors", "model.layers.11.self_attn.o_proj.weight": "model-00007-of-00036.safetensors", "model.layers.11.self_attn.q_proj.weight": "model-00007-of-00036.safetensors", "model.layers.11.self_attn.v_proj.weight": "model-00007-of-00036.safetensors", "model.layers.12.input_layernorm.weight": "model-00008-of-00036.safetensors", "model.layers.12.post_attention_layernorm.weight": "model-00008-of-00036.safetensors", "model.layers.12.mlp.down_proj.weight": "model-00008-of-00036.safetensors", "model.layers.12.mlp.gate_proj.weight": "model-00008-of-00036.safetensors", "model.layers.12.mlp.up_proj.weight": "model-00008-of-00036.safetensors", "model.layers.12.self_attn.k_proj.weight": "model-00008-of-00036.safetensors", "model.layers.12.self_attn.o_proj.weight": "model-00008-of-00036.safetensors", 
"model.layers.12.self_attn.q_proj.weight": "model-00008-of-00036.safetensors", "model.layers.12.self_attn.v_proj.weight": "model-00008-of-00036.safetensors", "model.layers.13.input_layernorm.weight": "model-00009-of-00036.safetensors", "model.layers.13.post_attention_layernorm.weight": "model-00009-of-00036.safetensors", "model.layers.13.mlp.down_proj.weight": "model-00009-of-00036.safetensors", "model.layers.13.mlp.gate_proj.weight": "model-00009-of-00036.safetensors", "model.layers.13.mlp.up_proj.weight": "model-00009-of-00036.safetensors", "model.layers.13.self_attn.k_proj.weight": "model-00009-of-00036.safetensors", "model.layers.13.self_attn.o_proj.weight": "model-00009-of-00036.safetensors", "model.layers.13.self_attn.q_proj.weight": "model-00009-of-00036.safetensors", "model.layers.13.self_attn.v_proj.weight": "model-00009-of-00036.safetensors", "model.layers.14.input_layernorm.weight": "model-00010-of-00036.safetensors", "model.layers.14.post_attention_layernorm.weight": "model-00010-of-00036.safetensors", "model.layers.14.mlp.down_proj.weight": "model-00009-of-00036.safetensors", "model.layers.14.mlp.gate_proj.weight": "model-00009-of-00036.safetensors", "model.layers.14.mlp.up_proj.weight": "model-00010-of-00036.safetensors", "model.layers.14.self_attn.k_proj.weight": "model-00009-of-00036.safetensors", "model.layers.14.self_attn.o_proj.weight": "model-00009-of-00036.safetensors", "model.layers.14.self_attn.q_proj.weight": "model-00009-of-00036.safetensors", "model.layers.14.self_attn.v_proj.weight": "model-00009-of-00036.safetensors", "model.layers.15.input_layernorm.weight": "model-00010-of-00036.safetensors", "model.layers.15.post_attention_layernorm.weight": "model-00010-of-00036.safetensors", "model.layers.15.mlp.down_proj.weight": "model-00010-of-00036.safetensors", "model.layers.15.mlp.gate_proj.weight": "model-00010-of-00036.safetensors", "model.layers.15.mlp.up_proj.weight": "model-00010-of-00036.safetensors", "model.layers.15.self_attn.k_proj.weight": "model-00010-of-00036.safetensors", "model.layers.15.self_attn.o_proj.weight": "model-00010-of-00036.safetensors", "model.layers.15.self_attn.q_proj.weight": "model-00010-of-00036.safetensors", "model.layers.15.self_attn.v_proj.weight": "model-00010-of-00036.safetensors", "model.layers.16.input_layernorm.weight": "model-00011-of-00036.safetensors", "model.layers.16.post_attention_layernorm.weight": "model-00011-of-00036.safetensors", "model.layers.16.mlp.down_proj.weight": "model-00011-of-00036.safetensors", "model.layers.16.mlp.gate_proj.weight": "model-00010-of-00036.safetensors", "model.layers.16.mlp.up_proj.weight": "model-00011-of-00036.safetensors", "model.layers.16.self_attn.k_proj.weight": "model-00010-of-00036.safetensors", "model.layers.16.self_attn.o_proj.weight": "model-00010-of-00036.safetensors", "model.layers.16.self_attn.q_proj.weight": "model-00010-of-00036.safetensors", "model.layers.16.self_attn.v_proj.weight": "model-00010-of-00036.safetensors", "model.layers.17.input_layernorm.weight": "model-00011-of-00036.safetensors", "model.layers.17.post_attention_layernorm.weight": "model-00011-of-00036.safetensors", "model.layers.17.mlp.down_proj.weight": "model-00011-of-00036.safetensors", "model.layers.17.mlp.gate_proj.weight": "model-00011-of-00036.safetensors", "model.layers.17.mlp.up_proj.weight": "model-00011-of-00036.safetensors", "model.layers.17.self_attn.k_proj.weight": "model-00011-of-00036.safetensors", "model.layers.17.self_attn.o_proj.weight": "model-00011-of-00036.safetensors", 
"model.layers.17.self_attn.q_proj.weight": "model-00011-of-00036.safetensors", "model.layers.17.self_attn.v_proj.weight": "model-00011-of-00036.safetensors", "model.layers.18.input_layernorm.weight": "model-00012-of-00036.safetensors", "model.layers.18.post_attention_layernorm.weight": "model-00012-of-00036.safetensors", "model.layers.18.mlp.down_proj.weight": "model-00012-of-00036.safetensors", "model.layers.18.mlp.gate_proj.weight": "model-00012-of-00036.safetensors", "model.layers.18.mlp.up_proj.weight": "model-00012-of-00036.safetensors", "model.layers.18.self_attn.k_proj.weight": "model-00011-of-00036.safetensors", "model.layers.18.self_attn.o_proj.weight": "model-00011-of-00036.safetensors", "model.layers.18.self_attn.q_proj.weight": "model-00011-of-00036.safetensors", "model.layers.18.self_attn.v_proj.weight": "model-00011-of-00036.safetensors", "model.layers.19.input_layernorm.weight": "model-00012-of-00036.safetensors", "model.layers.19.post_attention_layernorm.weight": "model-00012-of-00036.safetensors", "model.layers.19.mlp.down_proj.weight": "model-00012-of-00036.safetensors", "model.layers.19.mlp.gate_proj.weight": "model-00012-of-00036.safetensors", "model.layers.19.mlp.up_proj.weight": "model-00012-of-00036.safetensors", "model.layers.19.self_attn.k_proj.weight": "model-00012-of-00036.safetensors", "model.layers.19.self_attn.o_proj.weight": "model-00012-of-00036.safetensors", "model.layers.19.self_attn.q_proj.weight": "model-00012-of-00036.safetensors", "model.layers.19.self_attn.v_proj.weight": "model-00012-of-00036.safetensors", "model.layers.2.input_layernorm.weight": "model-00003-of-00036.safetensors", "model.layers.2.post_attention_layernorm.weight": "model-00003-of-00036.safetensors", "model.layers.2.mlp.down_proj.weight": "model-00003-of-00036.safetensors", "model.layers.2.mlp.gate_proj.weight": "model-00002-of-00036.safetensors", "model.layers.2.mlp.up_proj.weight": "model-00003-of-00036.safetensors", "model.layers.2.self_attn.k_proj.weight": "model-00002-of-00036.safetensors", "model.layers.2.self_attn.o_proj.weight": "model-00002-of-00036.safetensors", "model.layers.2.self_attn.q_proj.weight": "model-00002-of-00036.safetensors", "model.layers.2.self_attn.v_proj.weight": "model-00002-of-00036.safetensors", "model.layers.20.input_layernorm.weight": "model-00013-of-00036.safetensors", "model.layers.20.post_attention_layernorm.weight": "model-00013-of-00036.safetensors", "model.layers.20.mlp.down_proj.weight": "model-00013-of-00036.safetensors", "model.layers.20.mlp.gate_proj.weight": "model-00013-of-00036.safetensors", "model.layers.20.mlp.up_proj.weight": "model-00013-of-00036.safetensors", "model.layers.20.self_attn.k_proj.weight": "model-00013-of-00036.safetensors", "model.layers.20.self_attn.o_proj.weight": "model-00013-of-00036.safetensors", "model.layers.20.self_attn.q_proj.weight": "model-00013-of-00036.safetensors", "model.layers.20.self_attn.v_proj.weight": "model-00013-of-00036.safetensors", "model.layers.21.input_layernorm.weight": "model-00014-of-00036.safetensors", "model.layers.21.post_attention_layernorm.weight": "model-00014-of-00036.safetensors", "model.layers.21.mlp.down_proj.weight": "model-00013-of-00036.safetensors", "model.layers.21.mlp.gate_proj.weight": "model-00013-of-00036.safetensors", "model.layers.21.mlp.up_proj.weight": "model-00014-of-00036.safetensors", "model.layers.21.self_attn.k_proj.weight": "model-00013-of-00036.safetensors", "model.layers.21.self_attn.o_proj.weight": "model-00013-of-00036.safetensors", 
"model.layers.21.self_attn.q_proj.weight": "model-00013-of-00036.safetensors", "model.layers.21.self_attn.v_proj.weight": "model-00013-of-00036.safetensors", "model.layers.22.input_layernorm.weight": "model-00014-of-00036.safetensors", "model.layers.22.post_attention_layernorm.weight": "model-00014-of-00036.safetensors", "model.layers.22.mlp.down_proj.weight": "model-00014-of-00036.safetensors", "model.layers.22.mlp.gate_proj.weight": "model-00014-of-00036.safetensors", "model.layers.22.mlp.up_proj.weight": "model-00014-of-00036.safetensors", "model.layers.22.self_attn.k_proj.weight": "model-00014-of-00036.safetensors", "model.layers.22.self_attn.o_proj.weight": "model-00014-of-00036.safetensors", "model.layers.22.self_attn.q_proj.weight": "model-00014-of-00036.safetensors", "model.layers.22.self_attn.v_proj.weight": "model-00014-of-00036.safetensors", "model.layers.23.input_layernorm.weight": "model-00015-of-00036.safetensors", "model.layers.23.post_attention_layernorm.weight": "model-00015-of-00036.safetensors", "model.layers.23.mlp.down_proj.weight": "model-00015-of-00036.safetensors", "model.layers.23.mlp.gate_proj.weight": "model-00014-of-00036.safetensors", "model.layers.23.mlp.up_proj.weight": "model-00015-of-00036.safetensors", "model.layers.23.self_attn.k_proj.weight": "model-00014-of-00036.safetensors", "model.layers.23.self_attn.o_proj.weight": "model-00014-of-00036.safetensors", "model.layers.23.self_attn.q_proj.weight": "model-00014-of-00036.safetensors", "model.layers.23.self_attn.v_proj.weight": "model-00014-of-00036.safetensors", "model.layers.24.input_layernorm.weight": "model-00015-of-00036.safetensors", "model.layers.24.post_attention_layernorm.weight": "model-00015-of-00036.safetensors", "model.layers.24.mlp.down_proj.weight": "model-00015-of-00036.safetensors", "model.layers.24.mlp.gate_proj.weight": "model-00015-of-00036.safetensors", "model.layers.24.mlp.up_proj.weight": "model-00015-of-00036.safetensors", "model.layers.24.self_attn.k_proj.weight": "model-00015-of-00036.safetensors", "model.layers.24.self_attn.o_proj.weight": "model-00015-of-00036.safetensors", "model.layers.24.self_attn.q_proj.weight": "model-00015-of-00036.safetensors", "model.layers.24.self_attn.v_proj.weight": "model-00015-of-00036.safetensors", "model.layers.25.input_layernorm.weight": "model-00016-of-00036.safetensors", "model.layers.25.post_attention_layernorm.weight": "model-00016-of-00036.safetensors", "model.layers.25.mlp.down_proj.weight": "model-00016-of-00036.safetensors", "model.layers.25.mlp.gate_proj.weight": "model-00016-of-00036.safetensors", "model.layers.25.mlp.up_proj.weight": "model-00016-of-00036.safetensors", "model.layers.25.self_attn.k_proj.weight": "model-00015-of-00036.safetensors", "model.layers.25.self_attn.o_proj.weight": "model-00015-of-00036.safetensors", "model.layers.25.self_attn.q_proj.weight": "model-00015-of-00036.safetensors", "model.layers.25.self_attn.v_proj.weight": "model-00015-of-00036.safetensors", "model.layers.26.input_layernorm.weight": "model-00016-of-00036.safetensors", "model.layers.26.post_attention_layernorm.weight": "model-00016-of-00036.safetensors", "model.layers.26.mlp.down_proj.weight": "model-00016-of-00036.safetensors", "model.layers.26.mlp.gate_proj.weight": "model-00016-of-00036.safetensors", "model.layers.26.mlp.up_proj.weight": "model-00016-of-00036.safetensors", "model.layers.26.self_attn.k_proj.weight": "model-00016-of-00036.safetensors", "model.layers.26.self_attn.o_proj.weight": "model-00016-of-00036.safetensors", 
"model.layers.26.self_attn.q_proj.weight": "model-00016-of-00036.safetensors", "model.layers.26.self_attn.v_proj.weight": "model-00016-of-00036.safetensors", "model.layers.27.input_layernorm.weight": "model-00017-of-00036.safetensors", "model.layers.27.post_attention_layernorm.weight": "model-00017-of-00036.safetensors", "model.layers.27.mlp.down_proj.weight": "model-00017-of-00036.safetensors", "model.layers.27.mlp.gate_proj.weight": "model-00017-of-00036.safetensors", "model.layers.27.mlp.up_proj.weight": "model-00017-of-00036.safetensors", "model.layers.27.self_attn.k_proj.weight": "model-00017-of-00036.safetensors", "model.layers.27.self_attn.o_proj.weight": "model-00017-of-00036.safetensors", "model.layers.27.self_attn.q_proj.weight": "model-00017-of-00036.safetensors", "model.layers.27.self_attn.v_proj.weight": "model-00017-of-00036.safetensors", "model.layers.28.input_layernorm.weight": "model-00018-of-00036.safetensors", "model.layers.28.post_attention_layernorm.weight": "model-00018-of-00036.safetensors", "model.layers.28.mlp.down_proj.weight": "model-00017-of-00036.safetensors", "model.layers.28.mlp.gate_proj.weight": "model-00017-of-00036.safetensors", "model.layers.28.mlp.up_proj.weight": "model-00018-of-00036.safetensors", "model.layers.28.self_attn.k_proj.weight": "model-00017-of-00036.safetensors", "model.layers.28.self_attn.o_proj.weight": "model-00017-of-00036.safetensors", "model.layers.28.self_attn.q_proj.weight": "model-00017-of-00036.safetensors", "model.layers.28.self_attn.v_proj.weight": "model-00017-of-00036.safetensors", "model.layers.29.input_layernorm.weight": "model-00018-of-00036.safetensors", "model.layers.29.post_attention_layernorm.weight": "model-00018-of-00036.safetensors", "model.layers.29.mlp.down_proj.weight": "model-00018-of-00036.safetensors", "model.layers.29.mlp.gate_proj.weight": "model-00018-of-00036.safetensors", "model.layers.29.mlp.up_proj.weight": "model-00018-of-00036.safetensors", "model.layers.29.self_attn.k_proj.weight": "model-00018-of-00036.safetensors", "model.layers.29.self_attn.o_proj.weight": "model-00018-of-00036.safetensors", "model.layers.29.self_attn.q_proj.weight": "model-00018-of-00036.safetensors", "model.layers.29.self_attn.v_proj.weight": "model-00018-of-00036.safetensors", "model.layers.3.input_layernorm.weight": "model-00003-of-00036.safetensors", "model.layers.3.post_attention_layernorm.weight": "model-00003-of-00036.safetensors", "model.layers.3.mlp.down_proj.weight": "model-00003-of-00036.safetensors", "model.layers.3.mlp.gate_proj.weight": "model-00003-of-00036.safetensors", "model.layers.3.mlp.up_proj.weight": "model-00003-of-00036.safetensors", "model.layers.3.self_attn.k_proj.weight": "model-00003-of-00036.safetensors", "model.layers.3.self_attn.o_proj.weight": "model-00003-of-00036.safetensors", "model.layers.3.self_attn.q_proj.weight": "model-00003-of-00036.safetensors", "model.layers.3.self_attn.v_proj.weight": "model-00003-of-00036.safetensors", "model.layers.30.input_layernorm.weight": "model-00019-of-00036.safetensors", "model.layers.30.post_attention_layernorm.weight": "model-00019-of-00036.safetensors", "model.layers.30.mlp.down_proj.weight": "model-00019-of-00036.safetensors", "model.layers.30.mlp.gate_proj.weight": "model-00018-of-00036.safetensors", "model.layers.30.mlp.up_proj.weight": "model-00019-of-00036.safetensors", "model.layers.30.self_attn.k_proj.weight": "model-00018-of-00036.safetensors", "model.layers.30.self_attn.o_proj.weight": "model-00018-of-00036.safetensors", 
"model.layers.30.self_attn.q_proj.weight": "model-00018-of-00036.safetensors", "model.layers.30.self_attn.v_proj.weight": "model-00018-of-00036.safetensors", "model.layers.31.input_layernorm.weight": "model-00019-of-00036.safetensors", "model.layers.31.post_attention_layernorm.weight": "model-00019-of-00036.safetensors", "model.layers.31.mlp.down_proj.weight": "model-00019-of-00036.safetensors", "model.layers.31.mlp.gate_proj.weight": "model-00019-of-00036.safetensors", "model.layers.31.mlp.up_proj.weight": "model-00019-of-00036.safetensors", "model.layers.31.self_attn.k_proj.weight": "model-00019-of-00036.safetensors", "model.layers.31.self_attn.o_proj.weight": "model-00019-of-00036.safetensors", "model.layers.31.self_attn.q_proj.weight": "model-00019-of-00036.safetensors", "model.layers.31.self_attn.v_proj.weight": "model-00019-of-00036.safetensors", "model.layers.32.input_layernorm.weight": "model-00020-of-00036.safetensors", "model.layers.32.post_attention_layernorm.weight": "model-00020-of-00036.safetensors", "model.layers.32.mlp.down_proj.weight": "model-00020-of-00036.safetensors", "model.layers.32.mlp.gate_proj.weight": "model-00020-of-00036.safetensors", "model.layers.32.mlp.up_proj.weight": "model-00020-of-00036.safetensors", "model.layers.32.self_attn.k_proj.weight": "model-00019-of-00036.safetensors", "model.layers.32.self_attn.o_proj.weight": "model-00019-of-00036.safetensors", "model.layers.32.self_attn.q_proj.weight": "model-00019-of-00036.safetensors", "model.layers.32.self_attn.v_proj.weight": "model-00019-of-00036.safetensors", "model.layers.33.input_layernorm.weight": "model-00020-of-00036.safetensors", "model.layers.33.post_attention_layernorm.weight": "model-00020-of-00036.safetensors", "model.layers.33.mlp.down_proj.weight": "model-00020-of-00036.safetensors", "model.layers.33.mlp.gate_proj.weight": "model-00020-of-00036.safetensors", "model.layers.33.mlp.up_proj.weight": "model-00020-of-00036.safetensors", "model.layers.33.self_attn.k_proj.weight": "model-00020-of-00036.safetensors", "model.layers.33.self_attn.o_proj.weight": "model-00020-of-00036.safetensors", "model.layers.33.self_attn.q_proj.weight": "model-00020-of-00036.safetensors", "model.layers.33.self_attn.v_proj.weight": "model-00020-of-00036.safetensors", "model.layers.34.input_layernorm.weight": "model-00021-of-00036.safetensors", "model.layers.34.post_attention_layernorm.weight": "model-00021-of-00036.safetensors", "model.layers.34.mlp.down_proj.weight": "model-00021-of-00036.safetensors", "model.layers.34.mlp.gate_proj.weight": "model-00021-of-00036.safetensors", "model.layers.34.mlp.up_proj.weight": "model-00021-of-00036.safetensors", "model.layers.34.self_attn.k_proj.weight": "model-00021-of-00036.safetensors", "model.layers.34.self_attn.o_proj.weight": "model-00021-of-00036.safetensors", "model.layers.34.self_attn.q_proj.weight": "model-00021-of-00036.safetensors", "model.layers.34.self_attn.v_proj.weight": "model-00021-of-00036.safetensors", "model.layers.35.input_layernorm.weight": "model-00022-of-00036.safetensors", "model.layers.35.post_attention_layernorm.weight": "model-00022-of-00036.safetensors", "model.layers.35.mlp.down_proj.weight": "model-00021-of-00036.safetensors", "model.layers.35.mlp.gate_proj.weight": "model-00021-of-00036.safetensors", "model.layers.35.mlp.up_proj.weight": "model-00022-of-00036.safetensors", "model.layers.35.self_attn.k_proj.weight": "model-00021-of-00036.safetensors", "model.layers.35.self_attn.o_proj.weight": "model-00021-of-00036.safetensors", 
"model.layers.35.self_attn.q_proj.weight": "model-00021-of-00036.safetensors", "model.layers.35.self_attn.v_proj.weight": "model-00021-of-00036.safetensors", "model.layers.36.input_layernorm.weight": "model-00022-of-00036.safetensors", "model.layers.36.post_attention_layernorm.weight": "model-00022-of-00036.safetensors", "model.layers.36.mlp.down_proj.weight": "model-00022-of-00036.safetensors", "model.layers.36.mlp.gate_proj.weight": "model-00022-of-00036.safetensors", "model.layers.36.mlp.up_proj.weight": "model-00022-of-00036.safetensors", "model.layers.36.self_attn.k_proj.weight": "model-00022-of-00036.safetensors", "model.layers.36.self_attn.o_proj.weight": "model-00022-of-00036.safetensors", "model.layers.36.self_attn.q_proj.weight": "model-00022-of-00036.safetensors", "model.layers.36.self_attn.v_proj.weight": "model-00022-of-00036.safetensors", "model.layers.37.input_layernorm.weight": "model-00023-of-00036.safetensors", "model.layers.37.post_attention_layernorm.weight": "model-00023-of-00036.safetensors", "model.layers.37.mlp.down_proj.weight": "model-00023-of-00036.safetensors", "model.layers.37.mlp.gate_proj.weight": "model-00022-of-00036.safetensors", "model.layers.37.mlp.up_proj.weight": "model-00023-of-00036.safetensors", "model.layers.37.self_attn.k_proj.weight": "model-00022-of-00036.safetensors", "model.layers.37.self_attn.o_proj.weight": "model-00022-of-00036.safetensors", "model.layers.37.self_attn.q_proj.weight": "model-00022-of-00036.safetensors", "model.layers.37.self_attn.v_proj.weight": "model-00022-of-00036.safetensors", "model.layers.38.input_layernorm.weight": "model-00023-of-00036.safetensors", "model.layers.38.post_attention_layernorm.weight": "model-00023-of-00036.safetensors", "model.layers.38.mlp.down_proj.weight": "model-00023-of-00036.safetensors", "model.layers.38.mlp.gate_proj.weight": "model-00023-of-00036.safetensors", "model.layers.38.mlp.up_proj.weight": "model-00023-of-00036.safetensors", "model.layers.38.self_attn.k_proj.weight": "model-00023-of-00036.safetensors", "model.layers.38.self_attn.o_proj.weight": "model-00023-of-00036.safetensors", "model.layers.38.self_attn.q_proj.weight": "model-00023-of-00036.safetensors", "model.layers.38.self_attn.v_proj.weight": "model-00023-of-00036.safetensors", "model.layers.39.input_layernorm.weight": "model-00024-of-00036.safetensors", "model.layers.39.post_attention_layernorm.weight": "model-00024-of-00036.safetensors", "model.layers.39.mlp.down_proj.weight": "model-00024-of-00036.safetensors", "model.layers.39.mlp.gate_proj.weight": "model-00024-of-00036.safetensors", "model.layers.39.mlp.up_proj.weight": "model-00024-of-00036.safetensors", "model.layers.39.self_attn.k_proj.weight": "model-00023-of-00036.safetensors", "model.layers.39.self_attn.o_proj.weight": "model-00023-of-00036.safetensors", "model.layers.39.self_attn.q_proj.weight": "model-00023-of-00036.safetensors", "model.layers.39.self_attn.v_proj.weight": "model-00023-of-00036.safetensors", "model.layers.4.input_layernorm.weight": "model-00004-of-00036.safetensors", "model.layers.4.post_attention_layernorm.weight": "model-00004-of-00036.safetensors", "model.layers.4.mlp.down_proj.weight": "model-00004-of-00036.safetensors", "model.layers.4.mlp.gate_proj.weight": "model-00004-of-00036.safetensors", "model.layers.4.mlp.up_proj.weight": "model-00004-of-00036.safetensors", "model.layers.4.self_attn.k_proj.weight": "model-00003-of-00036.safetensors", "model.layers.4.self_attn.o_proj.weight": "model-00003-of-00036.safetensors", 
"model.layers.4.self_attn.q_proj.weight": "model-00003-of-00036.safetensors", "model.layers.4.self_attn.v_proj.weight": "model-00003-of-00036.safetensors", "model.layers.40.input_layernorm.weight": "model-00024-of-00036.safetensors", "model.layers.40.post_attention_layernorm.weight": "model-00024-of-00036.safetensors", "model.layers.40.mlp.down_proj.weight": "model-00024-of-00036.safetensors", "model.layers.40.mlp.gate_proj.weight": "model-00024-of-00036.safetensors", "model.layers.40.mlp.up_proj.weight": "model-00024-of-00036.safetensors", "model.layers.40.self_attn.k_proj.weight": "model-00024-of-00036.safetensors", "model.layers.40.self_attn.o_proj.weight": "model-00024-of-00036.safetensors", "model.layers.40.self_attn.q_proj.weight": "model-00024-of-00036.safetensors", "model.layers.40.self_attn.v_proj.weight": "model-00024-of-00036.safetensors", "model.layers.41.input_layernorm.weight": "model-00025-of-00036.safetensors", "model.layers.41.post_attention_layernorm.weight": "model-00025-of-00036.safetensors", "model.layers.41.mlp.down_proj.weight": "model-00025-of-00036.safetensors", "model.layers.41.mlp.gate_proj.weight": "model-00025-of-00036.safetensors", "model.layers.41.mlp.up_proj.weight": "model-00025-of-00036.safetensors", "model.layers.41.self_attn.k_proj.weight": "model-00025-of-00036.safetensors", "model.layers.41.self_attn.o_proj.weight": "model-00025-of-00036.safetensors", "model.layers.41.self_attn.q_proj.weight": "model-00025-of-00036.safetensors", "model.layers.41.self_attn.v_proj.weight": "model-00025-of-00036.safetensors", "model.layers.42.input_layernorm.weight": "model-00026-of-00036.safetensors", "model.layers.42.post_attention_layernorm.weight": "model-00026-of-00036.safetensors", "model.layers.42.mlp.down_proj.weight": "model-00025-of-00036.safetensors", "model.layers.42.mlp.gate_proj.weight": "model-00025-of-00036.safetensors", "model.layers.42.mlp.up_proj.weight": "model-00026-of-00036.safetensors", "model.layers.42.self_attn.k_proj.weight": "model-00025-of-00036.safetensors", "model.layers.42.self_attn.o_proj.weight": "model-00025-of-00036.safetensors", "model.layers.42.self_attn.q_proj.weight": "model-00025-of-00036.safetensors", "model.layers.42.self_attn.v_proj.weight": "model-00025-of-00036.safetensors", "model.layers.43.input_layernorm.weight": "model-00026-of-00036.safetensors", "model.layers.43.post_attention_layernorm.weight": "model-00026-of-00036.safetensors", "model.layers.43.mlp.down_proj.weight": "model-00026-of-00036.safetensors", "model.layers.43.mlp.gate_proj.weight": "model-00026-of-00036.safetensors", "model.layers.43.mlp.up_proj.weight": "model-00026-of-00036.safetensors", "model.layers.43.self_attn.k_proj.weight": "model-00026-of-00036.safetensors", "model.layers.43.self_attn.o_proj.weight": "model-00026-of-00036.safetensors", "model.layers.43.self_attn.q_proj.weight": "model-00026-of-00036.safetensors", "model.layers.43.self_attn.v_proj.weight": "model-00026-of-00036.safetensors", "model.layers.44.input_layernorm.weight": "model-00027-of-00036.safetensors", "model.layers.44.post_attention_layernorm.weight": "model-00027-of-00036.safetensors", "model.layers.44.mlp.down_proj.weight": "model-00027-of-00036.safetensors", "model.layers.44.mlp.gate_proj.weight": "model-00026-of-00036.safetensors", "model.layers.44.mlp.up_proj.weight": "model-00027-of-00036.safetensors", "model.layers.44.self_attn.k_proj.weight": "model-00026-of-00036.safetensors", "model.layers.44.self_attn.o_proj.weight": "model-00026-of-00036.safetensors", 
"model.layers.44.self_attn.q_proj.weight": "model-00026-of-00036.safetensors", "model.layers.44.self_attn.v_proj.weight": "model-00026-of-00036.safetensors", "model.layers.45.input_layernorm.weight": "model-00027-of-00036.safetensors", "model.layers.45.post_attention_layernorm.weight": "model-00027-of-00036.safetensors", "model.layers.45.mlp.down_proj.weight": "model-00027-of-00036.safetensors", "model.layers.45.mlp.gate_proj.weight": "model-00027-of-00036.safetensors", "model.layers.45.mlp.up_proj.weight": "model-00027-of-00036.safetensors", "model.layers.45.self_attn.k_proj.weight": "model-00027-of-00036.safetensors", "model.layers.45.self_attn.o_proj.weight": "model-00027-of-00036.safetensors", "model.layers.45.self_attn.q_proj.weight": "model-00027-of-00036.safetensors", "model.layers.45.self_attn.v_proj.weight": "model-00027-of-00036.safetensors", "model.layers.46.input_layernorm.weight": "model-00028-of-00036.safetensors", "model.layers.46.post_attention_layernorm.weight": "model-00028-of-00036.safetensors", "model.layers.46.mlp.down_proj.weight": "model-00028-of-00036.safetensors", "model.layers.46.mlp.gate_proj.weight": "model-00028-of-00036.safetensors", "model.layers.46.mlp.up_proj.weight": "model-00028-of-00036.safetensors", "model.layers.46.self_attn.k_proj.weight": "model-00027-of-00036.safetensors", "model.layers.46.self_attn.o_proj.weight": "model-00027-of-00036.safetensors", "model.layers.46.self_attn.q_proj.weight": "model-00027-of-00036.safetensors", "model.layers.46.self_attn.v_proj.weight": "model-00027-of-00036.safetensors", "model.layers.47.input_layernorm.weight": "model-00028-of-00036.safetensors", "model.layers.47.post_attention_layernorm.weight": "model-00028-of-00036.safetensors", "model.layers.47.mlp.down_proj.weight": "model-00028-of-00036.safetensors", "model.layers.47.mlp.gate_proj.weight": "model-00028-of-00036.safetensors", "model.layers.47.mlp.up_proj.weight": "model-00028-of-00036.safetensors", "model.layers.47.self_attn.k_proj.weight": "model-00028-of-00036.safetensors", "model.layers.47.self_attn.o_proj.weight": "model-00028-of-00036.safetensors", "model.layers.47.self_attn.q_proj.weight": "model-00028-of-00036.safetensors", "model.layers.47.self_attn.v_proj.weight": "model-00028-of-00036.safetensors", "model.layers.48.input_layernorm.weight": "model-00029-of-00036.safetensors", "model.layers.48.post_attention_layernorm.weight": "model-00029-of-00036.safetensors", "model.layers.48.mlp.down_proj.weight": "model-00029-of-00036.safetensors", "model.layers.48.mlp.gate_proj.weight": "model-00029-of-00036.safetensors", "model.layers.48.mlp.up_proj.weight": "model-00029-of-00036.safetensors", "model.layers.48.self_attn.k_proj.weight": "model-00029-of-00036.safetensors", "model.layers.48.self_attn.o_proj.weight": "model-00029-of-00036.safetensors", "model.layers.48.self_attn.q_proj.weight": "model-00029-of-00036.safetensors", "model.layers.48.self_attn.v_proj.weight": "model-00029-of-00036.safetensors", "model.layers.49.input_layernorm.weight": "model-00030-of-00036.safetensors", "model.layers.49.post_attention_layernorm.weight": "model-00030-of-00036.safetensors", "model.layers.49.mlp.down_proj.weight": "model-00029-of-00036.safetensors", "model.layers.49.mlp.gate_proj.weight": "model-00029-of-00036.safetensors", "model.layers.49.mlp.up_proj.weight": "model-00030-of-00036.safetensors", "model.layers.49.self_attn.k_proj.weight": "model-00029-of-00036.safetensors", "model.layers.49.self_attn.o_proj.weight": "model-00029-of-00036.safetensors", 
"model.layers.49.self_attn.q_proj.weight": "model-00029-of-00036.safetensors", "model.layers.49.self_attn.v_proj.weight": "model-00029-of-00036.safetensors", "model.layers.5.input_layernorm.weight": "model-00004-of-00036.safetensors", "model.layers.5.post_attention_layernorm.weight": "model-00004-of-00036.safetensors", "model.layers.5.mlp.down_proj.weight": "model-00004-of-00036.safetensors", "model.layers.5.mlp.gate_proj.weight": "model-00004-of-00036.safetensors", "model.layers.5.mlp.up_proj.weight": "model-00004-of-00036.safetensors", "model.layers.5.self_attn.k_proj.weight": "model-00004-of-00036.safetensors", "model.layers.5.self_attn.o_proj.weight": "model-00004-of-00036.safetensors", "model.layers.5.self_attn.q_proj.weight": "model-00004-of-00036.safetensors", "model.layers.5.self_attn.v_proj.weight": "model-00004-of-00036.safetensors", "model.layers.50.input_layernorm.weight": "model-00030-of-00036.safetensors", "model.layers.50.post_attention_layernorm.weight": "model-00030-of-00036.safetensors", "model.layers.50.mlp.down_proj.weight": "model-00030-of-00036.safetensors", "model.layers.50.mlp.gate_proj.weight": "model-00030-of-00036.safetensors", "model.layers.50.mlp.up_proj.weight": "model-00030-of-00036.safetensors", "model.layers.50.self_attn.k_proj.weight": "model-00030-of-00036.safetensors", "model.layers.50.self_attn.o_proj.weight": "model-00030-of-00036.safetensors", "model.layers.50.self_attn.q_proj.weight": "model-00030-of-00036.safetensors", "model.layers.50.self_attn.v_proj.weight": "model-00030-of-00036.safetensors", "model.layers.51.input_layernorm.weight": "model-00031-of-00036.safetensors", "model.layers.51.post_attention_layernorm.weight": "model-00031-of-00036.safetensors", "model.layers.51.mlp.down_proj.weight": "model-00031-of-00036.safetensors", "model.layers.51.mlp.gate_proj.weight": "model-00030-of-00036.safetensors", "model.layers.51.mlp.up_proj.weight": "model-00031-of-00036.safetensors", "model.layers.51.self_attn.k_proj.weight": "model-00030-of-00036.safetensors", "model.layers.51.self_attn.o_proj.weight": "model-00030-of-00036.safetensors", "model.layers.51.self_attn.q_proj.weight": "model-00030-of-00036.safetensors", "model.layers.51.self_attn.v_proj.weight": "model-00030-of-00036.safetensors", "model.layers.52.input_layernorm.weight": "model-00031-of-00036.safetensors", "model.layers.52.post_attention_layernorm.weight": "model-00031-of-00036.safetensors", "model.layers.52.mlp.down_proj.weight": "model-00031-of-00036.safetensors", "model.layers.52.mlp.gate_proj.weight": "model-00031-of-00036.safetensors", "model.layers.52.mlp.up_proj.weight": "model-00031-of-00036.safetensors", "model.layers.52.self_attn.k_proj.weight": "model-00031-of-00036.safetensors", "model.layers.52.self_attn.o_proj.weight": "model-00031-of-00036.safetensors", "model.layers.52.self_attn.q_proj.weight": "model-00031-of-00036.safetensors", "model.layers.52.self_attn.v_proj.weight": "model-00031-of-00036.safetensors", "model.layers.53.input_layernorm.weight": "model-00032-of-00036.safetensors", "model.layers.53.post_attention_layernorm.weight": "model-00032-of-00036.safetensors", "model.layers.53.mlp.down_proj.weight": "model-00032-of-00036.safetensors", "model.layers.53.mlp.gate_proj.weight": "model-00032-of-00036.safetensors", "model.layers.53.mlp.up_proj.weight": "model-00032-of-00036.safetensors", "model.layers.53.self_attn.k_proj.weight": "model-00031-of-00036.safetensors", "model.layers.53.self_attn.o_proj.weight": "model-00031-of-00036.safetensors", 
"model.layers.53.self_attn.q_proj.weight": "model-00031-of-00036.safetensors", "model.layers.53.self_attn.v_proj.weight": "model-00031-of-00036.safetensors", "model.layers.54.input_layernorm.weight": "model-00032-of-00036.safetensors", "model.layers.54.post_attention_layernorm.weight": "model-00032-of-00036.safetensors", "model.layers.54.mlp.down_proj.weight": "model-00032-of-00036.safetensors", "model.layers.54.mlp.gate_proj.weight": "model-00032-of-00036.safetensors", "model.layers.54.mlp.up_proj.weight": "model-00032-of-00036.safetensors", "model.layers.54.self_attn.k_proj.weight": "model-00032-of-00036.safetensors", "model.layers.54.self_attn.o_proj.weight": "model-00032-of-00036.safetensors", "model.layers.54.self_attn.q_proj.weight": "model-00032-of-00036.safetensors", "model.layers.54.self_attn.v_proj.weight": "model-00032-of-00036.safetensors", "model.layers.55.input_layernorm.weight": "model-00033-of-00036.safetensors", "model.layers.55.post_attention_layernorm.weight": "model-00033-of-00036.safetensors", "model.layers.55.mlp.down_proj.weight": "model-00033-of-00036.safetensors", "model.layers.55.mlp.gate_proj.weight": "model-00033-of-00036.safetensors", "model.layers.55.mlp.up_proj.weight": "model-00033-of-00036.safetensors", "model.layers.55.self_attn.k_proj.weight": "model-00033-of-00036.safetensors", "model.layers.55.self_attn.o_proj.weight": "model-00033-of-00036.safetensors", "model.layers.55.self_attn.q_proj.weight": "model-00033-of-00036.safetensors", "model.layers.55.self_attn.v_proj.weight": "model-00033-of-00036.safetensors", "model.layers.56.input_layernorm.weight": "model-00034-of-00036.safetensors", "model.layers.56.post_attention_layernorm.weight": "model-00034-of-00036.safetensors", "model.layers.56.mlp.down_proj.weight": "model-00033-of-00036.safetensors", "model.layers.56.mlp.gate_proj.weight": "model-00033-of-00036.safetensors", "model.layers.56.mlp.up_proj.weight": "model-00034-of-00036.safetensors", "model.layers.56.self_attn.k_proj.weight": "model-00033-of-00036.safetensors", "model.layers.56.self_attn.o_proj.weight": "model-00033-of-00036.safetensors", "model.layers.56.self_attn.q_proj.weight": "model-00033-of-00036.safetensors", "model.layers.56.self_attn.v_proj.weight": "model-00033-of-00036.safetensors", "model.layers.57.input_layernorm.weight": "model-00034-of-00036.safetensors", "model.layers.57.post_attention_layernorm.weight": "model-00034-of-00036.safetensors", "model.layers.57.mlp.down_proj.weight": "model-00034-of-00036.safetensors", "model.layers.57.mlp.gate_proj.weight": "model-00034-of-00036.safetensors", "model.layers.57.mlp.up_proj.weight": "model-00034-of-00036.safetensors", "model.layers.57.self_attn.k_proj.weight": "model-00034-of-00036.safetensors", "model.layers.57.self_attn.o_proj.weight": "model-00034-of-00036.safetensors", "model.layers.57.self_attn.q_proj.weight": "model-00034-of-00036.safetensors", "model.layers.57.self_attn.v_proj.weight": "model-00034-of-00036.safetensors", "model.layers.58.input_layernorm.weight": "model-00035-of-00036.safetensors", "model.layers.58.post_attention_layernorm.weight": "model-00035-of-00036.safetensors", "model.layers.58.mlp.down_proj.weight": "model-00035-of-00036.safetensors", "model.layers.58.mlp.gate_proj.weight": "model-00034-of-00036.safetensors", "model.layers.58.mlp.up_proj.weight": "model-00035-of-00036.safetensors", "model.layers.58.self_attn.k_proj.weight": "model-00034-of-00036.safetensors", "model.layers.58.self_attn.o_proj.weight": "model-00034-of-00036.safetensors", 
"model.layers.58.self_attn.q_proj.weight": "model-00034-of-00036.safetensors", "model.layers.58.self_attn.v_proj.weight": "model-00034-of-00036.safetensors", "model.layers.59.input_layernorm.weight": "model-00035-of-00036.safetensors", "model.layers.59.post_attention_layernorm.weight": "model-00035-of-00036.safetensors", "model.layers.59.mlp.down_proj.weight": "model-00035-of-00036.safetensors", "model.layers.59.mlp.gate_proj.weight": "model-00035-of-00036.safetensors", "model.layers.59.mlp.up_proj.weight": "model-00035-of-00036.safetensors", "model.layers.59.self_attn.k_proj.weight": "model-00035-of-00036.safetensors", "model.layers.59.self_attn.o_proj.weight": "model-00035-of-00036.safetensors", "model.layers.59.self_attn.q_proj.weight": "model-00035-of-00036.safetensors", "model.layers.59.self_attn.v_proj.weight": "model-00035-of-00036.safetensors", "model.layers.6.input_layernorm.weight": "model-00005-of-00036.safetensors", "model.layers.6.post_attention_layernorm.weight": "model-00005-of-00036.safetensors", "model.layers.6.mlp.down_proj.weight": "model-00005-of-00036.safetensors", "model.layers.6.mlp.gate_proj.weight": "model-00005-of-00036.safetensors", "model.layers.6.mlp.up_proj.weight": "model-00005-of-00036.safetensors", "model.layers.6.self_attn.k_proj.weight": "model-00005-of-00036.safetensors", "model.layers.6.self_attn.o_proj.weight": "model-00005-of-00036.safetensors", "model.layers.6.self_attn.q_proj.weight": "model-00005-of-00036.safetensors", "model.layers.6.self_attn.v_proj.weight": "model-00005-of-00036.safetensors", "model.layers.7.input_layernorm.weight": "model-00006-of-00036.safetensors", "model.layers.7.post_attention_layernorm.weight": "model-00006-of-00036.safetensors", "model.layers.7.mlp.down_proj.weight": "model-00005-of-00036.safetensors", "model.layers.7.mlp.gate_proj.weight": "model-00005-of-00036.safetensors", "model.layers.7.mlp.up_proj.weight": "model-00006-of-00036.safetensors", "model.layers.7.self_attn.k_proj.weight": "model-00005-of-00036.safetensors", "model.layers.7.self_attn.o_proj.weight": "model-00005-of-00036.safetensors", "model.layers.7.self_attn.q_proj.weight": "model-00005-of-00036.safetensors", "model.layers.7.self_attn.v_proj.weight": "model-00005-of-00036.safetensors", "model.layers.8.input_layernorm.weight": "model-00006-of-00036.safetensors", "model.layers.8.post_attention_layernorm.weight": "model-00006-of-00036.safetensors", "model.layers.8.mlp.down_proj.weight": "model-00006-of-00036.safetensors", "model.layers.8.mlp.gate_proj.weight": "model-00006-of-00036.safetensors", "model.layers.8.mlp.up_proj.weight": "model-00006-of-00036.safetensors", "model.layers.8.self_attn.k_proj.weight": "model-00006-of-00036.safetensors", "model.layers.8.self_attn.o_proj.weight": "model-00006-of-00036.safetensors", "model.layers.8.self_attn.q_proj.weight": "model-00006-of-00036.safetensors", "model.layers.8.self_attn.v_proj.weight": "model-00006-of-00036.safetensors", "model.layers.9.input_layernorm.weight": "model-00007-of-00036.safetensors", "model.layers.9.post_attention_layernorm.weight": "model-00007-of-00036.safetensors", "model.layers.9.mlp.down_proj.weight": "model-00007-of-00036.safetensors", "model.layers.9.mlp.gate_proj.weight": "model-00006-of-00036.safetensors", "model.layers.9.mlp.up_proj.weight": "model-00007-of-00036.safetensors", "model.layers.9.self_attn.k_proj.weight": "model-00006-of-00036.safetensors", "model.layers.9.self_attn.o_proj.weight": "model-00006-of-00036.safetensors", "model.layers.9.self_attn.q_proj.weight": 
"model-00006-of-00036.safetensors", "model.layers.9.self_attn.v_proj.weight": "model-00006-of-00036.safetensors", "model.norm.weight": "model-00035-of-00036.safetensors"}}
special_tokens_map.json ADDED
@@ -0,0 +1,30 @@
+ {
+ "bos_token": {
+ "content": "<|startoftext|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "eos_token": {
+ "content": "<|endoftext|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "pad_token": {
+ "content": "<unk>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "unk_token": {
+ "content": "<unk>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer.model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:386c49cf943d71aa110361135338c50e38beeff0a66593480421f37b319e1a39
+ size 1033105
tokenizer_config.json ADDED
@@ -0,0 +1,43 @@
+ {
+ "add_bos_token": false,
+ "add_eos_token": false,
+ "added_tokens_decoder": {
+ "0": {
+ "content": "<unk>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "1": {
+ "content": "<|startoftext|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "2": {
+ "content": "<|endoftext|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ }
+ },
+ "bos_token": "<|startoftext|>",
+ "clean_up_tokenization_spaces": false,
+ "eos_token": "<|endoftext|>",
+ "legacy": false,
+ "model_max_length": 4096,
+ "pad_token": "<unk>",
+ "padding_side": "right",
+ "sp_model_kwargs": {},
+ "spaces_between_special_tokens": false,
+ "tokenizer_class": "LlamaTokenizer",
+ "truncation_side": "right",
+ "unk_token": "<unk>",
+ "use_default_system_prompt": false
+ }