tachyphylaxis
committed on
Upload folder using huggingface_hub
- README.md +113 -0
- config.json +24 -0
- model-00001-of-00036.safetensors +3 -0
- model-00002-of-00036.safetensors +3 -0
- model-00003-of-00036.safetensors +3 -0
- model-00004-of-00036.safetensors +3 -0
- model-00005-of-00036.safetensors +3 -0
- model-00006-of-00036.safetensors +3 -0
- model-00007-of-00036.safetensors +3 -0
- model-00008-of-00036.safetensors +3 -0
- model-00009-of-00036.safetensors +3 -0
- model-00010-of-00036.safetensors +3 -0
- model-00011-of-00036.safetensors +3 -0
- model-00012-of-00036.safetensors +3 -0
- model-00013-of-00036.safetensors +3 -0
- model-00014-of-00036.safetensors +3 -0
- model-00015-of-00036.safetensors +3 -0
- model-00016-of-00036.safetensors +3 -0
- model-00017-of-00036.safetensors +3 -0
- model-00018-of-00036.safetensors +3 -0
- model-00019-of-00036.safetensors +3 -0
- model-00020-of-00036.safetensors +3 -0
- model-00021-of-00036.safetensors +3 -0
- model-00022-of-00036.safetensors +3 -0
- model-00023-of-00036.safetensors +3 -0
- model-00024-of-00036.safetensors +3 -0
- model-00025-of-00036.safetensors +3 -0
- model-00026-of-00036.safetensors +3 -0
- model-00027-of-00036.safetensors +3 -0
- model-00028-of-00036.safetensors +3 -0
- model-00029-of-00036.safetensors +3 -0
- model-00030-of-00036.safetensors +3 -0
- model-00031-of-00036.safetensors +3 -0
- model-00032-of-00036.safetensors +3 -0
- model-00033-of-00036.safetensors +3 -0
- model-00034-of-00036.safetensors +3 -0
- model-00035-of-00036.safetensors +3 -0
- model-00036-of-00036.safetensors +3 -0
- model.safetensors.index.json +1 -0
- special_tokens_map.json +30 -0
- tokenizer.json +0 -0
- tokenizer.model +3 -0
- tokenizer_config.json +43 -0
README.md
ADDED
@@ -0,0 +1,113 @@
---
license: mit
datasets:
- lemonilia/LimaRP
- PygmalionAI/PIPPA
language:
- en
pipeline_tag: text-generation
tags:
- roleplay
- not-for-all-audiences
---

This is TriadParty/deepsex-34b with tensors renamed to conform to the standard Llama model and the standard Llama tokenizer, thanks to chargoddard/Yi-34B-Llama.
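Since the tensors follow the standard Llama layout, the checkpoint should load with plain `transformers`. A minimal loading sketch (the repo id below is a placeholder for wherever this upload lives; untested):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "path/to/this-repo"  # placeholder: local folder or Hub id of this upload

tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    torch_dtype="auto",   # config.json declares bfloat16
    device_map="auto",    # requires accelerate; ~69 GB of weights
)

prompt = "### Instruction:\nIntroduce yourself.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```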

Original model card:

**Deepsex-34b**

Thanks to [TheBloke](https://huggingface.co/TheBloke) for making quantized versions!
GGUF: https://huggingface.co/TheBloke/deepsex-34b-GGUF
EXL2: https://huggingface.co/waldie/deepsex-34b-4bpw-h6-exl2
AWQ: https://huggingface.co/TheBloke/deepsex-34b-AWQ
6B base version: https://huggingface.co/TriadParty/deepsex-6b-base
6B chat version: https://huggingface.co/TriadParty/deepsex-6b-chat

In fact, I plan to make a whole "Seven Deadly Sins" series of models. Of course, the pre-training data used in these models is all human-produced. I think a large model is like a mirror, reflecting humanity itself; examining ourselves may become a crucial step in realizing AGI.
So, this one is "Lust".
The corresponding 6B model is in production, and a corresponding Llama version is also being made. Classification data for the other six deadly sins is being collected. Inspiration is welcome!

Here are the steps used to make this model:
1. I first collected about 4 GB of assorted light novels and used BERT to run two rounds of similarity deduplication over novels in the dataset with similar plots. A portion of NSFW novels was mixed in to improve the model's NSFW capability.
2. Using Yi-34B-base as the base model, I ran continued pre-training with QLoRA for 3 epochs at r=64, alpha=128 (a configuration sketch follows this list).
3. I prepared the LimaRP + PIPPA datasets, cleaned them into Alpaca format, and used [goliath-120b](https://huggingface.co/alpindale/goliath-120b), which is good at role-play, to score each question-answer pair, keeping roughly 30k high-quality examples.
4. I ran SFT on the base model from step 2 with the data from step 3, fine-tuning for 6 epochs with r=16, alpha=32.

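The card only gives r and alpha; everything else below (4-bit quantization settings, target modules, dropout) is an assumption for illustration. A minimal QLoRA setup sketch with `peft` and `bitsandbytes`:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization, the usual QLoRA recipe (assumed; not stated in the card).
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained("01-ai/Yi-34B", quantization_config=bnb)
model = prepare_model_for_kbit_training(model)

# r=64, alpha=128 per the card; target modules and dropout are assumptions.
lora = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()
```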
*Format*

alpaca

```
[
  {
    "instruction": "user instruction (required)",
    "input": "user input (optional)",
    "output": "model response (required)",
    "history": [
      ["user instruction in the first round (optional)", "model response in the first round (optional)"],
      ["user instruction in the second round (optional)", "model response in the second round (optional)"]
    ]
  }
]
```
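The card names the format but not the exact prompt string; for illustration, a sketch of the conventional Alpaca rendering of one such record (template assumed):

```python
def render_alpaca(sample: dict) -> str:
    """Render one Alpaca-format record into a single prompt string.

    The template below is the conventional Alpaca layout, assumed here;
    the card itself only names the format.
    """
    parts = []
    for user, bot in sample.get("history", []):
        parts.append(f"### Instruction:\n{user}\n\n### Response:\n{bot}\n")
    turn = f"### Instruction:\n{sample['instruction']}\n"
    if sample.get("input"):
        turn += f"\n### Input:\n{sample['input']}\n"
    turn += "\n### Response:\n"
    parts.append(turn)
    return "\n".join(parts)
```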

*Effect*:
Proficient in role-play, and while highly permissive on NSFW, pure-love lines will appear from time to time. For example:
```
#3
Sweaty old man
December 5, 2023 2:03 PM
Fxxk, you are such a xxx!

#4
27.3s
Mirai
December 5, 2023 2:03 PM
"Of course I do! I can't break promises, Sweaty old man. We have been together since we were kids. We are both best friends and lovers to end all iteration." I smiled with affection. It was clear that I meant everything I said. "We both know that you like taking command of us like this. Am I not your squirting toy, Sweaty old man?" I asked with a cute pout. "We should meet up in front of the shop after classes. I'll see you there. See you, Sweaty old man!"
```
It feels like it's still worth a try~

Support me [here](https://ko-fi.com/mikolisa) :)


**Deepsex-34b**

In fact, I have plans to make a "Seven Deadly Sins" series of models. Of course, the pre-training data used in these models is all human-produced. I think a large model is like a mirror, reflecting humanity itself; examining ourselves may become a crucial step in realizing AGI.
The corresponding 6B model is in production, and a corresponding Llama version is also being made. Classification data for the other six deadly sins is being collected; everyone is welcome to contribute inspiration!

*Steps*
1. I first collected a roughly 4 GB collection of assorted light novels and used BERT to run two rounds of similarity deduplication over novels in the dataset with similar plots. A portion of NSFW novels was also mixed in to improve the model's NSFW capability.
2. Using Yi-34B-base as the base, the model was continued-pre-trained with QLoRA for 3 epochs at r=64, alpha=128.
3. The LimaRP + PIPPA datasets were prepared and cleaned into a unified Alpaca format, and [goliath-120b](https://huggingface.co/alpindale/goliath-120b), which is rather good at role-play, scored each question-answer pair to filter out roughly 30k high-quality examples.
4. SFT was run on the base model from step 2 using the data from step 3, fine-tuning for 6 epochs with r=16, alpha=32.

*Format*

alpaca
```
[
  {
    "instruction": "user instruction (required)",
    "input": "user input (optional)",
    "output": "model response (required)",
    "history": [
      ["user instruction in the first round (optional)", "model response in the first round (optional)"],
      ["user instruction in the second round (optional)", "model response in the second round (optional)"]
    ]
  }
]
```

*Effect*
Proficient in role-play, and while highly permissive on NSFW, pure-love lines appear from time to time. For example:
```
#3
Sweaty old man
December 5, 2023 2:03 PM
Fxxk, you are such a xxx!

#4
27.3s
Mirai
December 5, 2023 2:03 PM
"Of course I do! I can't break promises, Sweaty old man. We have been together since we were kids. We are both best friends and lovers to end all iteration." I smiled with affection. It was clear that I meant everything I said. "We both know that you like taking command of us like this. Am I not your squirting toy, Sweaty old man?" I asked with a cute pout. "We should meet up in front of the shop after classes. I'll see you there. See you, Sweaty old man!"
```
It still feels well worth a try~
If you find it useful, you're welcome to buy me a [coffee](https://ko-fi.com/mikolisa) :)
config.json
ADDED
@@ -0,0 +1,24 @@
{
  "architectures": [
    "LlamaForCausalLM"
  ],
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 7168,
  "initializer_range": 0.02,
  "intermediate_size": 20480,
  "max_position_embeddings": 4096,
  "model_type": "llama",
  "num_attention_heads": 56,
  "num_hidden_layers": 60,
  "num_key_value_heads": 8,
  "pad_token_id": 0,
  "rms_norm_eps": 1e-05,
  "rope_theta": 5000000.0,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.34.0",
  "use_cache": true,
  "vocab_size": 64000
}
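A quick sanity check on the attention geometry this config implies: head_dim = 7168 / 56 = 128, and with 8 KV heads the model uses grouped-query attention with 7 query heads per KV head.

```python
hidden_size, n_heads, n_kv = 7168, 56, 8   # values from config.json above
head_dim = hidden_size // n_heads          # 128
assert head_dim * n_heads == hidden_size
print(f"head_dim={head_dim}, query heads per KV head={n_heads // n_kv}")  # 128, 7
```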
model-00001-of-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3d5055918493c5d4d1494af90923d9bbf3511bf98eaac2bce1cbf52efdb4bbc6
size 1739588416
model-00002-of-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1c4ad38f8ffb1b535f4af2bd5e56cdeba39fd19519f19f63912b5522a416bb1d
size 1937827760
model-00003-of-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:491799959f11f995a29728384c3f2f478eee4e721a2774db8055be59f19d6129
size 1937827760
model-00004-of-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d9e27b1fcb26cdb4d6ef8cefc5777cada99c114972857fbbd30562572e10e413
size 1996547664
model-00005-of-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:39e6c9e947fbc956a5a4aa4f87e4c8fce749155a6b6d6b523fd1ebc5ec2fa775
size 1937798856
model-00006-of-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:449b1aadfe6f7ba33b0880f9fea407090a0cd21016ff5ea33cec14268fab26db
size 1937827760
model-00007-of-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a3d5b79ecec115ca8a239823d927f63d31d448e142ad97340951527be5a02b3e
size 1937827768
model-00008-of-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e75982e6989bc7b5084d4f4c2f3d68d70f3d29e616ab4f29fa10a2296d31fff9
size 1996547680
model-00009-of-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:15508b0118519b5604078b1c3e8e03f7cd9301891af2435fec5886739690a186
size 1937798872
model-00010-of-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4febaf994abd7dc44faab650732d458693f7e9ed9fd946cc3e211614ddadb031
size 1937827776
model-00011-of-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e025e79c7c8f609c822e5cfb56544cff1a5edf265f9f1d70fe50b3b8a26dc49e
size 1937827776
model-00012-of-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7f62cc690273d75664149268c9a1214d869e7448e2409e2c8a4d2f2f51233d3d
size 1996547680
model-00013-of-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:61231f56ac3b5c69c2c13dc096f9c5ad1eef52da5ebd2dfb77d22d127926145d
size 1937798872
model-00014-of-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:68f3137cd9d67cce714bf5faa427327b5595db10012879d17e2554d27f8bd936
size 1937827776
model-00015-of-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:939d938613df2c500b3abfa7ae4d0691a3d2864670131cfa8796b49b7358a0be
size 1937827776
model-00016-of-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:62afec142b9d2910e56d12b3edd6cbb1afbf4a9a940adc4ecfffe8fdbc7aed2a
size 1996547680
model-00017-of-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:db1e6f3f433fe8abd67dcf410d201186e9d436c59a6c4c50e5de8bcb00bd0dd7
size 1937798872
model-00018-of-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c251c2c54a2b085a6415a1b3a5893c940046aa352edc8b0250d75e7e919154c4
size 1937827776
model-00019-of-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d6154ea3a94816071b6c4b6d4dd205f731954b0017723b03908e836b55ebd592
size 1937827776
model-00020-of-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:fab507f5bc4df992c3d035176026b9f91a9f94d5f8de5b498445673c4d261727
size 1996547680
model-00021-of-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:44a7234c0b720068113f7b4aa04f75724ddfcb51df02e1466cf3c03321fd9ffd
size 1937798872
model-00022-of-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f9d36367fa600bca96643d0717b79072cd8ec50e1e7fff3a437156e59709f564
size 1937827776
model-00023-of-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ff0a4bf82e4015dda2ccfa24fb9dbf89acd4898a2aac2adf18db66236f87111e
size 1937827776
model-00024-of-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1cf83ac0b8899c5fb4de1f1371352bff3b5954b217dfd7024e427b228992f50d
size 1996547680
model-00025-of-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b8578d1b3e89f14a63f49e7636b4ded996da42f5f6360780c0c078670dcaf013
size 1937798872
model-00026-of-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:fe90d33aa35dbcf4697b9b5cf6028dd410440514a0850b7c392a300c030da4fd
size 1937827776
model-00027-of-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e6856ceaf7b066a60fa6a27d1dd99735b7d6c75947a479dae706d468ea5fd1dc
size 1937827776
model-00028-of-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:06c21e848a183a8853ba89d5ef197bb899e9e197856be07939f10c7b134d4bfa
size 1996547680
model-00029-of-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8239edf29039d5b1be0b22b846f7e472a2e83e92469742533d853e84ce27d40b
size 1937798872
model-00030-of-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7357657e4ccef5f97b919def72a568f8023c3b5a4808ed444a975fcf9c2c4940
size 1937827776
model-00031-of-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a9286b2c224fbb13613e0649bd22bbc4a76ec11441221db7acf5714b0e1a9c17
size 1937827776
model-00032-of-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c0faa11c44f81101788f3373128de5aeeed0f0c2ff04f271ac5bd39b0cb8adb4
size 1996547680
model-00033-of-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2b078c8e9ce48da04785ea66f55d9ccc7f5847d6ed2dc36aea7a841b123bcab4
size 1937798872
model-00034-of-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3aeec6bb405ea0c3778270e6abc83946b6a4f03560ecaee904239f0c4acd0e7a
size 1937827776
model-00035-of-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4fbb686f7cd15d372333478313b3efc53d251cc5e04e86b96c510e9c8a4969ac
size 1702960712
model-00036-of-00036.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:00605ef7805b07c6b8d5a1071336b6c35bfb5848ea2e55e7d34dbe78a133e59b
size 917504128
model.safetensors.index.json
ADDED
@@ -0,0 +1 @@
{"metadata": {"total_size": 68777834496}, "weight_map": {"lm_head.weight": "model-00036-of-00036.safetensors", "model.embed_tokens.weight": "model-00001-of-00036.safetensors", "model.layers.0.input_layernorm.weight": "model-00002-of-00036.safetensors", "model.layers.0.post_attention_layernorm.weight": "model-00002-of-00036.safetensors", "model.layers.0.mlp.down_proj.weight": "model-00001-of-00036.safetensors", "model.layers.0.mlp.gate_proj.weight": "model-00001-of-00036.safetensors", "model.layers.0.mlp.up_proj.weight": "model-00002-of-00036.safetensors", "model.layers.0.self_attn.k_proj.weight": "model-00001-of-00036.safetensors", "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00036.safetensors", "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00036.safetensors", "model.layers.0.self_attn.v_proj.weight": "model-00001-of-00036.safetensors", "model.layers.1.input_layernorm.weight": "model-00002-of-00036.safetensors", "model.layers.1.post_attention_layernorm.weight": "model-00002-of-00036.safetensors", "model.layers.1.mlp.down_proj.weight": "model-00002-of-00036.safetensors", "model.layers.1.mlp.gate_proj.weight": "model-00002-of-00036.safetensors", "model.layers.1.mlp.up_proj.weight": "model-00002-of-00036.safetensors", "model.layers.1.self_attn.k_proj.weight": "model-00002-of-00036.safetensors", "model.layers.1.self_attn.o_proj.weight": "model-00002-of-00036.safetensors", "model.layers.1.self_attn.q_proj.weight": "model-00002-of-00036.safetensors", "model.layers.1.self_attn.v_proj.weight": "model-00002-of-00036.safetensors", "model.layers.10.input_layernorm.weight": "model-00007-of-00036.safetensors", "model.layers.10.post_attention_layernorm.weight": "model-00007-of-00036.safetensors", "model.layers.10.mlp.down_proj.weight": "model-00007-of-00036.safetensors", "model.layers.10.mlp.gate_proj.weight": "model-00007-of-00036.safetensors", "model.layers.10.mlp.up_proj.weight": "model-00007-of-00036.safetensors", "model.layers.10.self_attn.k_proj.weight": "model-00007-of-00036.safetensors", "model.layers.10.self_attn.o_proj.weight": "model-00007-of-00036.safetensors", "model.layers.10.self_attn.q_proj.weight": "model-00007-of-00036.safetensors", "model.layers.10.self_attn.v_proj.weight": "model-00007-of-00036.safetensors", "model.layers.11.input_layernorm.weight": "model-00008-of-00036.safetensors", "model.layers.11.post_attention_layernorm.weight": "model-00008-of-00036.safetensors", "model.layers.11.mlp.down_proj.weight": "model-00008-of-00036.safetensors", "model.layers.11.mlp.gate_proj.weight": "model-00008-of-00036.safetensors", "model.layers.11.mlp.up_proj.weight": "model-00008-of-00036.safetensors", "model.layers.11.self_attn.k_proj.weight": "model-00007-of-00036.safetensors", "model.layers.11.self_attn.o_proj.weight": "model-00007-of-00036.safetensors", "model.layers.11.self_attn.q_proj.weight": "model-00007-of-00036.safetensors", "model.layers.11.self_attn.v_proj.weight": "model-00007-of-00036.safetensors", "model.layers.12.input_layernorm.weight": "model-00008-of-00036.safetensors", "model.layers.12.post_attention_layernorm.weight": "model-00008-of-00036.safetensors", "model.layers.12.mlp.down_proj.weight": "model-00008-of-00036.safetensors", "model.layers.12.mlp.gate_proj.weight": "model-00008-of-00036.safetensors", "model.layers.12.mlp.up_proj.weight": "model-00008-of-00036.safetensors", "model.layers.12.self_attn.k_proj.weight": "model-00008-of-00036.safetensors", "model.layers.12.self_attn.o_proj.weight": "model-00008-of-00036.safetensors", 
"model.layers.12.self_attn.q_proj.weight": "model-00008-of-00036.safetensors", "model.layers.12.self_attn.v_proj.weight": "model-00008-of-00036.safetensors", "model.layers.13.input_layernorm.weight": "model-00009-of-00036.safetensors", "model.layers.13.post_attention_layernorm.weight": "model-00009-of-00036.safetensors", "model.layers.13.mlp.down_proj.weight": "model-00009-of-00036.safetensors", "model.layers.13.mlp.gate_proj.weight": "model-00009-of-00036.safetensors", "model.layers.13.mlp.up_proj.weight": "model-00009-of-00036.safetensors", "model.layers.13.self_attn.k_proj.weight": "model-00009-of-00036.safetensors", "model.layers.13.self_attn.o_proj.weight": "model-00009-of-00036.safetensors", "model.layers.13.self_attn.q_proj.weight": "model-00009-of-00036.safetensors", "model.layers.13.self_attn.v_proj.weight": "model-00009-of-00036.safetensors", "model.layers.14.input_layernorm.weight": "model-00010-of-00036.safetensors", "model.layers.14.post_attention_layernorm.weight": "model-00010-of-00036.safetensors", "model.layers.14.mlp.down_proj.weight": "model-00009-of-00036.safetensors", "model.layers.14.mlp.gate_proj.weight": "model-00009-of-00036.safetensors", "model.layers.14.mlp.up_proj.weight": "model-00010-of-00036.safetensors", "model.layers.14.self_attn.k_proj.weight": "model-00009-of-00036.safetensors", "model.layers.14.self_attn.o_proj.weight": "model-00009-of-00036.safetensors", "model.layers.14.self_attn.q_proj.weight": "model-00009-of-00036.safetensors", "model.layers.14.self_attn.v_proj.weight": "model-00009-of-00036.safetensors", "model.layers.15.input_layernorm.weight": "model-00010-of-00036.safetensors", "model.layers.15.post_attention_layernorm.weight": "model-00010-of-00036.safetensors", "model.layers.15.mlp.down_proj.weight": "model-00010-of-00036.safetensors", "model.layers.15.mlp.gate_proj.weight": "model-00010-of-00036.safetensors", "model.layers.15.mlp.up_proj.weight": "model-00010-of-00036.safetensors", "model.layers.15.self_attn.k_proj.weight": "model-00010-of-00036.safetensors", "model.layers.15.self_attn.o_proj.weight": "model-00010-of-00036.safetensors", "model.layers.15.self_attn.q_proj.weight": "model-00010-of-00036.safetensors", "model.layers.15.self_attn.v_proj.weight": "model-00010-of-00036.safetensors", "model.layers.16.input_layernorm.weight": "model-00011-of-00036.safetensors", "model.layers.16.post_attention_layernorm.weight": "model-00011-of-00036.safetensors", "model.layers.16.mlp.down_proj.weight": "model-00011-of-00036.safetensors", "model.layers.16.mlp.gate_proj.weight": "model-00010-of-00036.safetensors", "model.layers.16.mlp.up_proj.weight": "model-00011-of-00036.safetensors", "model.layers.16.self_attn.k_proj.weight": "model-00010-of-00036.safetensors", "model.layers.16.self_attn.o_proj.weight": "model-00010-of-00036.safetensors", "model.layers.16.self_attn.q_proj.weight": "model-00010-of-00036.safetensors", "model.layers.16.self_attn.v_proj.weight": "model-00010-of-00036.safetensors", "model.layers.17.input_layernorm.weight": "model-00011-of-00036.safetensors", "model.layers.17.post_attention_layernorm.weight": "model-00011-of-00036.safetensors", "model.layers.17.mlp.down_proj.weight": "model-00011-of-00036.safetensors", "model.layers.17.mlp.gate_proj.weight": "model-00011-of-00036.safetensors", "model.layers.17.mlp.up_proj.weight": "model-00011-of-00036.safetensors", "model.layers.17.self_attn.k_proj.weight": "model-00011-of-00036.safetensors", "model.layers.17.self_attn.o_proj.weight": "model-00011-of-00036.safetensors", 
"model.layers.17.self_attn.q_proj.weight": "model-00011-of-00036.safetensors", "model.layers.17.self_attn.v_proj.weight": "model-00011-of-00036.safetensors", "model.layers.18.input_layernorm.weight": "model-00012-of-00036.safetensors", "model.layers.18.post_attention_layernorm.weight": "model-00012-of-00036.safetensors", "model.layers.18.mlp.down_proj.weight": "model-00012-of-00036.safetensors", "model.layers.18.mlp.gate_proj.weight": "model-00012-of-00036.safetensors", "model.layers.18.mlp.up_proj.weight": "model-00012-of-00036.safetensors", "model.layers.18.self_attn.k_proj.weight": "model-00011-of-00036.safetensors", "model.layers.18.self_attn.o_proj.weight": "model-00011-of-00036.safetensors", "model.layers.18.self_attn.q_proj.weight": "model-00011-of-00036.safetensors", "model.layers.18.self_attn.v_proj.weight": "model-00011-of-00036.safetensors", "model.layers.19.input_layernorm.weight": "model-00012-of-00036.safetensors", "model.layers.19.post_attention_layernorm.weight": "model-00012-of-00036.safetensors", "model.layers.19.mlp.down_proj.weight": "model-00012-of-00036.safetensors", "model.layers.19.mlp.gate_proj.weight": "model-00012-of-00036.safetensors", "model.layers.19.mlp.up_proj.weight": "model-00012-of-00036.safetensors", "model.layers.19.self_attn.k_proj.weight": "model-00012-of-00036.safetensors", "model.layers.19.self_attn.o_proj.weight": "model-00012-of-00036.safetensors", "model.layers.19.self_attn.q_proj.weight": "model-00012-of-00036.safetensors", "model.layers.19.self_attn.v_proj.weight": "model-00012-of-00036.safetensors", "model.layers.2.input_layernorm.weight": "model-00003-of-00036.safetensors", "model.layers.2.post_attention_layernorm.weight": "model-00003-of-00036.safetensors", "model.layers.2.mlp.down_proj.weight": "model-00003-of-00036.safetensors", "model.layers.2.mlp.gate_proj.weight": "model-00002-of-00036.safetensors", "model.layers.2.mlp.up_proj.weight": "model-00003-of-00036.safetensors", "model.layers.2.self_attn.k_proj.weight": "model-00002-of-00036.safetensors", "model.layers.2.self_attn.o_proj.weight": "model-00002-of-00036.safetensors", "model.layers.2.self_attn.q_proj.weight": "model-00002-of-00036.safetensors", "model.layers.2.self_attn.v_proj.weight": "model-00002-of-00036.safetensors", "model.layers.20.input_layernorm.weight": "model-00013-of-00036.safetensors", "model.layers.20.post_attention_layernorm.weight": "model-00013-of-00036.safetensors", "model.layers.20.mlp.down_proj.weight": "model-00013-of-00036.safetensors", "model.layers.20.mlp.gate_proj.weight": "model-00013-of-00036.safetensors", "model.layers.20.mlp.up_proj.weight": "model-00013-of-00036.safetensors", "model.layers.20.self_attn.k_proj.weight": "model-00013-of-00036.safetensors", "model.layers.20.self_attn.o_proj.weight": "model-00013-of-00036.safetensors", "model.layers.20.self_attn.q_proj.weight": "model-00013-of-00036.safetensors", "model.layers.20.self_attn.v_proj.weight": "model-00013-of-00036.safetensors", "model.layers.21.input_layernorm.weight": "model-00014-of-00036.safetensors", "model.layers.21.post_attention_layernorm.weight": "model-00014-of-00036.safetensors", "model.layers.21.mlp.down_proj.weight": "model-00013-of-00036.safetensors", "model.layers.21.mlp.gate_proj.weight": "model-00013-of-00036.safetensors", "model.layers.21.mlp.up_proj.weight": "model-00014-of-00036.safetensors", "model.layers.21.self_attn.k_proj.weight": "model-00013-of-00036.safetensors", "model.layers.21.self_attn.o_proj.weight": "model-00013-of-00036.safetensors", 
"model.layers.21.self_attn.q_proj.weight": "model-00013-of-00036.safetensors", "model.layers.21.self_attn.v_proj.weight": "model-00013-of-00036.safetensors", "model.layers.22.input_layernorm.weight": "model-00014-of-00036.safetensors", "model.layers.22.post_attention_layernorm.weight": "model-00014-of-00036.safetensors", "model.layers.22.mlp.down_proj.weight": "model-00014-of-00036.safetensors", "model.layers.22.mlp.gate_proj.weight": "model-00014-of-00036.safetensors", "model.layers.22.mlp.up_proj.weight": "model-00014-of-00036.safetensors", "model.layers.22.self_attn.k_proj.weight": "model-00014-of-00036.safetensors", "model.layers.22.self_attn.o_proj.weight": "model-00014-of-00036.safetensors", "model.layers.22.self_attn.q_proj.weight": "model-00014-of-00036.safetensors", "model.layers.22.self_attn.v_proj.weight": "model-00014-of-00036.safetensors", "model.layers.23.input_layernorm.weight": "model-00015-of-00036.safetensors", "model.layers.23.post_attention_layernorm.weight": "model-00015-of-00036.safetensors", "model.layers.23.mlp.down_proj.weight": "model-00015-of-00036.safetensors", "model.layers.23.mlp.gate_proj.weight": "model-00014-of-00036.safetensors", "model.layers.23.mlp.up_proj.weight": "model-00015-of-00036.safetensors", "model.layers.23.self_attn.k_proj.weight": "model-00014-of-00036.safetensors", "model.layers.23.self_attn.o_proj.weight": "model-00014-of-00036.safetensors", "model.layers.23.self_attn.q_proj.weight": "model-00014-of-00036.safetensors", "model.layers.23.self_attn.v_proj.weight": "model-00014-of-00036.safetensors", "model.layers.24.input_layernorm.weight": "model-00015-of-00036.safetensors", "model.layers.24.post_attention_layernorm.weight": "model-00015-of-00036.safetensors", "model.layers.24.mlp.down_proj.weight": "model-00015-of-00036.safetensors", "model.layers.24.mlp.gate_proj.weight": "model-00015-of-00036.safetensors", "model.layers.24.mlp.up_proj.weight": "model-00015-of-00036.safetensors", "model.layers.24.self_attn.k_proj.weight": "model-00015-of-00036.safetensors", "model.layers.24.self_attn.o_proj.weight": "model-00015-of-00036.safetensors", "model.layers.24.self_attn.q_proj.weight": "model-00015-of-00036.safetensors", "model.layers.24.self_attn.v_proj.weight": "model-00015-of-00036.safetensors", "model.layers.25.input_layernorm.weight": "model-00016-of-00036.safetensors", "model.layers.25.post_attention_layernorm.weight": "model-00016-of-00036.safetensors", "model.layers.25.mlp.down_proj.weight": "model-00016-of-00036.safetensors", "model.layers.25.mlp.gate_proj.weight": "model-00016-of-00036.safetensors", "model.layers.25.mlp.up_proj.weight": "model-00016-of-00036.safetensors", "model.layers.25.self_attn.k_proj.weight": "model-00015-of-00036.safetensors", "model.layers.25.self_attn.o_proj.weight": "model-00015-of-00036.safetensors", "model.layers.25.self_attn.q_proj.weight": "model-00015-of-00036.safetensors", "model.layers.25.self_attn.v_proj.weight": "model-00015-of-00036.safetensors", "model.layers.26.input_layernorm.weight": "model-00016-of-00036.safetensors", "model.layers.26.post_attention_layernorm.weight": "model-00016-of-00036.safetensors", "model.layers.26.mlp.down_proj.weight": "model-00016-of-00036.safetensors", "model.layers.26.mlp.gate_proj.weight": "model-00016-of-00036.safetensors", "model.layers.26.mlp.up_proj.weight": "model-00016-of-00036.safetensors", "model.layers.26.self_attn.k_proj.weight": "model-00016-of-00036.safetensors", "model.layers.26.self_attn.o_proj.weight": "model-00016-of-00036.safetensors", 
"model.layers.26.self_attn.q_proj.weight": "model-00016-of-00036.safetensors", "model.layers.26.self_attn.v_proj.weight": "model-00016-of-00036.safetensors", "model.layers.27.input_layernorm.weight": "model-00017-of-00036.safetensors", "model.layers.27.post_attention_layernorm.weight": "model-00017-of-00036.safetensors", "model.layers.27.mlp.down_proj.weight": "model-00017-of-00036.safetensors", "model.layers.27.mlp.gate_proj.weight": "model-00017-of-00036.safetensors", "model.layers.27.mlp.up_proj.weight": "model-00017-of-00036.safetensors", "model.layers.27.self_attn.k_proj.weight": "model-00017-of-00036.safetensors", "model.layers.27.self_attn.o_proj.weight": "model-00017-of-00036.safetensors", "model.layers.27.self_attn.q_proj.weight": "model-00017-of-00036.safetensors", "model.layers.27.self_attn.v_proj.weight": "model-00017-of-00036.safetensors", "model.layers.28.input_layernorm.weight": "model-00018-of-00036.safetensors", "model.layers.28.post_attention_layernorm.weight": "model-00018-of-00036.safetensors", "model.layers.28.mlp.down_proj.weight": "model-00017-of-00036.safetensors", "model.layers.28.mlp.gate_proj.weight": "model-00017-of-00036.safetensors", "model.layers.28.mlp.up_proj.weight": "model-00018-of-00036.safetensors", "model.layers.28.self_attn.k_proj.weight": "model-00017-of-00036.safetensors", "model.layers.28.self_attn.o_proj.weight": "model-00017-of-00036.safetensors", "model.layers.28.self_attn.q_proj.weight": "model-00017-of-00036.safetensors", "model.layers.28.self_attn.v_proj.weight": "model-00017-of-00036.safetensors", "model.layers.29.input_layernorm.weight": "model-00018-of-00036.safetensors", "model.layers.29.post_attention_layernorm.weight": "model-00018-of-00036.safetensors", "model.layers.29.mlp.down_proj.weight": "model-00018-of-00036.safetensors", "model.layers.29.mlp.gate_proj.weight": "model-00018-of-00036.safetensors", "model.layers.29.mlp.up_proj.weight": "model-00018-of-00036.safetensors", "model.layers.29.self_attn.k_proj.weight": "model-00018-of-00036.safetensors", "model.layers.29.self_attn.o_proj.weight": "model-00018-of-00036.safetensors", "model.layers.29.self_attn.q_proj.weight": "model-00018-of-00036.safetensors", "model.layers.29.self_attn.v_proj.weight": "model-00018-of-00036.safetensors", "model.layers.3.input_layernorm.weight": "model-00003-of-00036.safetensors", "model.layers.3.post_attention_layernorm.weight": "model-00003-of-00036.safetensors", "model.layers.3.mlp.down_proj.weight": "model-00003-of-00036.safetensors", "model.layers.3.mlp.gate_proj.weight": "model-00003-of-00036.safetensors", "model.layers.3.mlp.up_proj.weight": "model-00003-of-00036.safetensors", "model.layers.3.self_attn.k_proj.weight": "model-00003-of-00036.safetensors", "model.layers.3.self_attn.o_proj.weight": "model-00003-of-00036.safetensors", "model.layers.3.self_attn.q_proj.weight": "model-00003-of-00036.safetensors", "model.layers.3.self_attn.v_proj.weight": "model-00003-of-00036.safetensors", "model.layers.30.input_layernorm.weight": "model-00019-of-00036.safetensors", "model.layers.30.post_attention_layernorm.weight": "model-00019-of-00036.safetensors", "model.layers.30.mlp.down_proj.weight": "model-00019-of-00036.safetensors", "model.layers.30.mlp.gate_proj.weight": "model-00018-of-00036.safetensors", "model.layers.30.mlp.up_proj.weight": "model-00019-of-00036.safetensors", "model.layers.30.self_attn.k_proj.weight": "model-00018-of-00036.safetensors", "model.layers.30.self_attn.o_proj.weight": "model-00018-of-00036.safetensors", 
"model.layers.30.self_attn.q_proj.weight": "model-00018-of-00036.safetensors", "model.layers.30.self_attn.v_proj.weight": "model-00018-of-00036.safetensors", "model.layers.31.input_layernorm.weight": "model-00019-of-00036.safetensors", "model.layers.31.post_attention_layernorm.weight": "model-00019-of-00036.safetensors", "model.layers.31.mlp.down_proj.weight": "model-00019-of-00036.safetensors", "model.layers.31.mlp.gate_proj.weight": "model-00019-of-00036.safetensors", "model.layers.31.mlp.up_proj.weight": "model-00019-of-00036.safetensors", "model.layers.31.self_attn.k_proj.weight": "model-00019-of-00036.safetensors", "model.layers.31.self_attn.o_proj.weight": "model-00019-of-00036.safetensors", "model.layers.31.self_attn.q_proj.weight": "model-00019-of-00036.safetensors", "model.layers.31.self_attn.v_proj.weight": "model-00019-of-00036.safetensors", "model.layers.32.input_layernorm.weight": "model-00020-of-00036.safetensors", "model.layers.32.post_attention_layernorm.weight": "model-00020-of-00036.safetensors", "model.layers.32.mlp.down_proj.weight": "model-00020-of-00036.safetensors", "model.layers.32.mlp.gate_proj.weight": "model-00020-of-00036.safetensors", "model.layers.32.mlp.up_proj.weight": "model-00020-of-00036.safetensors", "model.layers.32.self_attn.k_proj.weight": "model-00019-of-00036.safetensors", "model.layers.32.self_attn.o_proj.weight": "model-00019-of-00036.safetensors", "model.layers.32.self_attn.q_proj.weight": "model-00019-of-00036.safetensors", "model.layers.32.self_attn.v_proj.weight": "model-00019-of-00036.safetensors", "model.layers.33.input_layernorm.weight": "model-00020-of-00036.safetensors", "model.layers.33.post_attention_layernorm.weight": "model-00020-of-00036.safetensors", "model.layers.33.mlp.down_proj.weight": "model-00020-of-00036.safetensors", "model.layers.33.mlp.gate_proj.weight": "model-00020-of-00036.safetensors", "model.layers.33.mlp.up_proj.weight": "model-00020-of-00036.safetensors", "model.layers.33.self_attn.k_proj.weight": "model-00020-of-00036.safetensors", "model.layers.33.self_attn.o_proj.weight": "model-00020-of-00036.safetensors", "model.layers.33.self_attn.q_proj.weight": "model-00020-of-00036.safetensors", "model.layers.33.self_attn.v_proj.weight": "model-00020-of-00036.safetensors", "model.layers.34.input_layernorm.weight": "model-00021-of-00036.safetensors", "model.layers.34.post_attention_layernorm.weight": "model-00021-of-00036.safetensors", "model.layers.34.mlp.down_proj.weight": "model-00021-of-00036.safetensors", "model.layers.34.mlp.gate_proj.weight": "model-00021-of-00036.safetensors", "model.layers.34.mlp.up_proj.weight": "model-00021-of-00036.safetensors", "model.layers.34.self_attn.k_proj.weight": "model-00021-of-00036.safetensors", "model.layers.34.self_attn.o_proj.weight": "model-00021-of-00036.safetensors", "model.layers.34.self_attn.q_proj.weight": "model-00021-of-00036.safetensors", "model.layers.34.self_attn.v_proj.weight": "model-00021-of-00036.safetensors", "model.layers.35.input_layernorm.weight": "model-00022-of-00036.safetensors", "model.layers.35.post_attention_layernorm.weight": "model-00022-of-00036.safetensors", "model.layers.35.mlp.down_proj.weight": "model-00021-of-00036.safetensors", "model.layers.35.mlp.gate_proj.weight": "model-00021-of-00036.safetensors", "model.layers.35.mlp.up_proj.weight": "model-00022-of-00036.safetensors", "model.layers.35.self_attn.k_proj.weight": "model-00021-of-00036.safetensors", "model.layers.35.self_attn.o_proj.weight": "model-00021-of-00036.safetensors", 
"model.layers.35.self_attn.q_proj.weight": "model-00021-of-00036.safetensors", "model.layers.35.self_attn.v_proj.weight": "model-00021-of-00036.safetensors", "model.layers.36.input_layernorm.weight": "model-00022-of-00036.safetensors", "model.layers.36.post_attention_layernorm.weight": "model-00022-of-00036.safetensors", "model.layers.36.mlp.down_proj.weight": "model-00022-of-00036.safetensors", "model.layers.36.mlp.gate_proj.weight": "model-00022-of-00036.safetensors", "model.layers.36.mlp.up_proj.weight": "model-00022-of-00036.safetensors", "model.layers.36.self_attn.k_proj.weight": "model-00022-of-00036.safetensors", "model.layers.36.self_attn.o_proj.weight": "model-00022-of-00036.safetensors", "model.layers.36.self_attn.q_proj.weight": "model-00022-of-00036.safetensors", "model.layers.36.self_attn.v_proj.weight": "model-00022-of-00036.safetensors", "model.layers.37.input_layernorm.weight": "model-00023-of-00036.safetensors", "model.layers.37.post_attention_layernorm.weight": "model-00023-of-00036.safetensors", "model.layers.37.mlp.down_proj.weight": "model-00023-of-00036.safetensors", "model.layers.37.mlp.gate_proj.weight": "model-00022-of-00036.safetensors", "model.layers.37.mlp.up_proj.weight": "model-00023-of-00036.safetensors", "model.layers.37.self_attn.k_proj.weight": "model-00022-of-00036.safetensors", "model.layers.37.self_attn.o_proj.weight": "model-00022-of-00036.safetensors", "model.layers.37.self_attn.q_proj.weight": "model-00022-of-00036.safetensors", "model.layers.37.self_attn.v_proj.weight": "model-00022-of-00036.safetensors", "model.layers.38.input_layernorm.weight": "model-00023-of-00036.safetensors", "model.layers.38.post_attention_layernorm.weight": "model-00023-of-00036.safetensors", "model.layers.38.mlp.down_proj.weight": "model-00023-of-00036.safetensors", "model.layers.38.mlp.gate_proj.weight": "model-00023-of-00036.safetensors", "model.layers.38.mlp.up_proj.weight": "model-00023-of-00036.safetensors", "model.layers.38.self_attn.k_proj.weight": "model-00023-of-00036.safetensors", "model.layers.38.self_attn.o_proj.weight": "model-00023-of-00036.safetensors", "model.layers.38.self_attn.q_proj.weight": "model-00023-of-00036.safetensors", "model.layers.38.self_attn.v_proj.weight": "model-00023-of-00036.safetensors", "model.layers.39.input_layernorm.weight": "model-00024-of-00036.safetensors", "model.layers.39.post_attention_layernorm.weight": "model-00024-of-00036.safetensors", "model.layers.39.mlp.down_proj.weight": "model-00024-of-00036.safetensors", "model.layers.39.mlp.gate_proj.weight": "model-00024-of-00036.safetensors", "model.layers.39.mlp.up_proj.weight": "model-00024-of-00036.safetensors", "model.layers.39.self_attn.k_proj.weight": "model-00023-of-00036.safetensors", "model.layers.39.self_attn.o_proj.weight": "model-00023-of-00036.safetensors", "model.layers.39.self_attn.q_proj.weight": "model-00023-of-00036.safetensors", "model.layers.39.self_attn.v_proj.weight": "model-00023-of-00036.safetensors", "model.layers.4.input_layernorm.weight": "model-00004-of-00036.safetensors", "model.layers.4.post_attention_layernorm.weight": "model-00004-of-00036.safetensors", "model.layers.4.mlp.down_proj.weight": "model-00004-of-00036.safetensors", "model.layers.4.mlp.gate_proj.weight": "model-00004-of-00036.safetensors", "model.layers.4.mlp.up_proj.weight": "model-00004-of-00036.safetensors", "model.layers.4.self_attn.k_proj.weight": "model-00003-of-00036.safetensors", "model.layers.4.self_attn.o_proj.weight": "model-00003-of-00036.safetensors", 
"model.layers.4.self_attn.q_proj.weight": "model-00003-of-00036.safetensors", "model.layers.4.self_attn.v_proj.weight": "model-00003-of-00036.safetensors", "model.layers.40.input_layernorm.weight": "model-00024-of-00036.safetensors", "model.layers.40.post_attention_layernorm.weight": "model-00024-of-00036.safetensors", "model.layers.40.mlp.down_proj.weight": "model-00024-of-00036.safetensors", "model.layers.40.mlp.gate_proj.weight": "model-00024-of-00036.safetensors", "model.layers.40.mlp.up_proj.weight": "model-00024-of-00036.safetensors", "model.layers.40.self_attn.k_proj.weight": "model-00024-of-00036.safetensors", "model.layers.40.self_attn.o_proj.weight": "model-00024-of-00036.safetensors", "model.layers.40.self_attn.q_proj.weight": "model-00024-of-00036.safetensors", "model.layers.40.self_attn.v_proj.weight": "model-00024-of-00036.safetensors", "model.layers.41.input_layernorm.weight": "model-00025-of-00036.safetensors", "model.layers.41.post_attention_layernorm.weight": "model-00025-of-00036.safetensors", "model.layers.41.mlp.down_proj.weight": "model-00025-of-00036.safetensors", "model.layers.41.mlp.gate_proj.weight": "model-00025-of-00036.safetensors", "model.layers.41.mlp.up_proj.weight": "model-00025-of-00036.safetensors", "model.layers.41.self_attn.k_proj.weight": "model-00025-of-00036.safetensors", "model.layers.41.self_attn.o_proj.weight": "model-00025-of-00036.safetensors", "model.layers.41.self_attn.q_proj.weight": "model-00025-of-00036.safetensors", "model.layers.41.self_attn.v_proj.weight": "model-00025-of-00036.safetensors", "model.layers.42.input_layernorm.weight": "model-00026-of-00036.safetensors", "model.layers.42.post_attention_layernorm.weight": "model-00026-of-00036.safetensors", "model.layers.42.mlp.down_proj.weight": "model-00025-of-00036.safetensors", "model.layers.42.mlp.gate_proj.weight": "model-00025-of-00036.safetensors", "model.layers.42.mlp.up_proj.weight": "model-00026-of-00036.safetensors", "model.layers.42.self_attn.k_proj.weight": "model-00025-of-00036.safetensors", "model.layers.42.self_attn.o_proj.weight": "model-00025-of-00036.safetensors", "model.layers.42.self_attn.q_proj.weight": "model-00025-of-00036.safetensors", "model.layers.42.self_attn.v_proj.weight": "model-00025-of-00036.safetensors", "model.layers.43.input_layernorm.weight": "model-00026-of-00036.safetensors", "model.layers.43.post_attention_layernorm.weight": "model-00026-of-00036.safetensors", "model.layers.43.mlp.down_proj.weight": "model-00026-of-00036.safetensors", "model.layers.43.mlp.gate_proj.weight": "model-00026-of-00036.safetensors", "model.layers.43.mlp.up_proj.weight": "model-00026-of-00036.safetensors", "model.layers.43.self_attn.k_proj.weight": "model-00026-of-00036.safetensors", "model.layers.43.self_attn.o_proj.weight": "model-00026-of-00036.safetensors", "model.layers.43.self_attn.q_proj.weight": "model-00026-of-00036.safetensors", "model.layers.43.self_attn.v_proj.weight": "model-00026-of-00036.safetensors", "model.layers.44.input_layernorm.weight": "model-00027-of-00036.safetensors", "model.layers.44.post_attention_layernorm.weight": "model-00027-of-00036.safetensors", "model.layers.44.mlp.down_proj.weight": "model-00027-of-00036.safetensors", "model.layers.44.mlp.gate_proj.weight": "model-00026-of-00036.safetensors", "model.layers.44.mlp.up_proj.weight": "model-00027-of-00036.safetensors", "model.layers.44.self_attn.k_proj.weight": "model-00026-of-00036.safetensors", "model.layers.44.self_attn.o_proj.weight": "model-00026-of-00036.safetensors", 
"model.layers.44.self_attn.q_proj.weight": "model-00026-of-00036.safetensors", "model.layers.44.self_attn.v_proj.weight": "model-00026-of-00036.safetensors", "model.layers.45.input_layernorm.weight": "model-00027-of-00036.safetensors", "model.layers.45.post_attention_layernorm.weight": "model-00027-of-00036.safetensors", "model.layers.45.mlp.down_proj.weight": "model-00027-of-00036.safetensors", "model.layers.45.mlp.gate_proj.weight": "model-00027-of-00036.safetensors", "model.layers.45.mlp.up_proj.weight": "model-00027-of-00036.safetensors", "model.layers.45.self_attn.k_proj.weight": "model-00027-of-00036.safetensors", "model.layers.45.self_attn.o_proj.weight": "model-00027-of-00036.safetensors", "model.layers.45.self_attn.q_proj.weight": "model-00027-of-00036.safetensors", "model.layers.45.self_attn.v_proj.weight": "model-00027-of-00036.safetensors", "model.layers.46.input_layernorm.weight": "model-00028-of-00036.safetensors", "model.layers.46.post_attention_layernorm.weight": "model-00028-of-00036.safetensors", "model.layers.46.mlp.down_proj.weight": "model-00028-of-00036.safetensors", "model.layers.46.mlp.gate_proj.weight": "model-00028-of-00036.safetensors", "model.layers.46.mlp.up_proj.weight": "model-00028-of-00036.safetensors", "model.layers.46.self_attn.k_proj.weight": "model-00027-of-00036.safetensors", "model.layers.46.self_attn.o_proj.weight": "model-00027-of-00036.safetensors", "model.layers.46.self_attn.q_proj.weight": "model-00027-of-00036.safetensors", "model.layers.46.self_attn.v_proj.weight": "model-00027-of-00036.safetensors", "model.layers.47.input_layernorm.weight": "model-00028-of-00036.safetensors", "model.layers.47.post_attention_layernorm.weight": "model-00028-of-00036.safetensors", "model.layers.47.mlp.down_proj.weight": "model-00028-of-00036.safetensors", "model.layers.47.mlp.gate_proj.weight": "model-00028-of-00036.safetensors", "model.layers.47.mlp.up_proj.weight": "model-00028-of-00036.safetensors", "model.layers.47.self_attn.k_proj.weight": "model-00028-of-00036.safetensors", "model.layers.47.self_attn.o_proj.weight": "model-00028-of-00036.safetensors", "model.layers.47.self_attn.q_proj.weight": "model-00028-of-00036.safetensors", "model.layers.47.self_attn.v_proj.weight": "model-00028-of-00036.safetensors", "model.layers.48.input_layernorm.weight": "model-00029-of-00036.safetensors", "model.layers.48.post_attention_layernorm.weight": "model-00029-of-00036.safetensors", "model.layers.48.mlp.down_proj.weight": "model-00029-of-00036.safetensors", "model.layers.48.mlp.gate_proj.weight": "model-00029-of-00036.safetensors", "model.layers.48.mlp.up_proj.weight": "model-00029-of-00036.safetensors", "model.layers.48.self_attn.k_proj.weight": "model-00029-of-00036.safetensors", "model.layers.48.self_attn.o_proj.weight": "model-00029-of-00036.safetensors", "model.layers.48.self_attn.q_proj.weight": "model-00029-of-00036.safetensors", "model.layers.48.self_attn.v_proj.weight": "model-00029-of-00036.safetensors", "model.layers.49.input_layernorm.weight": "model-00030-of-00036.safetensors", "model.layers.49.post_attention_layernorm.weight": "model-00030-of-00036.safetensors", "model.layers.49.mlp.down_proj.weight": "model-00029-of-00036.safetensors", "model.layers.49.mlp.gate_proj.weight": "model-00029-of-00036.safetensors", "model.layers.49.mlp.up_proj.weight": "model-00030-of-00036.safetensors", "model.layers.49.self_attn.k_proj.weight": "model-00029-of-00036.safetensors", "model.layers.49.self_attn.o_proj.weight": "model-00029-of-00036.safetensors", 
"model.layers.49.self_attn.q_proj.weight": "model-00029-of-00036.safetensors", "model.layers.49.self_attn.v_proj.weight": "model-00029-of-00036.safetensors", "model.layers.5.input_layernorm.weight": "model-00004-of-00036.safetensors", "model.layers.5.post_attention_layernorm.weight": "model-00004-of-00036.safetensors", "model.layers.5.mlp.down_proj.weight": "model-00004-of-00036.safetensors", "model.layers.5.mlp.gate_proj.weight": "model-00004-of-00036.safetensors", "model.layers.5.mlp.up_proj.weight": "model-00004-of-00036.safetensors", "model.layers.5.self_attn.k_proj.weight": "model-00004-of-00036.safetensors", "model.layers.5.self_attn.o_proj.weight": "model-00004-of-00036.safetensors", "model.layers.5.self_attn.q_proj.weight": "model-00004-of-00036.safetensors", "model.layers.5.self_attn.v_proj.weight": "model-00004-of-00036.safetensors", "model.layers.50.input_layernorm.weight": "model-00030-of-00036.safetensors", "model.layers.50.post_attention_layernorm.weight": "model-00030-of-00036.safetensors", "model.layers.50.mlp.down_proj.weight": "model-00030-of-00036.safetensors", "model.layers.50.mlp.gate_proj.weight": "model-00030-of-00036.safetensors", "model.layers.50.mlp.up_proj.weight": "model-00030-of-00036.safetensors", "model.layers.50.self_attn.k_proj.weight": "model-00030-of-00036.safetensors", "model.layers.50.self_attn.o_proj.weight": "model-00030-of-00036.safetensors", "model.layers.50.self_attn.q_proj.weight": "model-00030-of-00036.safetensors", "model.layers.50.self_attn.v_proj.weight": "model-00030-of-00036.safetensors", "model.layers.51.input_layernorm.weight": "model-00031-of-00036.safetensors", "model.layers.51.post_attention_layernorm.weight": "model-00031-of-00036.safetensors", "model.layers.51.mlp.down_proj.weight": "model-00031-of-00036.safetensors", "model.layers.51.mlp.gate_proj.weight": "model-00030-of-00036.safetensors", "model.layers.51.mlp.up_proj.weight": "model-00031-of-00036.safetensors", "model.layers.51.self_attn.k_proj.weight": "model-00030-of-00036.safetensors", "model.layers.51.self_attn.o_proj.weight": "model-00030-of-00036.safetensors", "model.layers.51.self_attn.q_proj.weight": "model-00030-of-00036.safetensors", "model.layers.51.self_attn.v_proj.weight": "model-00030-of-00036.safetensors", "model.layers.52.input_layernorm.weight": "model-00031-of-00036.safetensors", "model.layers.52.post_attention_layernorm.weight": "model-00031-of-00036.safetensors", "model.layers.52.mlp.down_proj.weight": "model-00031-of-00036.safetensors", "model.layers.52.mlp.gate_proj.weight": "model-00031-of-00036.safetensors", "model.layers.52.mlp.up_proj.weight": "model-00031-of-00036.safetensors", "model.layers.52.self_attn.k_proj.weight": "model-00031-of-00036.safetensors", "model.layers.52.self_attn.o_proj.weight": "model-00031-of-00036.safetensors", "model.layers.52.self_attn.q_proj.weight": "model-00031-of-00036.safetensors", "model.layers.52.self_attn.v_proj.weight": "model-00031-of-00036.safetensors", "model.layers.53.input_layernorm.weight": "model-00032-of-00036.safetensors", "model.layers.53.post_attention_layernorm.weight": "model-00032-of-00036.safetensors", "model.layers.53.mlp.down_proj.weight": "model-00032-of-00036.safetensors", "model.layers.53.mlp.gate_proj.weight": "model-00032-of-00036.safetensors", "model.layers.53.mlp.up_proj.weight": "model-00032-of-00036.safetensors", "model.layers.53.self_attn.k_proj.weight": "model-00031-of-00036.safetensors", "model.layers.53.self_attn.o_proj.weight": "model-00031-of-00036.safetensors", 
"model.layers.53.self_attn.q_proj.weight": "model-00031-of-00036.safetensors", "model.layers.53.self_attn.v_proj.weight": "model-00031-of-00036.safetensors", "model.layers.54.input_layernorm.weight": "model-00032-of-00036.safetensors", "model.layers.54.post_attention_layernorm.weight": "model-00032-of-00036.safetensors", "model.layers.54.mlp.down_proj.weight": "model-00032-of-00036.safetensors", "model.layers.54.mlp.gate_proj.weight": "model-00032-of-00036.safetensors", "model.layers.54.mlp.up_proj.weight": "model-00032-of-00036.safetensors", "model.layers.54.self_attn.k_proj.weight": "model-00032-of-00036.safetensors", "model.layers.54.self_attn.o_proj.weight": "model-00032-of-00036.safetensors", "model.layers.54.self_attn.q_proj.weight": "model-00032-of-00036.safetensors", "model.layers.54.self_attn.v_proj.weight": "model-00032-of-00036.safetensors", "model.layers.55.input_layernorm.weight": "model-00033-of-00036.safetensors", "model.layers.55.post_attention_layernorm.weight": "model-00033-of-00036.safetensors", "model.layers.55.mlp.down_proj.weight": "model-00033-of-00036.safetensors", "model.layers.55.mlp.gate_proj.weight": "model-00033-of-00036.safetensors", "model.layers.55.mlp.up_proj.weight": "model-00033-of-00036.safetensors", "model.layers.55.self_attn.k_proj.weight": "model-00033-of-00036.safetensors", "model.layers.55.self_attn.o_proj.weight": "model-00033-of-00036.safetensors", "model.layers.55.self_attn.q_proj.weight": "model-00033-of-00036.safetensors", "model.layers.55.self_attn.v_proj.weight": "model-00033-of-00036.safetensors", "model.layers.56.input_layernorm.weight": "model-00034-of-00036.safetensors", "model.layers.56.post_attention_layernorm.weight": "model-00034-of-00036.safetensors", "model.layers.56.mlp.down_proj.weight": "model-00033-of-00036.safetensors", "model.layers.56.mlp.gate_proj.weight": "model-00033-of-00036.safetensors", "model.layers.56.mlp.up_proj.weight": "model-00034-of-00036.safetensors", "model.layers.56.self_attn.k_proj.weight": "model-00033-of-00036.safetensors", "model.layers.56.self_attn.o_proj.weight": "model-00033-of-00036.safetensors", "model.layers.56.self_attn.q_proj.weight": "model-00033-of-00036.safetensors", "model.layers.56.self_attn.v_proj.weight": "model-00033-of-00036.safetensors", "model.layers.57.input_layernorm.weight": "model-00034-of-00036.safetensors", "model.layers.57.post_attention_layernorm.weight": "model-00034-of-00036.safetensors", "model.layers.57.mlp.down_proj.weight": "model-00034-of-00036.safetensors", "model.layers.57.mlp.gate_proj.weight": "model-00034-of-00036.safetensors", "model.layers.57.mlp.up_proj.weight": "model-00034-of-00036.safetensors", "model.layers.57.self_attn.k_proj.weight": "model-00034-of-00036.safetensors", "model.layers.57.self_attn.o_proj.weight": "model-00034-of-00036.safetensors", "model.layers.57.self_attn.q_proj.weight": "model-00034-of-00036.safetensors", "model.layers.57.self_attn.v_proj.weight": "model-00034-of-00036.safetensors", "model.layers.58.input_layernorm.weight": "model-00035-of-00036.safetensors", "model.layers.58.post_attention_layernorm.weight": "model-00035-of-00036.safetensors", "model.layers.58.mlp.down_proj.weight": "model-00035-of-00036.safetensors", "model.layers.58.mlp.gate_proj.weight": "model-00034-of-00036.safetensors", "model.layers.58.mlp.up_proj.weight": "model-00035-of-00036.safetensors", "model.layers.58.self_attn.k_proj.weight": "model-00034-of-00036.safetensors", "model.layers.58.self_attn.o_proj.weight": "model-00034-of-00036.safetensors", 
"model.layers.58.self_attn.q_proj.weight": "model-00034-of-00036.safetensors", "model.layers.58.self_attn.v_proj.weight": "model-00034-of-00036.safetensors", "model.layers.59.input_layernorm.weight": "model-00035-of-00036.safetensors", "model.layers.59.post_attention_layernorm.weight": "model-00035-of-00036.safetensors", "model.layers.59.mlp.down_proj.weight": "model-00035-of-00036.safetensors", "model.layers.59.mlp.gate_proj.weight": "model-00035-of-00036.safetensors", "model.layers.59.mlp.up_proj.weight": "model-00035-of-00036.safetensors", "model.layers.59.self_attn.k_proj.weight": "model-00035-of-00036.safetensors", "model.layers.59.self_attn.o_proj.weight": "model-00035-of-00036.safetensors", "model.layers.59.self_attn.q_proj.weight": "model-00035-of-00036.safetensors", "model.layers.59.self_attn.v_proj.weight": "model-00035-of-00036.safetensors", "model.layers.6.input_layernorm.weight": "model-00005-of-00036.safetensors", "model.layers.6.post_attention_layernorm.weight": "model-00005-of-00036.safetensors", "model.layers.6.mlp.down_proj.weight": "model-00005-of-00036.safetensors", "model.layers.6.mlp.gate_proj.weight": "model-00005-of-00036.safetensors", "model.layers.6.mlp.up_proj.weight": "model-00005-of-00036.safetensors", "model.layers.6.self_attn.k_proj.weight": "model-00005-of-00036.safetensors", "model.layers.6.self_attn.o_proj.weight": "model-00005-of-00036.safetensors", "model.layers.6.self_attn.q_proj.weight": "model-00005-of-00036.safetensors", "model.layers.6.self_attn.v_proj.weight": "model-00005-of-00036.safetensors", "model.layers.7.input_layernorm.weight": "model-00006-of-00036.safetensors", "model.layers.7.post_attention_layernorm.weight": "model-00006-of-00036.safetensors", "model.layers.7.mlp.down_proj.weight": "model-00005-of-00036.safetensors", "model.layers.7.mlp.gate_proj.weight": "model-00005-of-00036.safetensors", "model.layers.7.mlp.up_proj.weight": "model-00006-of-00036.safetensors", "model.layers.7.self_attn.k_proj.weight": "model-00005-of-00036.safetensors", "model.layers.7.self_attn.o_proj.weight": "model-00005-of-00036.safetensors", "model.layers.7.self_attn.q_proj.weight": "model-00005-of-00036.safetensors", "model.layers.7.self_attn.v_proj.weight": "model-00005-of-00036.safetensors", "model.layers.8.input_layernorm.weight": "model-00006-of-00036.safetensors", "model.layers.8.post_attention_layernorm.weight": "model-00006-of-00036.safetensors", "model.layers.8.mlp.down_proj.weight": "model-00006-of-00036.safetensors", "model.layers.8.mlp.gate_proj.weight": "model-00006-of-00036.safetensors", "model.layers.8.mlp.up_proj.weight": "model-00006-of-00036.safetensors", "model.layers.8.self_attn.k_proj.weight": "model-00006-of-00036.safetensors", "model.layers.8.self_attn.o_proj.weight": "model-00006-of-00036.safetensors", "model.layers.8.self_attn.q_proj.weight": "model-00006-of-00036.safetensors", "model.layers.8.self_attn.v_proj.weight": "model-00006-of-00036.safetensors", "model.layers.9.input_layernorm.weight": "model-00007-of-00036.safetensors", "model.layers.9.post_attention_layernorm.weight": "model-00007-of-00036.safetensors", "model.layers.9.mlp.down_proj.weight": "model-00007-of-00036.safetensors", "model.layers.9.mlp.gate_proj.weight": "model-00006-of-00036.safetensors", "model.layers.9.mlp.up_proj.weight": "model-00007-of-00036.safetensors", "model.layers.9.self_attn.k_proj.weight": "model-00006-of-00036.safetensors", "model.layers.9.self_attn.o_proj.weight": "model-00006-of-00036.safetensors", "model.layers.9.self_attn.q_proj.weight": 
"model-00006-of-00036.safetensors", "model.layers.9.self_attn.v_proj.weight": "model-00006-of-00036.safetensors", "model.norm.weight": "model-00035-of-00036.safetensors"}}
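The index maps every tensor name to its shard. A small sketch (assuming the index file has been downloaded next to the script) that checks the map covers all 60 layers with the expected per-layer tensors:

```python
import json

EXPECTED = [
    "input_layernorm.weight", "post_attention_layernorm.weight",
    "mlp.down_proj.weight", "mlp.gate_proj.weight", "mlp.up_proj.weight",
    "self_attn.q_proj.weight", "self_attn.k_proj.weight",
    "self_attn.v_proj.weight", "self_attn.o_proj.weight",
]

with open("model.safetensors.index.json") as f:
    weight_map = json.load(f)["weight_map"]

for layer in range(60):                      # num_hidden_layers per config.json
    for suffix in EXPECTED:
        assert f"model.layers.{layer}.{suffix}" in weight_map, (layer, suffix)
for name in ("model.embed_tokens.weight", "model.norm.weight", "lm_head.weight"):
    assert name in weight_map

print(f"{len(weight_map)} tensors mapped across "
      f"{len(set(weight_map.values()))} shard files")  # 543 tensors, 36 shards
```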
special_tokens_map.json
ADDED
@@ -0,0 +1,30 @@
{
  "bos_token": {
    "content": "<|startoftext|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "<|endoftext|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
tokenizer.json
ADDED
The diff for this file is too large to render.
tokenizer.model
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:386c49cf943d71aa110361135338c50e38beeff0a66593480421f37b319e1a39
size 1033105
tokenizer_config.json
ADDED
@@ -0,0 +1,43 @@
{
  "add_bos_token": false,
  "add_eos_token": false,
  "added_tokens_decoder": {
    "0": {
      "content": "<unk>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "1": {
      "content": "<|startoftext|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "2": {
      "content": "<|endoftext|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "bos_token": "<|startoftext|>",
  "clean_up_tokenization_spaces": false,
  "eos_token": "<|endoftext|>",
  "legacy": false,
  "model_max_length": 4096,
  "pad_token": "<unk>",
  "padding_side": "right",
  "sp_model_kwargs": {},
  "spaces_between_special_tokens": false,
  "tokenizer_class": "LlamaTokenizer",
  "truncation_side": "right",
  "unk_token": "<unk>",
  "use_default_system_prompt": false
}
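One practical consequence of this config: `add_bos_token` is false, so `<|startoftext|>` is not prepended automatically at encode time. A quick sketch to confirm (assuming the tokenizer files sit in the current directory):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained(".")  # directory containing the files above
ids = tok("hello").input_ids
# add_bos_token is false, so no <|startoftext|> (id 1) should be prepended.
assert tok.bos_token_id == 1
assert ids[0] != tok.bos_token_id
```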