brooketh committed
Commit bb42e39
1 Parent(s): 4a2ec14

Upload 9 files

.gitattributes CHANGED
@@ -33,3 +33,8 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ Senku-70B-Full.imatrix filter=lfs diff=lfs merge=lfs -text
+ Senku-70B-Full.IQ2_XS.gguf filter=lfs diff=lfs merge=lfs -text
+ Senku-70B-Full.IQ2_XXS.gguf filter=lfs diff=lfs merge=lfs -text
+ Senku-70B-Full.IQ3_XS.gguf filter=lfs diff=lfs merge=lfs -text
+ Senku-70B-Full.IQ3_XXS.gguf filter=lfs diff=lfs merge=lfs -text
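The five added patterns route the quantized model binaries and the importance matrix through Git LFS, so the repository stores small pointer files (shown further down in this commit) instead of multi-gigabyte blobs. A minimal sketch of the selection logic, using Python's `fnmatch` as a stand-in for gitattributes glob matching (adequate for simple patterns like these, not the full gitignore-style rules):

```python
from fnmatch import fnmatch

# Patterns added to .gitattributes in this commit, plus one pre-existing pattern.
LFS_PATTERNS = [
    "*.zip",
    "Senku-70B-Full.imatrix",
    "Senku-70B-Full.IQ2_XS.gguf",
    "Senku-70B-Full.IQ2_XXS.gguf",
    "Senku-70B-Full.IQ3_XS.gguf",
    "Senku-70B-Full.IQ3_XXS.gguf",
]

def stored_via_lfs(filename: str) -> bool:
    """True if Git would hand this file to LFS under the patterns above."""
    return any(fnmatch(filename, pattern) for pattern in LFS_PATTERNS)

for name in ("Senku-70B-Full.IQ2_XS.gguf", "README.md", "faraday-logo.png"):
    print(f"{name}: {'LFS pointer' if stored_via_lfs(name) else 'stored in git'}")
```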
Faraday Model Repository Banner.png ADDED
README.md ADDED
@@ -0,0 +1,45 @@
+ ---
+ base_model: ShinojiResearch/Senku-70B-Full
+ license: other
+ language:
+ - en
+ library_name: transformers
+ pipeline_tag: text-generation
+ quantized_by: brooketh
+ tags:
+ - roleplay
+ - text-generation-inference
+ ---
+ <img src="Faraday Model Repository Banner.png" alt="Faraday.dev" style="height: 90px; min-width: 32px; display: block; margin: auto;">
+
+ **<p style="text-align: center;">The official library of GGUF format models for use in the local AI chat app, Faraday.dev.</p>**
+
+ <p style="text-align: center;"><a href="https://faraday.dev/">Download Faraday here to get started.</a></p>
+
+ <p style="text-align: center;"><a href="https://www.reddit.com/r/LLM_Quants/">Request additional models at r/LLM_Quants.</a></p>
+
+ ***
+ # Senku 70B Full
+ - **Creator:** [ShinojiResearch](https://huggingface.co/ShinojiResearch/)
+ - **Original:** [Senku 70B Full](https://huggingface.co/ShinojiResearch/Senku-70B-Full)
+ - **Date Created:** 2024-02-06
+ - **Trained Context:** 8192 tokens
+ - **Description:** Finetune of Mistral-70B on the SlimOrca dataset. Exceptional at roleplay, with the highest EQ-Bench scores to date. Recommended for use with the ChatML prompt format.
+
+ ## What is a GGUF?
+ GGUF is a large language model (LLM) format that allows a model to be split between the CPU and GPU. GGUFs are compatible with applications based on llama.cpp, such as Faraday.dev. Where other model formats require higher-end GPUs with ample VRAM, GGUFs run efficiently on a much wider range of hardware.
+ GGUF models are quantized to reduce resource usage, at the cost of reduced coherence at lower quantization levels. Quantization reduces the precision of the model weights by lowering the number of bits used to store each weight; a toy sketch of the idea follows this README.
+
+ ***
+ <img src="faraday-logo.png" alt="Faraday.dev" style="height: 75px; min-width: 32px; display: block;">
+
+ ## Faraday.dev
+ - Free, local AI chat application.
+ - One-click installation on Mac and PC.
+ - Automatic GPU use for maximum speed.
+ - Built-in model manager.
+ - High-quality character hub.
+ - Zero-config desktop-to-mobile tethering.
+ Faraday makes it easy to start chatting with AI using your own characters or one of the many found in the built-in character hub. The model manager helps you find the latest and greatest models without worrying about whether they are in the correct format. Faraday supports advanced features such as lorebooks, author's notes, text formatting, custom context sizes, sampler settings, grammars, local TTS, cloud inference, and tethering, all implemented in a way that is straightforward and reliable.
+ **Join us on [Discord](https://discord.gg/SyNN2vC9tQ)**
+ ***
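The toy sketch referenced in the README's quantization note: plain symmetric round-to-nearest quantization in pure Python. This is not llama.cpp's actual IQ2_XS/IQ3_XS scheme (those use block-wise scales and codebooks); it only illustrates why fewer bits per weight means less precision.

```python
def quantize_dequantize(weights, bits):
    """Quantize floats to signed `bits`-bit levels with one shared scale,
    then map back to floats, to show the precision lost on the round trip."""
    levels = 2 ** (bits - 1) - 1           # 127 for 8-bit, 7 for 4-bit, 1 for 2-bit
    scale = max(abs(w) for w in weights) / levels
    return [round(w / scale) * scale for w in weights]

weights = [0.12, -0.57, 0.33, 0.91, -0.04]
for bits in (8, 4, 2):
    print(bits, [round(w, 3) for w in quantize_dequantize(weights, bits)])
# The output drifts further from the originals as bits shrink, which is the
# coherence tradeoff the README describes.
```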
Senku-70B-Full.IQ2_XS.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5e23a3158cef32db850ad6ea05f2e82ef2621b5e9c4fb48bf4f34105545ecfcf
+ size 20334163520
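What the commit stores for each .gguf is a Git LFS pointer like the one above: `version` identifies the pointer spec, `oid` is the SHA-256 digest of the real payload, and `size` is its length in bytes (20334163520 B ≈ 18.94 GiB, matching the model size main.log reports below). A minimal verification sketch, standard library only, using the digest from this pointer:

```python
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash the file in 1 MiB chunks so a ~19 GiB GGUF never sits in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# oid from the LFS pointer above
EXPECTED = "5e23a3158cef32db850ad6ea05f2e82ef2621b5e9c4fb48bf4f34105545ecfcf"
actual = sha256_of("Senku-70B-Full.IQ2_XS.gguf")
print("OK" if actual == EXPECTED else f"MISMATCH: {actual}")
```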
Senku-70B-Full.IQ2_XXS.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:eaad5135beebc2e8c9fc7f98010e79dc9b9c297c656d71ca1c3f683647213f00
+ size 18289440320
Senku-70B-Full.IQ3_XS.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5adabd1dba5119269b496098658f35088b018c055ef8d0d570632752fe8bbd3f
+ size 28314973760
Senku-70B-Full.IQ3_XXS.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0464fd90a3c886979a553a2ed812848f9652f5f4d5c8ad73dcef60c6ea104a2b
+ size 26581464640
Senku-70B-Full.imatrix ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e722beb9027460e50a1baf264c16a5d4f9c43c6be2e02d2df152c2e6b0873c23
+ size 24922254
faraday-logo.png ADDED
main.log ADDED
@@ -0,0 +1,86 @@
+ [1712111096] Log start
+ [1712111096] Cmd: c:\Apps\Toaster\bin\main.exe -m Senku-70B-Full.IQ2_XS.gguf
+ [1712111096] main: build = 2589 (bdf85d09)
+ [1712111096] main: built with MSVC 19.39.33523.0 for x64
+ [1712111096] main: seed = 1712111096
+ [1712111096] main: llama backend init
+ [1712111096] main: load the model and apply lora adapter, if any
+ [1712111096] llama_model_loader: loaded meta data with 24 key-value pairs and 723 tensors from Senku-70B-Full.IQ2_XS.gguf (version GGUF V3 (latest))
+ [1712111096] llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
+ [1712111096] llama_model_loader: - kv 0: general.architecture str = llama
+ [1712111096] llama_model_loader: - kv 1: general.name str = models
+ [1712111096] llama_model_loader: - kv 2: llama.vocab_size u32 = 32000
+ [1712111096] llama_model_loader: - kv 3: llama.context_length u32 = 32764
+ [1712111096] llama_model_loader: - kv 4: llama.embedding_length u32 = 8192
+ [1712111096] llama_model_loader: - kv 5: llama.block_count u32 = 80
+ [1712111096] llama_model_loader: - kv 6: llama.feed_forward_length u32 = 28672
+ [1712111096] llama_model_loader: - kv 7: llama.rope.dimension_count u32 = 128
+ [1712111096] llama_model_loader: - kv 8: llama.attention.head_count u32 = 64
+ [1712111096] llama_model_loader: - kv 9: llama.attention.head_count_kv u32 = 8
+ [1712111096] llama_model_loader: - kv 10: llama.attention.layer_norm_rms_epsilon f32 = 0.000010
+ [1712111096] llama_model_loader: - kv 11: llama.rope.freq_base f32 = 1000000.000000
+ [1712111096] llama_model_loader: - kv 12: general.file_type u32 = 20
+ [1712111096] llama_model_loader: - kv 13: tokenizer.ggml.model str = llama
+ [1712111096] llama_model_loader: - kv 14: tokenizer.ggml.tokens arr[str,32000] = ["<unk>", "<s>", "</s>", "<0x00>", "<...
+ [1712111096] llama_model_loader: - kv 15: tokenizer.ggml.scores arr[f32,32000] = [0.000000, 0.000000, 0.000000, 0.0000...
+ [1712111096] llama_model_loader: - kv 16: tokenizer.ggml.token_type arr[i32,32000] = [2, 3, 3, 6, 6, 6, 6, 6, 6, 6, 6, 6, ...
+ [1712111096] llama_model_loader: - kv 17: tokenizer.ggml.bos_token_id u32 = 1
+ [1712111096] llama_model_loader: - kv 18: tokenizer.ggml.eos_token_id u32 = 2
+ [1712111096] llama_model_loader: - kv 19: tokenizer.ggml.padding_token_id u32 = 0
+ [1712111096] llama_model_loader: - kv 20: tokenizer.ggml.add_bos_token bool = true
+ [1712111096] llama_model_loader: - kv 21: tokenizer.ggml.add_eos_token bool = false
+ [1712111096] llama_model_loader: - kv 22: tokenizer.chat_template str = {{ bos_token }}{% for message in mess...
+ [1712111096] llama_model_loader: - kv 23: general.quantization_version u32 = 2
+ [1712111096] llama_model_loader: - type f32: 161 tensors
+ [1712111096] llama_model_loader: - type q2_K: 11 tensors
+ [1712111096] llama_model_loader: - type q4_K: 80 tensors
+ [1712111096] llama_model_loader: - type q5_K: 1 tensors
+ [1712111096] llama_model_loader: - type iq2_xs: 470 tensors
+ [1712111097] llm_load_vocab: special tokens definition check successful ( 259/32000 ).
+ [1712111097] llm_load_print_meta: format = GGUF V3 (latest)
+ [1712111097] llm_load_print_meta: arch = llama
+ [1712111097] llm_load_print_meta: vocab type = SPM
+ [1712111097] llm_load_print_meta: n_vocab = 32000
+ [1712111097] llm_load_print_meta: n_merges = 0
+ [1712111097] llm_load_print_meta: n_ctx_train = 32764
+ [1712111097] llm_load_print_meta: n_embd = 8192
+ [1712111097] llm_load_print_meta: n_head = 64
+ [1712111097] llm_load_print_meta: n_head_kv = 8
+ [1712111097] llm_load_print_meta: n_layer = 80
+ [1712111097] llm_load_print_meta: n_rot = 128
+ [1712111097] llm_load_print_meta: n_embd_head_k = 128
+ [1712111097] llm_load_print_meta: n_embd_head_v = 128
+ [1712111097] llm_load_print_meta: n_gqa = 8
+ [1712111097] llm_load_print_meta: n_embd_k_gqa = 1024
+ [1712111097] llm_load_print_meta: n_embd_v_gqa = 1024
+ [1712111097] llm_load_print_meta: f_norm_eps = 0.0e+00
+ [1712111097] llm_load_print_meta: f_norm_rms_eps = 1.0e-05
+ [1712111097] llm_load_print_meta: f_clamp_kqv = 0.0e+00
+ [1712111097] llm_load_print_meta: f_max_alibi_bias = 0.0e+00
+ [1712111097] llm_load_print_meta: f_logit_scale = 0.0e+00
+ [1712111097] llm_load_print_meta: n_ff = 28672
+ [1712111097] llm_load_print_meta: n_expert = 0
+ [1712111097] llm_load_print_meta: n_expert_used = 0
+ [1712111097] llm_load_print_meta: causal attn = 1
+ [1712111097] llm_load_print_meta: pooling type = 0
+ [1712111097] llm_load_print_meta: rope type = 0
+ [1712111097] llm_load_print_meta: rope scaling = linear
+ [1712111097] llm_load_print_meta: freq_base_train = 1000000.0
+ [1712111097] llm_load_print_meta: freq_scale_train = 1
+ [1712111097] llm_load_print_meta: n_yarn_orig_ctx = 32764
+ [1712111097] llm_load_print_meta: rope_finetuned = unknown
+ [1712111097] llm_load_print_meta: ssm_d_conv = 0
+ [1712111097] llm_load_print_meta: ssm_d_inner = 0
+ [1712111097] llm_load_print_meta: ssm_d_state = 0
+ [1712111097] llm_load_print_meta: ssm_dt_rank = 0
+ [1712111097] llm_load_print_meta: model type = 70B
+ [1712111097] llm_load_print_meta: model ftype = IQ2_XS - 2.3125 bpw
+ [1712111097] llm_load_print_meta: model params = 68.98 B
+ [1712111097] llm_load_print_meta: model size = 18.94 GiB (2.36 BPW)
+ [1712111097] llm_load_print_meta: general.name = models
+ [1712111097] llm_load_print_meta: BOS token = 1 '<s>'
+ [1712111097] llm_load_print_meta: EOS token = 2 '</s>'
+ [1712111097] llm_load_print_meta: UNK token = 0 '<unk>'
+ [1712111097] llm_load_print_meta: PAD token = 0 '<unk>'
+ [1712111097] llm_load_print_meta: LF token = 13 '<0x0A>'
+ [1712111097] llm_load_tensors: ggml ctx size = 0.28 MiB
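The header figures in this log are easy to cross-check without llama.cpp. A GGUF v2/v3 file begins with a fixed little-endian header: the 4-byte magic `GGUF`, a uint32 version, a uint64 tensor count, and a uint64 key-value count; for this file the loader reports version 3, 723 tensors, and 24 key-value pairs. A minimal reader sketch (standard library only; the filename is the IQ2_XS quant from this repo):

```python
import struct

def read_gguf_header(path: str) -> tuple[int, int, int]:
    """Read the fixed GGUF v2/v3 header: version, tensor count, KV count."""
    with open(path, "rb") as f:
        if f.read(4) != b"GGUF":
            raise ValueError("not a GGUF file")
        # little-endian: uint32 version, uint64 n_tensors, uint64 n_kv
        return struct.unpack("<IQQ", f.read(20))

version, n_tensors, n_kv = read_gguf_header("Senku-70B-Full.IQ2_XS.gguf")
print(f"GGUF v{version}: {n_tensors} tensors, {n_kv} key-value pairs")
# Expected from the log above: GGUF v3: 723 tensors, 24 key-value pairs
```

The size line is also self-consistent: 68.98 B parameters at the effective 2.36 bits per weight is 68.98e9 × 2.36 / 8 ≈ 20.3e9 bytes ≈ 18.9 GiB, matching the reported 18.94 GiB and the 20334163520-byte LFS pointer above. The effective rate sits above the nominal 2.3125 bpw of IQ2_XS because, per the tensor-type counts in the log, the 161 f32, 80 q4_K, and one q5_K tensors are kept at higher precision.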