wolfram committed on
Commit 58ffed3
• 1 Parent(s): c92c12b

Update README.md

Files changed (1):
  1. README.md +72 -5
README.md CHANGED
@@ -1,17 +1,53 @@
---
base_model:
- 152334H/miqu-1-70b-sf
library_name: transformers
tags:
- mergekit
- merge
-
---
- # wolfram_miqu-1-103b

- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details
### Merge Method

This model was merged using the passthrough merge method.
@@ -19,12 +55,15 @@ This model was merged using the passthrough merge method.
### Models Merged

The following models were included in the merge:
- * [152334H/miqu-1-70b-sf](https://huggingface.co/152334H/miqu-1-70b-sf)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
dtype: float16
merge_method: passthrough
@@ -38,5 +77,33 @@ slices:
- sources:
  - layer_range: [40, 80]
    model: 152334H/miqu-1-70b-sf
-
```
---
base_model:
- 152334H/miqu-1-70b-sf
+ language:
+ - en
+ - de
+ - fr
+ - es
+ - it
library_name: transformers
tags:
- mergekit
- merge
---
+ # miqu-1-103b
+
+ ![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/6303ca537373aacccd85d8a7/LxO9j7OykuabKLYQHIodG.jpeg)
+
+ - HF: [wolfram/miqu-1-103b](https://huggingface.co/wolfram/miqu-1-103b)
+ - GGUF: mradermacher's [static quants](https://huggingface.co/mradermacher/miqu-1-103b-GGUF) | [weighted/imatrix quants](https://huggingface.co/mradermacher/miqu-1-103b-i1-GGUF)
+ - EXL2: wolfram/miqu-1-103b-5.0bpw-h6-exl2 | LoneStriker's [2.4bpw](https://huggingface.co/LoneStriker/miqu-1-103b-2.4bpw-h6-exl2) | [3.0bpw](https://huggingface.co/LoneStriker/miqu-1-103b-3.0bpw-h6-exl2) | [3.5bpw](https://huggingface.co/LoneStriker/miqu-1-103b-3.5bpw-h6-exl2)
+
+ This is a 103B frankenmerge of [miqu-1-70b](https://huggingface.co/miqudev/miqu-1-70b), created by interleaving layers of [miqu-1-70b-sf](https://huggingface.co/152334H/miqu-1-70b-sf) with itself using [mergekit](https://github.com/cg123/mergekit).
+
+ Inspired by [Midnight-Rose-103B-v2.0.3](https://huggingface.co/sophosympatheia/Midnight-Rose-103B-v2.0.3).
+
+ Thanks for the support, [CopilotKit](https://github.com/CopilotKit/CopilotKit) - the open-source platform for building in-app AI Copilots into any product, with any LLM. Check out their GitHub!
+
+ Thanks for the quants, [Michael Radermacher](https://huggingface.co/mradermacher) and [Lone Striker](https://huggingface.co/LoneStriker)!
+
+ Also available:
+
+ - [miqu-1-120b](https://huggingface.co/wolfram/miqu-1-120b) – Miqu's older, bigger twin sister; the same Miqu, inflated to 120B.
+ - [miquliz-120b-v2.0](https://huggingface.co/wolfram/miquliz-120b-v2.0) – Miqu's younger, fresher sister; a new and improved Goliath-like merge of Miqu and lzlv.
+
+ ## Model Details
+
+ - Max Context: 32768 tokens
+ - Layers: 120
+
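The 120-layer depth follows directly from how a passthrough merge concatenates layer ranges. As a minimal sketch: the slice ranges below are an assumption for illustration, using the Midnight-Rose-103B-style interleave of an 80-layer base model; only the final `[40, 80]` slice is visible in the configuration shown on this card.

```python
# Hypothetical slice ranges for a Midnight-Rose-style 103B passthrough
# interleave of an 80-layer base model; only [40, 80] is confirmed
# by the configuration on this card.
slices = [(0, 40), (20, 60), (40, 80)]

# A passthrough merge simply concatenates the chosen layer ranges,
# so the merged depth is the sum of the range lengths.
total_layers = sum(end - start for start, end in slices)
print(total_layers)  # 120, matching the "Layers: 120" model detail
```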
+ ### Prompt template: Mistral
+
+ ```
+ <s>[INST] {prompt} [/INST]
+ ```
+
+ See also: [🐺🐦‍⬛ LLM Prompt Format Comparison/Test: Mixtral 8x7B Instruct with **17** different instruct templates : LocalLLaMA](https://www.reddit.com/r/LocalLLaMA/comments/18ljvxb/llm_prompt_format_comparisontest_mixtral_8x7b/)
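As a minimal sketch, the template can be filled in with plain string formatting. Note that `<s>` is the BOS token, which most tokenizers prepend automatically; it is included here only to mirror the template as written.

```python
def mistral_prompt(prompt: str) -> str:
    """Wrap a user message in the Mistral instruct template shown above."""
    return f"<s>[INST] {prompt} [/INST]"

print(mistral_prompt("Hello!"))  # <s>[INST] Hello! [/INST]
```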

## Merge Details
+
### Merge Method

This model was merged using the passthrough merge method.

### Models Merged

The following models were included in the merge:
+
+ - [152334H/miqu-1-70b-sf](https://huggingface.co/152334H/miqu-1-70b-sf)

### Configuration

The following YAML configuration was used to produce this model:

+ <details><summary>mergekit_config.yml</summary>
+
```yaml
dtype: float16
merge_method: passthrough

- sources:
  - layer_range: [40, 80]
    model: 152334H/miqu-1-70b-sf
```
+
+ </details>
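A config in this shape can be sanity-checked before merging. The sketch below represents it as a plain dict rather than parsed YAML to stay dependency-free; the first slice is hypothetical, since the diff truncates the middle of the configuration, and only the `[40, 80]` slice is confirmed.

```python
# A passthrough config in the shape of mergekit_config.yml above,
# represented as a plain dict. The [0, 40] slice is hypothetical;
# only the final [40, 80] slice is confirmed by the diff.
config = {
    "dtype": "float16",
    "merge_method": "passthrough",
    "slices": [
        {"sources": [{"model": "152334H/miqu-1-70b-sf",
                      "layer_range": [0, 40]}]},
        {"sources": [{"model": "152334H/miqu-1-70b-sf",
                      "layer_range": [40, 80]}]},
    ],
}

# Every layer_range must stay within the 80 layers of the 70B base model.
for s in config["slices"]:
    for src in s["sources"]:
        lo, hi = src["layer_range"]
        assert 0 <= lo < hi <= 80, f"bad range: {src['layer_range']}"
print("config ok")
```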
+
+ ## Credits & Special Thanks
+
+ - original (unreleased) model: [mistralai (Mistral AI_)](https://huggingface.co/mistralai)
+   - ⭐⭐⭐ **[Use their newer, better, official models here!](https://console.mistral.ai/)** ⭐⭐⭐
+ - leaked model: [miqudev/miqu-1-70b](https://huggingface.co/miqudev/miqu-1-70b)
+ - f16 model: [152334H/miqu-1-70b-sf](https://huggingface.co/152334H/miqu-1-70b-sf)
+ - mergekit: [arcee-ai/mergekit: Tools for merging pretrained large language models.](https://github.com/arcee-ai/mergekit)
+ - mergekit_config.yml: [sophosympatheia/Midnight-Rose-103B-v2.0.3](https://huggingface.co/sophosympatheia/Midnight-Rose-103B-v2.0.3)
+
+ ### Support
+
+ - [My Ko-fi page](https://ko-fi.com/wolframravenwolf) if you'd like to tip me to say thanks or to request specific models to be tested or merged with priority. Also consider supporting your favorite model creators, quantizers, or frontend/backend devs if you can afford to do so. They deserve it!
+
+ ## Disclaimer
+
+ *This model contains leaked weights, and due to its content it should not be used by anyone.* 😜
+
+ But seriously:
+
+ ### License
+
+ **What I *know*:** [Weights produced by a machine are not copyrightable](https://www.reddit.com/r/LocalLLaMA/comments/1amc080/psa_if_you_use_miqu_or_a_derivative_please_keep/kpmamte/), so there is no copyright owner who could grant permission or a license to use the files, or restrict their usage, once you have acquired them.
+
+ ### Ethics
+
+ **What I *believe*:** All generative AI, including LLMs, only exists because it is trained mostly on human data (both public-domain and copyright-protected, most likely acquired without express consent) and possibly on synthetic data (which is ultimately derived from human data, too). It is only fair if something that is based on everyone's knowledge and data is also freely accessible to the public, the actual creators of the underlying content. Fair use, fair AI!