ToastyPigeon committed
Commit bb97674
1 Parent(s): cd2cd22

Update README.md

Files changed (1)
  1. README.md +56 -15
README.md CHANGED
@@ -3,42 +3,83 @@ base_model: []
  tags:
  - mergekit
  - merge
-
  ---
- # Psycet-V2

  This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

  ## Merge Details
  ### Merge Method

- This model was merged using the [linear](https://arxiv.org/abs/2203.05482) merge method.

  ### Models Merged

  The following models were included in the merge:
- * E:\ModelMerge\merges\Psycet-V2\Psycet
- * E:\ModelMerge\merges\Psycet-V2\Psycet-Reverse

  ### Configuration

  The following YAML configuration was used to produce this model:

  ```yaml
  dtype: float16
- merge_method: linear
  slices:
- - sources:
-   - layer_range: [0, 62]
-     model:
-       model:
-         path: E:\ModelMerge\merges\Psycet-V2\Psycet
      parameters:
        weight: 0.5
-   - layer_range: [0, 62]
-     model:
-       model:
-         path: E:\ModelMerge\merges\Psycet-V2\Psycet-Reverse
      parameters:
        weight: 0.5
  ```
 
  tags:
  - mergekit
  - merge
  ---
+ # Psyonic-Cetacean-20B-V2

  This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

  ## Merge Details
  ### Merge Method

+ This model was merged using the [linear](https://arxiv.org/abs/2203.05482) merge method on two stack-merged models.
+
+ The first is [jebcarter/psyonic-cetacean-20B](https://huggingface.co/jebcarter/psyonic-cetacean-20B)
+ (Orca first, reproduced so I didn't have to download that model on top of the components).
+ The second is the same recipe with the models reversed.
+
+ Since [jebcarter](https://huggingface.co/jebcarter) suggested this recipe, credit goes to him.
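
For intuition, the final linear step is just a weighted average of corresponding tensors from the two stacked models. Below is a minimal sketch of a 50/50 linear merge, assuming two checkpoints with identical architectures and parameter names; it is illustrative only, not mergekit's actual implementation.

```python
import torch

def linear_merge(state_dict_a, state_dict_b, weight_a=0.5, weight_b=0.5):
    """Tensor-by-tensor weighted average of two aligned state dicts."""
    merged = {}
    for name, tensor_a in state_dict_a.items():
        tensor_b = state_dict_b[name]  # assumes identical keys and shapes
        merged[name] = weight_a * tensor_a + weight_b * tensor_b
    return merged

# Hypothetical usage with the two stack-merged halves described above:
# merged = linear_merge(torch.load("Psycet/model.bin"), torch.load("Psycet-Reverse/model.bin"))
```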

  ### Models Merged

  The following models were included in the merge:
+ * microsoft/Orca-2-13b
+ * KoboldAI/LLaMA2-13B-Psyfighter2

  ### Configuration

  The following YAML configuration was used to produce this model:

  ```yaml
+ models:
+   - model: microsoft/Orca-2-13b
+     parameters:
+       weight: 1.0
+ merge_method: task_arithmetic
+ base_model: TheBloke/Llama-2-13B-fp16
  dtype: float16
+ name: FlatOrca2
+ ---
  slices:
+ - sources:
+   - model: FlatOrca2
+     layer_range: [0, 16]
+ - sources:
+   - model: KoboldAI/LLaMA2-13B-Psyfighter2
+     layer_range: [8, 24]
+ - sources:
+   - model: FlatOrca2
+     layer_range: [17, 32]
+ - sources:
+   - model: KoboldAI/LLaMA2-13B-Psyfighter2
+     layer_range: [25, 40]
+ merge_method: passthrough
+ dtype: float16
+ name: Psycet
+ ---
+ slices:
+ - sources:
+   - model: KoboldAI/LLaMA2-13B-Psyfighter2
+     layer_range: [0, 16]
+ - sources:
+   - model: FlatOrca2
+     layer_range: [8, 24]
+ - sources:
+   - model: KoboldAI/LLaMA2-13B-Psyfighter2
+     layer_range: [17, 32]
+ - sources:
+   - model: FlatOrca2
+     layer_range: [25, 40]
+ merge_method: passthrough
+ dtype: float16
+ name: Psycet-Reverse
+ ---
+ models:
+   - model: Psycet
      parameters:
        weight: 0.5
+   - model: Psycet-Reverse
      parameters:
        weight: 0.5
+ merge_method: linear
+ dtype: float16
  ```
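
Read top to bottom, the configuration runs three stages separated by `---`: a task_arithmetic pass that effectively reproduces Orca-2 as a task vector applied to the TheBloke/Llama-2-13B-fp16 base (a single model at weight 1.0, named FlatOrca2), two passthrough stacks that alternate slices of FlatOrca2 and Psyfighter2 in opposite orders (Psycet and Psycet-Reverse), and a final 50/50 linear blend of the two stacks. A quick sanity check of the stacked depth, using the slice boundaries copied from the YAML above (illustrative only):

```python
# Slices taken from each parent in the Psycet stack, copied from the config above.
psycet_slices = [
    ("FlatOrca2", 0, 16),
    ("KoboldAI/LLaMA2-13B-Psyfighter2", 8, 24),
    ("FlatOrca2", 17, 32),
    ("KoboldAI/LLaMA2-13B-Psyfighter2", 25, 40),
]

# layer_range boundaries are read as half-open here (end exclusive), which matches
# the [0, 62] range in the previous version of this config.
depth = sum(end - start for _, start, end in psycet_slices)
print(depth)  # 16 + 16 + 15 + 15 = 62 layers, versus 40 in each 13B parent
```

Assuming a mergekit build that accepts multi-document configs with `name:` fields, the whole recipe would be run with something like `mergekit-yaml config.yml ./output-model --cuda`; otherwise each stage can be run as its own config, feeding the named intermediate merges into the next.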