Undi95/Lumimaid-Magnum-v4-12B



Undi95 committed on Dec 22, 2024

Commit b8c5f69 (verified) · 1 Parent(s): f256814

Update README.md

Files changed (1)

README.md +7 -35


@@ -1,50 +1,22 @@
 ---
 base_model:
-- anthracite-org/magnum-v4-12b
 - NeverSleep/Lumimaid-v0.2-12B
 - Undi95/LocalC-12B-e2.0
-- mistralai/Mistral-Nemo-Instruct-2407
 library_name: transformers
 tags:
 - mergekit
 - merge
 ---
-# out
-This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-## Merge Details
-### Merge Method
-This model was merged using the [DELLA](https://arxiv.org/abs/2406.11617) merge method using [mistralai/Mistral-Nemo-Instruct-2407](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407) as a base.
-### Models Merged
-The following models were included in the merge:
-* [anthracite-org/magnum-v4-12b](https://huggingface.co/anthracite-org/magnum-v4-12b)
-* [NeverSleep/Lumimaid-v0.2-12B](https://huggingface.co/NeverSleep/Lumimaid-v0.2-12B)
-* [Undi95/LocalC-12B-e2.0](https://huggingface.co/Undi95/LocalC-12B-e2.0)
-### Configuration
-The following YAML configuration was used to produce this model:
-```yaml
-base_model: mistralai/Mistral-Nemo-Instruct-2407
-merge_method: della
-dtype: bfloat16
-models:
-  - model: anthracite-org/magnum-v4-12b
-    parameters:
-      weight: 1.0
-  - model: Undi95/LocalC-12B-e2.0
-    parameters:
-      weight: 1.0
-  - model: NeverSleep/Lumimaid-v0.2-12B
-    parameters:
-      weight: 1.0
-  - model: mistralai/Mistral-Nemo-Instruct-2407
-    parameters:
-      weight: 1.0
 ```

 ---
 base_model:
+- mistralai/Mistral-Nemo-Instruct-2407
 - NeverSleep/Lumimaid-v0.2-12B
 - Undi95/LocalC-12B-e2.0
+- anthracite-org/magnum-v4-12b
 library_name: transformers
 tags:
 - mergekit
 - merge
 ---
+Merge of Lumimaid and Magnum as requested by some. <b>UPDATE : Magnum v4 used in this merge as asked [here](https://huggingface.co/Undi95/Lumimaid-Magnum-12B/discussions/4)</b>
+I used the DELLA merge method in mergekit and added a finetune of Nemo only on Claude input, trained on 16k ctx, in the mix.
+# Prompt template: Mistral
 ```
+<s>[INST] {input} [/INST] {output}</s>
+```
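
The prompt template added to the README in this commit can be filled in programmatically. The following is a minimal Python sketch, not code from the repository: the helper name `format_mistral_prompt` and its parameters are hypothetical and exist only to illustrate the template string.

```python
from typing import Optional


def format_mistral_prompt(user_input: str, output: Optional[str] = None) -> str:
    """Fill the README's template: <s>[INST] {input} [/INST] {output}</s>.

    Hypothetical helper for illustration; it is not part of the model repo.
    """
    # Opening part of the template, shared by inference and training-style text.
    prompt = f"<s>[INST] {user_input} [/INST]"
    if output is None:
        # Inference: stop after [/INST] so the model generates the reply itself.
        return prompt
    # Full turn: append the reply and the end-of-sequence marker.
    return f"{prompt} {output}</s>"


if __name__ == "__main__":
    print(format_mistral_prompt("Write a haiku about merging models."))
```

If the string is passed through a tokenizer that already prepends a beginning-of-sequence token, the literal `<s>` may end up duplicated; in that case it is probably safer to drop it from the string or disable automatic special tokens.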