---
base_model:
- grimjim/mistralai-Mistral-Nemo-Base-2407
- grimjim/magnum-consolidatum-v1-12b
- nbeerbower/Mistral-Nemo-Prism-12B
- TheDrummer/Rocinante-12B-v1.1
- grimjim/magnum-twilight-12b
- grimjim/mistralai-Mistral-Nemo-Instruct-2407
library_name: transformers
pipeline_tag: text-generation
tags:
- mergekit
- merge
license: apache-2.0
---
# Magnolia-v3-12B

This repo contains a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

This merge uses task arithmetic to incorporate the influence of two additional models at low weight in order to improve default creative outputs.

Tested at temperature 1.0 with minP 0.01; no other sampler settings were used.

Mistral prompt formats for Nemo should work, although I tested successfully with the following variant:

```
User message prefix: [INST]user
User message suffix: [/INST]
Assistant message prefix: [INST]assistant
Assistant message suffix: [/INST]
```

Testing with a blank system prompt produced acceptable results, though a good system prompt will elicit more of the desired behavior.

## Merge Details

### Merge Method

This model was merged using the [task arithmetic](https://arxiv.org/abs/2212.04089) merge method using [grimjim/mistralai-Mistral-Nemo-Base-2407](https://huggingface.co/grimjim/mistralai-Mistral-Nemo-Base-2407) as a base.

### Models Merged

The following models were included in the merge:

* [grimjim/magnum-consolidatum-v1-12b](https://huggingface.co/grimjim/magnum-consolidatum-v1-12b)
* [nbeerbower/Mistral-Nemo-Prism-12B](https://huggingface.co/nbeerbower/Mistral-Nemo-Prism-12B)
* [TheDrummer/Rocinante-12B-v1.1](https://huggingface.co/TheDrummer/Rocinante-12B-v1.1)
* [grimjim/magnum-twilight-12b](https://huggingface.co/grimjim/magnum-twilight-12b)
* [grimjim/mistralai-Mistral-Nemo-Instruct-2407](https://huggingface.co/grimjim/mistralai-Mistral-Nemo-Instruct-2407)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
base_model: grimjim/mistralai-Mistral-Nemo-Base-2407
dtype: bfloat16
merge_method: task_arithmetic
parameters:
  normalize: true
slices:
- sources:
  - layer_range: [0, 40]
    model: grimjim/mistralai-Mistral-Nemo-Base-2407
  - layer_range: [0, 40]
    model: grimjim/mistralai-Mistral-Nemo-Instruct-2407
    parameters:
      weight: 0.9
  - layer_range: [0, 40]
    model: grimjim/magnum-consolidatum-v1-12b
    parameters:
      weight: 0.1
  - layer_range: [0, 40]
    model: grimjim/magnum-twilight-12b
    parameters:
      weight: 0.001
  - layer_range: [0, 40]
    model: TheDrummer/Rocinante-12B-v1.1
    parameters:
      weight: 0.001
  - layer_range: [0, 40]
    model: nbeerbower/Mistral-Nemo-Prism-12B
    parameters:
      weight: 0.05
```
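
### Task arithmetic, illustrated

For readers unfamiliar with the merge method, the sketch below is a toy illustration of task arithmetic on a single tensor, not mergekit's implementation: each contributing model is reduced to a delta against the base, the deltas are scaled by the weights from the configuration above and summed, and `normalize: true` is approximated here by dividing the summed delta by the total weight. The random tensors and the normalization detail are illustrative assumptions.

```python
# Toy illustration of task arithmetic on one weight matrix.
# Each fine-tune contributes (model - base) scaled by its merge weight;
# the scaled deltas are summed and added back onto the base.
import torch

torch.manual_seed(0)

base = torch.randn(4, 4)  # stands in for a base-model weight matrix

# (stand-in fine-tuned tensor, merge weight) pairs mirroring the config above
models = {
    "instruct":     (base + 0.1 * torch.randn(4, 4), 0.9),
    "consolidatum": (base + 0.1 * torch.randn(4, 4), 0.1),
    "twilight":     (base + 0.1 * torch.randn(4, 4), 0.001),
    "rocinante":    (base + 0.1 * torch.randn(4, 4), 0.001),
    "prism":        (base + 0.1 * torch.randn(4, 4), 0.05),
}

delta = sum(w * (m - base) for m, w in models.values())
total = sum(w for _, w in models.values())

# Assumption: normalize: true is modeled as rescaling by the summed weights.
merged = base + delta / total
print(merged.shape)
```

Because the two 0.001-weight models contribute almost nothing to the summed delta, their effect on the merged tensor is a gentle nudge rather than a wholesale shift, which matches the stated goal of lightly steering default creative outputs.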
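
### Loading sketch

A minimal loading sketch, assuming the merged weights are published under the hypothetical repo id `grimjim/Magnolia-v3-12B`; the prompt string shows one way to assemble a single turn under the variant format above, and the sampler settings match the tested values (temperature 1.0, minP 0.01).

```python
# Hedged usage sketch; repo id and prompt assembly are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "grimjim/Magnolia-v3-12B"  # hypothetical repo id for this merge

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# One user turn plus an open assistant turn, following the variant format above.
prompt = (
    "[INST]user\nWrite a short scene set in a lighthouse.[/INST]\n"
    "[INST]assistant\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=1.0,
    min_p=0.01,
)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```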