metadata
base_model:
  - LatitudeGames/Wayfarer-12B
  - ArliAI/Mistral-Nemo-12B-ArliAI-RPMax-v1.2
  - PocketDoc/Dans-PersonalityEngine-V1.1.0-12b
  - HumanLLMs/Human-Like-Mistral-Nemo-Instruct-2407
  - TheDrummer/UnslopNemo-12B-v4
  - romaingrx/red-teamer-mistral-nemo
  - DavidAU/MN-GRAND-Gutenberg-Lyra4-Lyra-12B-DARKNESS
  - rAIfle/Questionable-MN-bf16
  - allura-org/MN-12b-RP-Ink
  - IntervitensInc/Mistral-Nemo-Base-2407-chatml
library_name: transformers
tags:
  - mergekit
  - merge
  - 12b
  - chat
  - roleplay
  - creative-writing
  - DELLA-linear

GodSlayer-12B-ABYSS

This is a merge of pre-trained language models created using mergekit.

The goal of this model is to remain fairly stable and coherent while counteracting positivity bias and improving realism and response diversity.

Model #12

Merge Details

Merge Method

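This model was merged using the NuSLERP merge method, with IntervitensInc/Mistral-Nemo-Base-2407-chatml as the base. The two intermediate merges it combines, P1 and P2, were each produced with the della_linear method on top of TheDrummer/UnslopNemo-12B-v4.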
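To illustrate what a SLERP-style merge over a shared base does, here is a minimal sketch in plain Python. It assumes the interpolation is applied to task vectors (each model's difference from the base) and that the interpolation factor is the second model's share of the total weight, so equal weights of 0.5 give t = 0.5. The function names `slerp` and `merge_on_base` are hypothetical, and this is the textbook formula on flat lists, not mergekit's actual implementation.

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Textbook spherical linear interpolation between two flat vectors."""
    n0 = math.sqrt(sum(x * x for x in v0))
    n1 = math.sqrt(sum(x * x for x in v1))
    if n0 < eps or n1 < eps:
        # Degenerate case: fall back to plain linear interpolation.
        return [(1 - t) * x + t * y for x, y in zip(v0, v1)]
    cos_theta = sum(x * y for x, y in zip(v0, v1)) / (n0 * n1)
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)
    if abs(sin_theta) < eps:
        # Nearly colinear vectors: linear interpolation is numerically safer.
        return [(1 - t) * x + t * y for x, y in zip(v0, v1)]
    a = math.sin((1 - t) * theta) / sin_theta
    b = math.sin(t * theta) / sin_theta
    return [a * x + b * y for x, y in zip(v0, v1)]

def merge_on_base(base, model_a, model_b, weight_a=0.5, weight_b=0.5):
    """Interpolate the two task vectors, then add the result back to the base."""
    t = weight_b / (weight_a + weight_b)  # assumed mapping from weights to t
    delta_a = [m - b for m, b in zip(model_a, base)]
    delta_b = [m - b for m, b in zip(model_b, base)]
    merged_delta = slerp(t, delta_a, delta_b)
    return [b + d for b, d in zip(base, merged_delta)]

# Toy 2-element "tensors" standing in for real weight matrices.
base = [1.0, 1.0]
p1 = [2.0, 1.0]
p2 = [1.0, 2.0]
merged = merge_on_base(base, p1, p2)
print(merged)
```

Unlike a plain average, the interpolated task vector keeps a magnitude comparable to its inputs when the two deltas point in different directions, which is the usual motivation for choosing SLERP over linear averaging.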

Models Merged

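The final merge combines two intermediate merges, P1 and P2. Both use TheDrummer/UnslopNemo-12B-v4 as their base and include the following models:

- P1: PocketDoc/Dans-PersonalityEngine-V1.1.0-12b, ArliAI/Mistral-Nemo-12B-ArliAI-RPMax-v1.2, HumanLLMs/Human-Like-Mistral-Nemo-Instruct-2407, LatitudeGames/Wayfarer-12B
- P2: rAIfle/Questionable-MN-bf16, DavidAU/MN-GRAND-Gutenberg-Lyra4-Lyra-12B-DARKNESS, allura-org/MN-12b-RP-Ink, romaingrx/red-teamer-mistral-nemo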

Configuration

The following YAML configurations were used to produce this model:

# P1:
models:
  - model: PocketDoc/Dans-PersonalityEngine-V1.1.0-12b
    parameters:
      weight:
        - filter: self_attn
          value: 0.2
        - filter: mlp
          value: 0.2
        - value: 0.2
      density: 0.6
  - model: ArliAI/Mistral-Nemo-12B-ArliAI-RPMax-v1.2
    parameters:
      weight:
        - filter: self_attn
          value: 0.15
        - filter: mlp
          value: 0.15
        - value: 0.15
      density: 0.55
  - model: HumanLLMs/Human-Like-Mistral-Nemo-Instruct-2407
    parameters:
      weight:
        - filter: self_attn
          value: 0.1
        - filter: mlp
          value: 0.1
        - value: 0.1
      density: 0.5
  - model: LatitudeGames/Wayfarer-12B
    parameters:
      weight:
        - filter: self_attn
          value: 0.25
        - filter: mlp
          value: 0.25
        - value: 0.25
      density: 0.65
base_model: TheDrummer/UnslopNemo-12B-v4
merge_method: della_linear
dtype: bfloat16
chat_template: "chatml"
tokenizer_source: union
parameters:
  normalize: true
  int8_mask: true
  epsilon: 0.1
  lambda: 1
# P2:
models:
  - model: rAIfle/Questionable-MN-bf16
    parameters:
      weight:
        - filter: self_attn
          value: 0.2
        - filter: mlp
          value: 0.2
        - value: 0.2
      density: 0.6
  - model: DavidAU/MN-GRAND-Gutenberg-Lyra4-Lyra-12B-DARKNESS
    parameters:
      weight:
        - filter: self_attn
          value: 0.3
        - filter: mlp
          value: 0.3
        - value: 0.3
      density: 0.7
  - model: allura-org/MN-12b-RP-Ink
    parameters:
      weight:
        - filter: self_attn
          value: 0.35
        - filter: mlp
          value: 0.35
        - value: 0.35
      density: 0.75
  - model: romaingrx/red-teamer-mistral-nemo
    parameters:
      weight:
        - filter: self_attn
          value: 0.25
        - filter: mlp
          value: 0.25
        - value: 0.25
      density: 0.65
base_model: TheDrummer/UnslopNemo-12B-v4
merge_method: della_linear
dtype: bfloat16
chat_template: "chatml"
tokenizer_source: union
parameters:
  normalize: true
  int8_mask: true
  epsilon: 0.1
  lambda: 1
# Final:
models:
  - model: P1
    parameters:
      weight: 0.5
  - model: P2
    parameters:
      weight: 0.5
base_model: IntervitensInc/Mistral-Nemo-Base-2407-chatml
merge_method: nuslerp
dtype: bfloat16
chat_template: "chatml"
tokenizer:
  source: union
parameters:
  normalize: true
  int8_mask: true
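
The final configuration refers to `P1` and `P2` by name, so the two della_linear merges have to exist before the NuSLERP step runs. The sketch below shows one plausible way to order the three stages with the `mergekit-yaml` command. The config file names (`p1.yml`, `p2.yml`, `final.yml`) and the final output directory name are assumptions for illustration, as is the idea that `P1` and `P2` resolve as local output directories. The exact commands the author used are not stated here.

```python
import shlex

# Hypothetical file names: each YAML section above saved to its own file.
# The output directories for the first two stages are named P1 and P2 so
# that `model: P1` / `model: P2` in the final config can resolve locally.
STAGES = [
    ("p1.yml", "P1"),                      # della_linear merge
    ("p2.yml", "P2"),                      # della_linear merge
    ("final.yml", "GodSlayer-12B-ABYSS"),  # nuslerp merge of P1 and P2
]

def build_commands(stages):
    """Return one mergekit-yaml invocation per stage, in dependency order."""
    return [["mergekit-yaml", config, out_dir] for config, out_dir in stages]

commands = build_commands(STAGES)
for cmd in commands:
    # Printed rather than executed; to run a stage for real, pass `cmd`
    # to subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Running the stages in this order matters only because of the dependency: P1 and P2 are independent of each other, but the final stage needs both.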