Llama-Salad-4x8B-V3

Changes in V3:

  • Uses L3-8B-Stheno-v3.2 as the base model instead of Meta-Llama-3-8B-Instruct
  • Removed opus-v1.2-llama-3-8b-instruct-run3.5-epoch2.5 and added Einstein-v6.1-Llama3-8B
  • Swapped Llama-3-Soliloquy-8B-v2 for L3-8B-Stheno-v3.2

I was clearly wrong when I said V2 would be difficult to improve on, because V3 is significantly better in just about every aspect. Stheno-v3.2 fixed all of the issues present in Stheno-v3.1, making it my favorite roleplay model and the best base model for llama-3 MoE merges.

The one thing I do want to improve on is finding a better conversational model than Meta-Llama-3-8B-Instruct; it's good for that use case, but I'm sure there's a better one out there. I tried using llama-3-cat-8b-instruct-v1, but it absolutely tanked the model's situational awareness and kept making blatantly contradictory statements.

Quantization Formats

GGUF
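
For local inference with one of the GGUF quants, the llama-cpp-python sketch below should be close to what's needed; the file name, context size, and sampling settings are placeholders, not values taken from this release.

```python
# Minimal sketch: chat with a GGUF quant of Llama-Salad-4x8B-V3 via llama-cpp-python.
# "Llama-Salad-4x8B-V3-Q4_K_M.gguf" is a hypothetical local file name;
# point it at whichever quant you actually downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="Llama-Salad-4x8B-V3-Q4_K_M.gguf",  # placeholder path
    n_ctx=8192,        # Llama-3 context window
    n_gpu_layers=-1,   # offload all layers to the GPU if there is room
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the plot of Hamlet in three sentences."},
    ],
    max_tokens=256,
    temperature=0.7,
)
print(response["choices"][0]["message"]["content"])
```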

Details

Models Used

  • NousResearch/Meta-Llama-3-8B-Instruct
  • Weyaxi/Einstein-v6.1-Llama3-8B
  • migtissera/Llama-3-8B-Synthia-v3.5
  • Sao10K/L3-8B-Stheno-v3.2

Merge Config

base_model: Sao10K/L3-8B-Stheno-v3.2
gate_mode: hidden
dtype: bfloat16
experts_per_token: 2
experts:
  - source_model: NousResearch/Meta-Llama-3-8B-Instruct
    positive_prompts:
    - "chat"
    - "conversation"
  - source_model: Weyaxi/Einstein-v6.1-Llama3-8B
    positive_prompts:
    - "science"
    - "physics"
    - "chemistry"
    - "biology"
    - "math"
    - "step-by-step"
    - "logical reasoning"
    - "multilingual"
    - "translation"
    - "language translation"
    - "foreign language"
    negative_prompts:
    - "programming language"
  - source_model: migtissera/Llama-3-8B-Synthia-v3.5
    positive_prompts:
    - "summarize"
    - "paraphrase"
    - "list"
    - "explain"
    - "define"
    - "analyze"
    - "rephrase"
    - "elaborate"
    - "programming language"
    - "JavaScript"
    - "Python programming language"
    - "Rust programming language"
    - "C++ programming language"
    - "GO programming language"
    - "Ruby programming language"
    - "Haskell programming language"
    - "SQL query language"
    - "CSS markup styling language"
    - "code"
  - source_model: Sao10K/L3-8B-Stheno-v3.2
    positive_prompts:
    - "characters"
    - "scene"
    - "roleplay"
    - "erotic roleplay"
    - "sexual fetish"
    - "NSFW"
    - "creative writing"
    - "storytelling"
    - "narration"
    - "narrative setting"
    - "narrative plot"
    - "narrative exposition"
    - "narrative theme"
    - "narrative climax"

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric              | Value |
|---------------------|------:|
| Avg.                | 24.75 |
| IFEval (0-shot)     | 66.54 |
| BBH (3-shot)        | 31.93 |
| MATH Lvl 5 (4-shot) |  8.53 |
| GPQA (0-shot)       |  7.05 |
| MuSR (0-shot)       |  6.45 |
| MMLU-PRO (5-shot)   | 27.98 |