metadata

license: apache-2.0
tags:
  - moe
  - frankenmoe
  - merge
  - mergekit
  - Himitsui/Kaiju-11B
  - Sao10K/Fimbulvetr-11B-v2
  - decapoda-research/Antares-11b-v2
  - beberik/Nyxene-v3-11B
base_model:
  - Himitsui/Kaiju-11B
  - Sao10K/Fimbulvetr-11B-v2
  - decapoda-research/Antares-11b-v2
  - beberik/Nyxene-v3-11B
model-index:
  - name: Umbra-v3-MoE-4x11b
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: AI2 Reasoning Challenge (25-Shot)
          type: ai2_arc
          config: ARC-Challenge
          split: test
          args:
            num_few_shot: 25
        metrics:
          - type: acc_norm
            value: 68.43
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Steelskull/Umbra-v3-MoE-4x11b
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: HellaSwag (10-Shot)
          type: hellaswag
          split: validation
          args:
            num_few_shot: 10
        metrics:
          - type: acc_norm
            value: 87.83
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Steelskull/Umbra-v3-MoE-4x11b
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MMLU (5-Shot)
          type: cais/mmlu
          config: all
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 65.99
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Steelskull/Umbra-v3-MoE-4x11b
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: TruthfulQA (0-shot)
          type: truthful_qa
          config: multiple_choice
          split: validation
          args:
            num_few_shot: 0
        metrics:
          - type: mc2
            value: 69.3
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Steelskull/Umbra-v3-MoE-4x11b
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: Winogrande (5-shot)
          type: winogrande
          config: winogrande_xl
          split: validation
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 83.9
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Steelskull/Umbra-v3-MoE-4x11b
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: GSM8k (5-shot)
          type: gsm8k
          config: main
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 63.08
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Steelskull/Umbra-v3-MoE-4x11b
          name: Open LLM Leaderboard

Creator: SteelSkull

About Umbra-v3-MoE-4x11b: A Mixture of Experts model designed for general assistance, with a special knack for storytelling and RP/ERP.

It merges four 11B expert models for stronger performance across diverse tasks.

Source Models:

Himitsui/Kaiju-11B

Sao10K/Fimbulvetr-11B-v2

decapoda-research/Antares-11b-v2

beberik/Nyxene-v3-11B

Update-Log:

The Umbra Series keeps rolling out from the Lumosia Series garage, aiming to be your digital Alfred with a side of Shakespeare for those RP/ERP nights.

What's Fresh in v3?

Didn’t reinvent the wheel, just slapped on some fancier rims: upgraded the models and tweaked the prompts a bit. Now Umbra's not just a general-use LLM; it's also focused on spinning stories and "Stories".

Negative Prompt Minimalism

Put the prompts on a diet and gym routine: more beef on the positives, fewer negatives as usual, plus a dash of my midnight musings.

Still Guessing, Aren’t We?

Just so we're clear, "v3" is not the messiah of updates. It’s another experiment in the saga.

Dive into Umbra v3 and toss your two cents my way. Your feedback is the caffeine in my code marathon.

Quants available from:

EXL2-Rpcal = AzureBlack

GGUF = mradermacher

Open LLM Leaderboard Evaluation Results

Detailed results can be found on the Open LLM Leaderboard.

Metric	Value
Avg.	73.09
AI2 Reasoning Challenge (25-Shot)	68.43
HellaSwag (10-Shot)	87.83
MMLU (5-Shot)	65.99
TruthfulQA (0-shot)	69.30
Winogrande (5-shot)	83.90
GSM8k (5-shot)	63.08
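
As a quick sanity check on the table, here is a minimal Python sketch that recomputes the "Avg." row from the six benchmark scores. It assumes the average is a plain unweighted mean rounded to two decimals; the `scores` dict and the `leaderboard_average` helper are illustrative names, not part of any leaderboard tooling.

```python
from statistics import mean

# Scores copied from the results table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 68.43,
    "HellaSwag (10-Shot)": 87.83,
    "MMLU (5-Shot)": 65.99,
    "TruthfulQA (0-shot)": 69.30,
    "Winogrande (5-shot)": 83.90,
    "GSM8k (5-shot)": 63.08,
}


def leaderboard_average(results):
    """Unweighted mean of the benchmark scores, rounded to two decimals.

    Assumption: every benchmark counts equally toward the average.
    """
    return round(mean(results.values()), 2)


print(leaderboard_average(scores))  # 73.09
```

The result matches the reported Avg. of 73.09, so the table is internally consistent.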