
# ExpertRamonda-7Bx2_MoE

ExpertRamonda-7Bx2_MoE is a Mixture of Experts (MoE) made with the following models using LazyMergekit:

* [mlabonne/AlphaMonarch-7B](https://huggingface.co/mlabonne/AlphaMonarch-7B)
* [bardsai/jaskier-7b-dpo-v5.6](https://huggingface.co/bardsai/jaskier-7b-dpo-v5.6)

πŸ† Benchmarks

### Open LLM Leaderboard

| Model | Average | ARC_easy | HellaSwag | MMLU | TruthfulQA_mc2 | Winogrande | GSM8K |
|---|---|---|---|---|---|---|---|
| mayacinka/ExpertRamonda-7Bx2_MoE | 78.10 | 86.87 | 87.51 | 61.63 | 78.02 | 81.85 | 72.71 |

### MMLU

| Groups | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| mmlu | N/A | none | 0 | acc | 0.6163 | ± 0.0039 |
| - humanities | N/A | none | None | acc | 0.5719 | ± 0.0067 |
| - other | N/A | none | None | acc | 0.6936 | ± 0.0079 |
| - social_sciences | N/A | none | None | acc | 0.7121 | ± 0.0080 |
| - stem | N/A | none | None | acc | 0.5128 | ± 0.0085 |
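
The MMLU breakdown above follows the output format of EleutherAI's lm-evaluation-harness. As a rough sketch of how such a 0-shot MMLU run could be reproduced with the harness's Python API (assuming lm-eval v0.4+; the batch size and dtype are illustrative choices, not settings taken from this card):

```python
# Sketch: 0-shot MMLU with lm-evaluation-harness (pip install lm-eval).
# The model ID comes from this card; everything else is an illustrative assumption.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=mayacinka/ExpertRamonda-7Bx2_MoE,dtype=bfloat16",
    tasks=["mmlu"],
    num_fewshot=0,
    batch_size=8,
)
print(results["results"])  # per-group accuracies and stderr, as in the table above
```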

## 🧩 Configuration

```yaml
base_model: mlabonne/AlphaMonarch-7B
gate_mode: hidden
dtype: bfloat16
experts_per_token: 2
experts:
  - source_model: mlabonne/AlphaMonarch-7B
    positive_prompts:
      - "You excel at reasoning skills. For every prompt you think of an answer from 3 different angles"
    ## (optional)
    # negative_prompts:
    #   - "This is a prompt expert_model_1 should not be used for"
  - source_model: bardsai/jaskier-7b-dpo-v5.6
    positive_prompts:
      - "You excel at logic and reasoning skills. Reply in a straightforward and concise way"
```

## 💻 Usage

```python
!pip install -qU transformers bitsandbytes accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "mayacinka/ExpertRamonda-7Bx2_MoE"

# Load the tokenizer and build a 4-bit quantized text-generation pipeline.
tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    model_kwargs={"torch_dtype": torch.float16, "load_in_4bit": True},
)

# Format the conversation with the model's chat template, then generate.
messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
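
Newer transformers releases deprecate passing `load_in_4bit` through `model_kwargs` in favor of an explicit quantization config. A minimal alternative sketch, assuming recent transformers and bitsandbytes versions (the generation settings are illustrative):

```python
# Alternative: explicit BitsAndBytesConfig instead of load_in_4bit in model_kwargs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "mayacinka/ExpertRamonda-7Bx2_MoE"
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```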