Chickaboo/Chicka-Mixtral-3x7b

Model Description

This model is a Mixture of Experts (MoE) LLM merged from three Mistral-based models:

base model/conversational expert, openchat/openchat-3.5-0106

code expert, beowolx/CodeNinja-1.0-OpenChat-7B

math expert, meta-math/MetaMath-Mistral-7B

This is the Mergekit config used in the merging process:

base_model: openchat/openchat-3.5-0106
experts:
  - source_model: openchat/openchat-3.5-0106
    positive_prompts:
    - "chat"
    - "assistant"
    - "tell me"
    - "explain"
    - "I want"
  - source_model: beowolx/CodeNinja-1.0-OpenChat-7B
    positive_prompts:
    - "code"
    - "python"
    - "javascript"
    - "programming"
    - "algorithm"
    - "C#"
    - "C++"
    - "debug"
    - "runtime"
    - "html"
    - "command"
    - "nodejs"
  - source_model: meta-math/MetaMath-Mistral-7B
    positive_prompts:
    - "reason"
    - "math"
    - "mathematics"
    - "solve"
    - "count"
    - "calculate"
    - "arithmetic"
    - "algebra"
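
The positive_prompts above describe the kind of input each expert should be selected for. As a rough illustration of the general idea behind MoE routing, the toy sketch below scores each expert by a dot product between a gate vector and a token representation, keeps the best two, and normalizes their scores with a softmax. This is not mergekit's code: the 3-dimensional vectors, the route function, and the choice of two experts per token are all assumptions made for illustration, and real gate vectors would be derived from model hidden states rather than written by hand.

```python
import math

# Made-up gate vectors, one per expert (hypothetical values for illustration).
# In a real merge these would come from model hidden states, not from a literal.
gate_vectors = {
    "chat": [1.0, 0.1, 0.1],
    "code": [0.1, 1.0, 0.2],
    "math": [0.1, 0.2, 1.0],
}


def route(token_vector, top_k=2):
    """Score experts by dot product, keep the top_k, and softmax their scores."""
    logits = {
        name: sum(g * t for g, t in zip(gate, token_vector))
        for name, gate in gate_vectors.items()
    }
    # Experts sorted from highest to lowest score, truncated to top_k.
    chosen = sorted(logits, key=logits.get, reverse=True)[:top_k]
    norm = sum(math.exp(logits[name]) for name in chosen)
    return {name: math.exp(logits[name]) / norm for name in chosen}


# A hypothetical token representation that leans toward the "math" direction.
weights = route([0.0, 0.3, 0.9])
print(weights)
```

With this input the "math" expert receives the largest weight and "code" the second largest; the "chat" expert is not used for this token.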

Open LLM Leaderboards

Benchmark	Chicka-Mixtral-3X7B	Mistral-7B-Instruct-v0.2	Meta-Llama-3-8B
Average	69.19	60.97	62.55
ARC	64.08	59.98	59.47
Hellaswag	83.96	83.31	82.09
MMLU	64.87	64.16	66.67
TruthfulQA	50.51	42.15	43.95
Winogrande	81.06	78.37	77.35
GSM8K	70.66	37.83	45.79
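
The Average row is the unweighted mean of the six benchmark scores in each column. A minimal sketch that recomputes it from the numbers in the table (the dictionary layout and the average helper are illustrative choices, not part of any leaderboard tooling):

```python
# Scores copied from the table above, in the order:
# ARC, Hellaswag, MMLU, TruthfulQA, Winogrande, GSM8K.
scores = {
    "Chicka-Mixtral-3X7B": [64.08, 83.96, 64.87, 50.51, 81.06, 70.66],
    "Mistral-7B-Instruct-v0.2": [59.98, 83.31, 64.16, 42.15, 78.37, 37.83],
    "Meta-Llama-3-8B": [59.47, 82.09, 66.67, 43.95, 77.35, 45.79],
}


def average(values):
    """Unweighted mean, rounded to two decimals like the table."""
    return round(sum(values) / len(values), 2)


averages = {name: average(vals) for name, vals in scores.items()}
for name, avg in averages.items():
    print(f"{name}: {avg:.2f}")
```

The recomputed means match the Average row: 69.19, 60.97, and 62.55.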

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" # the device to load the model onto

model = AutoModelForCausalLM.from_pretrained("Chickaboo/Chicka-Mixtral-3x7b")
tokenizer = AutoTokenizer.from_pretrained("Chickaboo/Chicka-Mixtral-3x7b")

messages = [
    {"role": "user", "content": "What is your favourite condiment?"},
    {"role": "assistant", "content": "Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!"},
    {"role": "user", "content": "Do you have mayonnaise recipes?"}
]

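# Format the conversation with the model's chat template. add_generation_prompt=True
# appends the assistant turn marker so the model replies to the last user message.
encodeds = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")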

model_inputs = encodeds.to(device)
model.to(device)

generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
