An MoE model built on top of Qwen1.5-7B-Chat, Qwen1.5-7B, and Crystalcareai/CrystalQwen-1.5-7B. QLoRA was then applied to the q, v, and gate linear projections in every layer, fine-tuning on WizardLM_evol_instruct_70k via MLX. The model was created using a script from https://github.com/mzbac/mlx-moe
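
The exact training code lives in the mlx-moe repository linked above. As a rough illustration of the adapter setup (QLoRA on the q, v, and gate linear projections), a Hugging Face PEFT analogue might look like the sketch below; the module names (`q_proj`, `v_proj`, `gate_proj`) and all hyperparameters are assumptions for illustration, not values taken from the actual MLX training run.

```python
# Illustrative sketch only: a PEFT analogue of the QLoRA setup described above.
# The real training used mlx-moe (MLX), not PEFT; rank and module names are assumed.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "mzbac/qwen-1.5-2x3-hf",
    load_in_4bit=True,  # QLoRA trains adapters on top of a 4-bit base model
    trust_remote_code=True,
)

lora_config = LoraConfig(
    r=16,                # assumed rank, not from the source
    lora_alpha=32,       # assumed scaling, not from the source
    target_modules=["q_proj", "v_proj", "gate_proj"],  # q, v, and gate linears
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the LoRA adapters are trainable
```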

## Evaluation

### Qwen-1_5-2x3-hf

#### MMLU

| Groups | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| - humanities | N/A | none | 0 | acc | 0.6488 | ± 0.0237 |
| - other | N/A | none | 0 | acc | 0.6294 | ± 0.0302 |
| - social_sciences | N/A | none | 0 | acc | 0.6905 | ± 0.0281 |
| - stem | N/A | none | 0 | acc | 0.5227 | ± 0.0375 |

#### CMMLU

| Groups | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| cmmlu | N/A | none | 0 | acc | 0.6966 | ± 0.0333 |
| | | none | 0 | acc_norm | 0.6966 | ± 0.0333 |

#### GSM8K

| Tasks | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| gsm8k | 2 | get-answer | 5 | exact_match | 0.4102 | ± 0.0135 |

### Qwen1.5-7B-Chat

#### MMLU

| Groups | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| - humanities | N/A | none | 0 | acc | 0.6533 | ± 0.0239 |
| - other | N/A | none | 0 | acc | 0.6321 | ± 0.0301 |
| - social_sciences | N/A | none | 0 | acc | 0.6934 | ± 0.0282 |
| - stem | N/A | none | 0 | acc | 0.5329 | ± 0.0376 |

#### CMMLU

| Groups | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| cmmlu | N/A | none | 0 | acc | 0.6879 | ± 0.0338 |
| | | none | 0 | acc_norm | 0.6879 | ± 0.0338 |

#### GSM8K

| Tasks | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| gsm8k | 2 | get-answer | 5 | exact_match | 0.0425 | ± 0.0056 |
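
Compared with Qwen1.5-7B-Chat, the MoE model is roughly on par on MMLU and CMMLU but scores far higher on 5-shot GSM8K (0.4102 vs. 0.0425). The tables follow the output format of EleutherAI's lm-evaluation-harness; assuming that harness produced them (the card does not say explicitly), similar numbers could be reproduced along these lines. The task names and arguments below are assumptions, not taken from the source:

```python
# Sketch: re-running the benchmarks with lm-evaluation-harness (`pip install lm-eval`).
import lm_eval

# 0-shot MMLU / CMMLU groups, as in the tables above.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=mzbac/qwen-1.5-2x3-hf,trust_remote_code=True",
    tasks=["mmlu", "cmmlu"],
)

# 5-shot GSM8K, matching the gsm8k rows above.
gsm8k = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=mzbac/qwen-1.5-2x3-hf,trust_remote_code=True",
    tasks=["gsm8k"],
    num_fewshot=5,
)
print(results["results"])
print(gsm8k["results"])
```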
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "mzbac/qwen-1.5-2x3-hf"

# Load the model in 4-bit precision (requires the `bitsandbytes` package and a CUDA GPU).
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    load_in_4bit=True,
    trust_remote_code=True,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)

chat = [
    {"role": "user", "content": "How does backpropagation work?"},
]

# Render the conversation with the model's chat template and append the
# assistant header so generation continues as the assistant's reply.
text = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

inputs = tokenizer.encode(text, return_tensors="pt").to("cuda")

generate_kwargs = dict(
    input_ids=inputs,
    temperature=0.6,
    max_new_tokens=500,
    do_sample=True,
)

outputs = model.generate(**generate_kwargs)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
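
Note that passing `load_in_4bit=True` directly to `from_pretrained` is deprecated on recent transformers releases; the preferred spelling is an explicit `BitsAndBytesConfig` passed as `quantization_config`:

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
import torch

# Equivalent 4-bit load via the explicit quantization config.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "mzbac/qwen-1.5-2x3-hf",
    quantization_config=bnb_config,
    trust_remote_code=True,
)
```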