
🫐 Moecule 3x3B M10 FKS

logo

Model Details

This model is a mixture of experts (MoE) built with the RhuiDih/moetify library by combining several task-specific experts. All relevant expert models, LoRA adapters, and datasets are available at Moecule Ingredients.

Key Features

  • Zero Additional Training: combine existing domain-specific / task-specific experts into a powerful MoE model without any further training!

System Requirements

Step                 System Requirements
MoE Creation         > 54.2 GB system RAM
Inference (fp16)     GPU with > 15.5 GB VRAM
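
As a quick pre-flight check before fp16 inference, you can verify that the GPU meets the VRAM requirement above (a minimal sketch, assuming PyTorch is installed and a single CUDA device):

import torch

if torch.cuda.is_available():
    total_vram_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
    # the 15.5 GB threshold comes from the table above
    print(f"GPU VRAM: {total_vram_gb:.1f} GB (need > 15.5 GB for fp16 inference)")
else:
    print("No CUDA device detected")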

MoE Creation

To reproduce this model, run the following commands:

# git clone the moetify fork that fixes a dependency issue
!git clone -b fix-transformers-4.47.1-FlashA2-dependency --single-branch https://github.com/davzoku/moetify.git

!cd moetify && pip install -e .

!python -m moetify.mix \
    --output_dir ./moecule-3x3b-m10-fks \
    --model_path unsloth/llama-3.2-3b-Instruct \
    --modules mlp q_proj \
    --ingredients \
        davzoku/finqa_expert_3b \
        davzoku/kyc_expert_3b \
        davzoku/stock_market_expert_3b
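
After the mix completes, the merged model can be loaded from the output directory to confirm it is intact (a minimal sketch, assuming the machine has enough RAM to hold the merged weights; the printed count should match the totals logged below):

from transformers import AutoModelForCausalLM

moe = AutoModelForCausalLM.from_pretrained("./moecule-3x3b-m10-fks", trust_remote_code=True)
print(sum(p.numel() for p in moe.parameters()))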

Model Parameters

INFO:root:Stem parameters: 1228581888
INFO:root:Experts parameters: 7134511104
INFO:root:Routers parameters: 516096
INFO:root:MOE total parameters (numel): 8363609088
INFO:root:MOE total parameters : 8363609088
INFO:root:MOE active parameters: 5985438720
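
These figures are self-consistent with three experts of roughly 2.38B parameters each and two experts activated per token (a quick sanity check, assuming top-2 routing):

stem, routers = 1228581888, 516096
expert = 7134511104 // 3            # 2378170368 parameters per expert
print(stem + routers + 3 * expert)  # 8363609088 total parameters
print(stem + routers + 2 * expert)  # 5985438720 active parameters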

Inference

To run inference with this model, use the following code snippet:

# git clone the moetify fork that fixes a dependency issue
!git clone -b fix-transformers-4.47.1-FlashA2-dependency --single-branch https://github.com/davzoku/moetify.git

!cd moetify && pip install -e .

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

model_name = "davzoku/moecule-3x3b-m10-fks"
model = AutoModelForCausalLM.from_pretrained(model_name, device_map='auto', trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_name)

def format_instruction(row):
    return f"""### Question: {row}"""

# with do_sample left at its default (False), decoding is greedy and
# temperature / top_p / top_k are effectively ignored
greedy_generation_config = GenerationConfig(
    temperature=0.1,
    top_p=0.75,
    top_k=40,
    num_beams=1,
    max_new_tokens=128,
    repetition_penalty=1.2
)


input_text = "In what ways did Siemens's debt restructuring on March 06, 2024 reflect its strategic priorities?"
formatted_input = format_instruction(input_text)
inputs = tokenizer(formatted_input, return_tensors="pt").to('cuda')

with torch.no_grad():
    outputs = model.generate(
        input_ids=inputs.input_ids,
        attention_mask=inputs.attention_mask,
        generation_config=greedy_generation_config
    )

generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)
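
Note that generated_text contains the prompt as well. To print only the model's answer, you can slice off the input tokens before decoding (an optional tweak):

completion = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(completion)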

The Team

  • CHOCK Wan Kee
  • Farlin Deva Binusha DEVASUGIN MERLISUGITHA
  • GOH Bao Sheng
  • Jessica LEK Si Jia
  • Sinha KHUSHI
  • TENG Kok Wai (Walter)
