Medtulu-2x7b

Medtulu-2x7b is a Mixture of Experts (MoE) built from two models: Technoculture/MT7Bi-dpo and allenai/tulu-2-dpo-7b.

🧩 Configuration

base_model: Technoculture/MT7Bi-dpo
tokenizer_source: union
gate_mode: hidden
dtype: bfloat16
experts:
  - source_model: Technoculture/MT7Bi-dpo
    positive_prompts:
      - "Are elevated serum levels of interleukin 21 associated with disease severity in patients with psoriasis?"
      - "Which one of the following does NOT present antigens?"
      - "A 25-year-old male patient presents to your clinic in significant distress. He states he has excruciating, stabbing pain around the left side of his head, and his left eye will not stop tearing. These types of headaches have been occurring for the past week every morning when he awakens and last around 60 minutes. He denies any aura, nausea, or vomiting. He denies any other past medical history. What is this patient's diagnosis?"
      - "When using an inhaler, when should a patient be asked to rinse their mouth?"
      - "What is the embryological origin of the hyoid bone?"
      - "After what period of time does maximal dynamic exercise become predominantly aerobic?"
  - source_model: allenai/tulu-2-dpo-7b
    positive_prompts:
      - "Who composed the tune of 'Twinkle, Twinkle, Little Star'?"
      - "Gem went to get new supplies for her hamster and she found snacks and exercise balls She chose the _ because her hamster was fat."
      - "John orders food for a massive restaurant. He orders 1000 pounds of beef for $8 per pound. He also orders twice that much chicken at $3 per pound. How much did everything cost?"
      - "The gravitational force of the Sun affects the planets in our solar system. Which of these is influenced the most by this force?"
      - "2sin(x) + yz ="
      - "Hobbies and Crafts"
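
The configuration above assigns each expert a set of positive prompts, and `gate_mode: hidden` suggests the router weights are derived from hidden-state representations of those prompts. The toy sketch below illustrates how such a router could weight two experts for a single token. It assumes Mixtral-style top-k routing with a softmax over the selected logits; the 4-dimensional vectors, the `route` helper, and the `experts_per_token` default are hypothetical and are not taken from the actual model.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def route(hidden_state, gate_vectors, experts_per_token=2):
    """Return (expert_index, weight) pairs for one token, best expert first.

    Each gate logit is the dot product of the token's hidden state with
    that expert's gate vector (hypothetical helper, for illustration only).
    """
    logits = [sum(h * g for h, g in zip(hidden_state, gate)) for gate in gate_vectors]
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    top = top[:experts_per_token]
    weights = softmax([logits[i] for i in top])
    return list(zip(top, weights))


# Made-up gate vectors standing in for the prompt-derived ones.
gate_vectors = [
    [1.0, 0.5, 0.0, 0.0],  # expert 0: Technoculture/MT7Bi-dpo (medical prompts)
    [0.0, 0.0, 1.0, 0.5],  # expert 1: allenai/tulu-2-dpo-7b (general prompts)
]

# A made-up hidden state that resembles the medical prompts.
medical_token = [0.9, 0.4, 0.1, 0.0]
routing = route(medical_token, gate_vectors)
print(routing)
```

With only two experts and two experts selected per token, both experts contribute to every token; under these assumptions the router only changes how much weight each one receives.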

Evaluations

Benchmark Medtulu-2x7b Orca-2-7b llama-2-7b meditron-7b meditron-70b
MedMCQA
ClosedPubMedQA
PubMedQA
MedQA
MedQA4
MedicationQA
MMLU Medical
MMLU
TruthfulQA
GSM8K
ARC
HellaSwag
Winogrande

More details on the Open LLM Leaderboard evaluation results can be found here.

💻 Usage

# Notebook cell: install the dependencies (drop the leading "!" in a regular shell).
!pip install -qU transformers bitsandbytes accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "Technoculture/Medtulu-2x7b"

# Load the tokenizer once and hand it to the pipeline so both use the same chat template.
tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    # 4-bit loading relies on bitsandbytes and a CUDA-capable GPU.
    model_kwargs={"torch_dtype": torch.float16, "load_in_4bit": True},
)

# Build the prompt with the model's chat template, then sample a completion.
messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
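
By default, the text-generation pipeline returns the prompt followed by the completion in `generated_text`. The sketch below shows one way to keep only the model's answer. `strip_prompt` is a hypothetical helper, not part of transformers, and the example strings are made up for illustration.

```python
def strip_prompt(generated_text: str, prompt: str) -> str:
    """Return only the completion when the output starts with the prompt."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].lstrip()
    # Fall back to the full text if the prompt is not a prefix.
    return generated_text.strip()


# Made-up strings standing in for the real prompt and pipeline output.
example_prompt = "<|user|>\nWhat is a Mixture of Experts?\n<|assistant|>\n"
example_output = example_prompt + "A model that routes each token to a subset of expert networks."
answer = strip_prompt(example_output, example_prompt)
print(answer)
```

An alternative is to pass `return_full_text=False` when calling the pipeline, which asks it to omit the prompt from `generated_text`.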

Model size: 11.1B params (Safetensors), tensor type BF16
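
A "2x7b" model reporting about 11.1B parameters rather than 14B is consistent with an MoE in which only the feed-forward (MLP) blocks are duplicated per expert while attention and embeddings are shared. The arithmetic below is a rough sketch under assumptions not stated in this card: Llama-2-7B dimensions (vocabulary 32,000, hidden size 4,096, intermediate size 11,008, 32 layers) and one small router matrix per layer. The `union` tokenizer setting may change the vocabulary size slightly, so the real count can differ.

```python
# Assumed Llama-2-7B dimensions (not stated in this model card).
VOCAB = 32_000
HIDDEN = 4_096
INTERMEDIATE = 11_008
LAYERS = 32
NUM_EXPERTS = 2

embeddings = 2 * VOCAB * HIDDEN   # input embeddings + untied output head
attention = 4 * HIDDEN * HIDDEN   # q, k, v, o projections per layer
mlp = 3 * HIDDEN * INTERMEDIATE   # gate, up, down projections per expert
norms = 2 * HIDDEN                # two RMSNorm weight vectors per layer
router = HIDDEN * NUM_EXPERTS     # one gate vector per expert per layer

# The trailing HIDDEN term is the final RMSNorm before the output head.
dense_total = embeddings + LAYERS * (attention + mlp + norms) + HIDDEN
moe_total = embeddings + LAYERS * (attention + NUM_EXPERTS * mlp + norms + router) + HIDDEN

print(f"dense 7B model: {dense_total / 1e9:.2f}B")
print(f"2-expert MoE:   {moe_total / 1e9:.2f}B")
```

Under these assumptions the estimate comes to roughly 11.07B parameters, which rounds to the 11.1B shown for this model.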