MedMCQ — Genetics Topic Classifier (Qwen3-0.6B)
A small fine-tuned Qwen3 model that takes a Genetics medical multiple-choice question (MCQ) and predicts which topic within Genetics the question is about.
This is a per-subject topic classifier — the second hop in the MedMCQ three-hop pipeline. It assumes the input MCQ has already been routed to Genetics by the subject classifier.
The MedMCQ pipeline
The MedMCQ project explores small, specialized models for medical MCQs. Instead of using one large model for everything, it splits the task across three hops:
- Subject routing: the subject classifier picks the medical subject.
- Topic classification: this model. Given a Genetics MCQ, it picks the topic within Genetics.
- Answer generation: the Genetics generator produces answers to Genetics MCQs or writes new ones.
Each hop is a separate, narrow model. They are all published under the MedMCQ Medical Models collection.
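The three hops can be read as plain function composition. The sketch below is illustrative only and is not code shipped with the project: `run_pipeline`, `PipelineResult`, and the three callables are hypothetical names, the stubs return fixed placeholder values instead of calling any model, and passing the predicted topic on to the generator is an assumption about how hop 3 is driven.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class PipelineResult:
    subject: str
    topic: str
    answer: str


def run_pipeline(
    mcq: str,
    route_subject: Callable[[str], str],
    topic_classifiers: Dict[str, Callable[[str], str]],
    generators: Dict[str, Callable[[str, str], str]],
) -> PipelineResult:
    """Run one MCQ through the three hops in order (hypothetical wiring)."""
    # Hop 1: the subject classifier picks the medical subject.
    subject = route_subject(mcq)
    if subject not in topic_classifiers or subject not in generators:
        raise KeyError(f"no models registered for subject {subject!r}")
    # Hop 2: the per-subject topic classifier picks the topic.
    topic = topic_classifiers[subject](mcq)
    # Hop 3: the per-subject generator produces the output.
    # Assumption: the generator receives the topic from hop 2.
    answer = generators[subject](mcq, topic)
    return PipelineResult(subject, topic, answer)


# Placeholder stubs so the sketch runs without downloading any model.
result = run_pipeline(
    "Question: <a Genetics MCQ here>",
    route_subject=lambda mcq: "Genetics",
    topic_classifiers={"Genetics": lambda mcq: "placeholder-topic"},
    generators={"Genetics": lambda mcq, topic: f"answer for {topic}"},
)
print(result)
```

Keying the classifiers and generators by subject reflects the one-narrow-model-per-hop design: adding a subject means registering two more callables, not retraining a shared model.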
Quick start
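from transformers import AutoModelForCausalLM, AutoTokenizer
# Load the fine-tuned classifier and its tokenizer from the Hub.
repo = "stravoris/medmcq-genetics-classifier-qwen3-0.6b"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo)
# Fill in the placeholders; keep the layout exactly as shown.
prompt = """Classify the following Genetics MCQ by topic.
Question: <a Genetics MCQ here>
A) <option A>
B) <option B>
C) <option C>
D) <option D>
Topic:"""
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Greedy decoding, capped at 16 new tokens because the output is a short label.
outputs = model.generate(**inputs, max_new_tokens=16, do_sample=False)
# Decode only the newly generated tokens, not the echoed prompt.
generated = outputs[0][inputs.input_ids.shape[1]:]
print(tokenizer.decode(generated, skip_special_tokens=True).strip())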
Prompt format
The model expects MCQs in this exact form:
Classify the following Genetics MCQ by topic.
Question: <question text>
A) <option A>
B) <option B>
C) <option C>
D) <option D>
Topic:
The model completes the prompt with one of the topic labels seen during training. The full topic list is in the dataset card.
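Because the format must match exactly, it can help to build the prompt programmatically and to check the completion against the known labels. The following is a minimal sketch under stated assumptions: `build_prompt` and `match_topic` are hypothetical helpers, not part of the model repository; the prompt lines are assumed to be separated by single newlines as shown above; and the topic list must be supplied by the caller from the dataset card (the labels in the usage lines are placeholders, not real topics).

```python
from typing import Optional, Sequence

PROMPT_HEADER = "Classify the following Genetics MCQ by topic."


def build_prompt(question: str, options: Sequence[str]) -> str:
    """Format a question and exactly four options into the expected prompt."""
    if len(options) != 4:
        raise ValueError("expected exactly four options (A-D)")
    lines = [PROMPT_HEADER, f"Question: {question.strip()}"]
    for letter, option in zip("ABCD", options):
        lines.append(f"{letter}) {option.strip()}")
    # The prompt ends with "Topic:" so the model completes it with a label.
    lines.append("Topic:")
    return "\n".join(lines)


def match_topic(generated: str, known_topics: Sequence[str]) -> Optional[str]:
    """Map raw generated text to a known topic label, or None if no match."""
    # Keep only the first line, in case generation continues past the label.
    stripped = generated.strip()
    first_line = stripped.splitlines()[0].strip() if stripped else ""
    for topic in known_topics:
        # Case-insensitive exact match against the caller-supplied labels.
        if first_line.casefold() == topic.casefold():
            return topic
    return None


# Usage with placeholder values (not real dataset labels).
prompt = build_prompt("<question text>", ["<A>", "<B>", "<C>", "<D>"])
label = match_topic(" Placeholder Topic\nextra text", ["placeholder topic"])
```

Returning `None` for an unrecognized completion lets the caller decide how to handle it, for example by logging the raw text, instead of silently accepting a label the model was never trained to emit.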
What this model is not
This is a sample model for demonstration. It is not a production-grade medical AI system:
- It has not been formally evaluated against board-level benchmarks.
- It should not be used to make clinical decisions or provide medical advice.
- It is narrow: it only understands Genetics MCQs and assumes the upstream subject router did its job.
- It was trained on a curated educational dataset and inherits any biases or gaps in that data.
The MedMCQ project exists to explore small-model pipeline architectures for medical reasoning, not to ship a medical product.
Training data
Trained on the Genetics subset of the Stravoris Medical MCQ dataset — educational Genetics MCQs labeled by topic.
Base model
Fine-tuned from Qwen/Qwen3-0.6B. The base model's license and usage terms also apply.
License
Apache 2.0. See LICENSE.
Collection
Part of the MedMCQ Medical Models collection — all 31 models that make up the MedMCQ three-hop pipeline.