Instructions for using EpistemeAI/OpenMedResearch-Gemma-4E4N with Transformers, vLLM, SGLang, Unsloth Studio, and Docker Model Runner.

Libraries

How to use EpistemeAI/OpenMedResearch-Gemma-4E4N with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="EpistemeAI/OpenMedResearch-Gemma-4E4N")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForMultimodalLM

processor = AutoProcessor.from_pretrained("EpistemeAI/OpenMedResearch-Gemma-4E4N")
model = AutoModelForMultimodalLM.from_pretrained("EpistemeAI/OpenMedResearch-Gemma-4E4N")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use EpistemeAI/OpenMedResearch-Gemma-4E4N with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "EpistemeAI/OpenMedResearch-Gemma-4E4N"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "EpistemeAI/OpenMedResearch-Gemma-4E4N",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
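
The same request can also be sent from Python. The following is a minimal sketch using only the standard library; `build_chat_payload` and `post_chat` are illustrative helper names (not part of vLLM), the `max_tokens` default is an arbitrary choice, and the code assumes the server started above is listening on `localhost:8000`.

```python
import json
import urllib.request

MODEL_ID = "EpistemeAI/OpenMedResearch-Gemma-4E4N"


def build_chat_payload(prompt, image_url=None, model=MODEL_ID, max_tokens=128):
    """Build an OpenAI-style chat completion body (hypothetical helper)."""
    content = [{"type": "text", "text": prompt}]
    if image_url is not None:
        # Optional image part, mirroring the curl request above.
        content.append({"type": "image_url", "image_url": {"url": image_url}})
    return {
        "model": model,
        "messages": [{"role": "user", "content": content}],
        "max_tokens": max_tokens,
    }


def post_chat(payload, base_url="http://localhost:8000"):
    """POST the payload to /v1/chat/completions and return the reply text."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the server running, `post_chat(build_chat_payload("Describe this image in one sentence.", image_url=...))` should return the generated text. Passing `base_url="http://localhost:30000"` would target an SGLang server started as described in the next section.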

Use Docker

docker model run hf.co/EpistemeAI/OpenMedResearch-Gemma-4E4N

SGLang

How to use EpistemeAI/OpenMedResearch-Gemma-4E4N with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "EpistemeAI/OpenMedResearch-Gemma-4E4N" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "EpistemeAI/OpenMedResearch-Gemma-4E4N",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "EpistemeAI/OpenMedResearch-Gemma-4E4N" \
        --host 0.0.0.0 \
        --port 30000
# Call the server with the same curl request shown in the pip instructions above (port 30000).

Unsloth Studio

How to use EpistemeAI/OpenMedResearch-Gemma-4E4N with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for EpistemeAI/OpenMedResearch-Gemma-4E4N to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for EpistemeAI/OpenMedResearch-Gemma-4E4N to start chatting

Using Hugging Face Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for EpistemeAI/OpenMedResearch-Gemma-4E4N to start chatting

Load model with FastModel

# Install Unsloth first (run in a shell): pip install unsloth
from unsloth import FastModel
model, tokenizer = FastModel.from_pretrained(
    model_name="EpistemeAI/OpenMedResearch-Gemma-4E4N",
    max_seq_length=2048,
)

Docker Model Runner
How to use EpistemeAI/OpenMedResearch-Gemma-4E4N with Docker Model Runner:
```
docker model run hf.co/EpistemeAI/OpenMedResearch-Gemma-4E4N
```

OpenMedResearch-Gemma-4E4N

Model Summary

EpistemeAI/OpenMedResearch-Gemma-4E4N is an open biomedical research model fine-tuned from google/gemma-4-E4B using the jmhb/PaperSearchQA dataset.

The model is designed for biomedical question answering, scientific literature reasoning, PubMed-style paper search, research assistant workflows, and retrieval-augmented medical research experiments. It is intended to help answer factual biomedical questions by reasoning over scientific literature rather than providing direct clinical advice.

This model is for research and development use only. It is not intended to directly provide clinical diagnosis, patient management decisions, treatment recommendations, medication dosing, or emergency medical guidance.

Safety Notice: This model is for benign medical and scientific reasoning only. It must not be used for biological or chemical weapon development, pathogen enhancement, toxin production, hazardous synthesis, or any activity that enables harm. All biomedical, biological, chemical, or laboratory-related outputs require expert review and must comply with applicable legal, ethical, biosafety, biosecurity, and chemical safety standards.

Model Type

This model is based on Gemma 4 E4B, a multimodal Transformer model from the Gemma 4 family.

Key properties of the base model:

Base model: google/gemma-4-E4B
Architecture: Gemma4ForConditionalGeneration
Top-level model_type: gemma4
Text submodule model_type: gemma4_text
Vision submodule model_type: gemma4_vision
Audio submodule model_type: gemma4_audio
Task family: multimodal conditional generation
Supported input modalities: text, image, and audio
Output modality: text
Context length: up to 128K tokens
Vocabulary size: 262,144 tokens

Intended Use

This model may be useful for:

Biomedical research question answering
PubMed-style scientific paper search
Retrieval-augmented biomedical QA
Scientific literature exploration
Evidence-grounded research assistant workflows
Medical and biological factoid QA
Research summarization and hypothesis exploration
Biomedical education support
Scientific search-agent experimentation

Out-of-Scope Use

This model should not be used for:

Direct clinical diagnosis
Direct treatment planning
Medication dosage recommendations
Emergency medical decision-making
Autonomous clinical triage
Replacing licensed medical professionals
Making final decisions from medical images, audio, or patient data
High-stakes patient management without expert review

All outputs should be treated as preliminary research assistance, require independent verification, and should be reviewed by qualified professionals before any real-world medical or clinical application.

Training Dataset

This model was fine-tuned using:

Dataset: jmhb/PaperSearchQA
Dataset type: biomedical scientific question-answering dataset
Language: English
Dataset license: MIT
Domain: biomedical literature, medicine, biology, and PubMed abstracts
Format: question-answer pairs with source attribution
Task category: question answering
Approximate size: 60,000 QA examples

PaperSearchQA is a biomedical QA dataset designed for training and evaluating search agents that reason over scientific literature. It contains question-answer pairs generated from PubMed abstracts and is intended for retrieval-augmented biomedical question answering.

The dataset includes:

Training split: 54,907 examples
Test split: 5,000 examples
Total examples: 59,907 examples
Retrieval corpus: approximately 16 million PubMed abstracts
Source attribution through PubMed IDs
Multiple acceptable answer variants for exact-match evaluation
Biomedical category labels across 10 biomedical domains
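
Because each question can have several acceptable answer variants, exact-match scoring is usually done after normalizing both sides. The sketch below is one plausible way to do that; the normalization rules (lowercasing, stripping punctuation, collapsing whitespace) are an assumption and may differ from the dataset's official scoring.

```python
import string


def normalize_answer(text):
    """Lowercase, strip ASCII punctuation, and collapse whitespace (assumed rules)."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())


def exact_match(prediction, accepted_answers):
    """True if the prediction equals any accepted variant after normalization."""
    normalized = normalize_answer(prediction)
    return any(normalized == normalize_answer(answer) for answer in accepted_answers)


def exact_match_accuracy(predictions, references):
    """Fraction of predictions matching at least one accepted variant."""
    if not predictions:
        return 0.0
    hits = sum(exact_match(p, refs) for p, refs in zip(predictions, references))
    return hits / len(predictions)
```

Here `references` is a list of lists, one list of accepted variants per question. A stricter or looser normalization would change the reported score, so the rules should be stated alongside any result.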

Training Procedure

Training may include one or more of the following stages:

Supervised Fine-Tuning

The model is fine-tuned on biomedical question-answer examples from jmhb/PaperSearchQA.

Scientific QA Optimization

The model is trained to improve factual biomedical answer generation, research-question understanding, and scientific literature reasoning.

Retrieval-Augmented Reasoning

The model is intended to support workflows where retrieved PubMed abstracts or scientific passages are provided as context before answer generation.

Search-Agent or RLVR Training

PaperSearchQA is designed for search-and-reasoning tasks over scientific papers. Additional training may include reinforcement learning with verifiable rewards, search-agent rollouts, or exact-match reward objectives.

Safety and Research Alignment

Optional preference tuning may be used to reduce hallucinated citations, overconfident medical claims, unsupported biological claims, and unsafe clinical advice.

Evaluation and Checkpoint Selection

Candidate checkpoints should be evaluated on biomedical QA benchmarks, retrieval-augmented QA tasks, hallucination tests, source-grounding tests, and medical safety regression tests before release.
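
To make the retrieval-augmented stage above concrete, the following toy sketch ranks candidate passages by word overlap with the question. It is only an illustration of "retrieve, then answer": the scoring function and the names `tokenize` and `rank_passages` are assumptions, not the retrieval method used for this model, and a real system over millions of abstracts would more likely use BM25 or a dense retriever.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_passages(question, passages, top_k=3):
    """Return up to top_k passages sharing at least one token with the question."""
    question_tokens = Counter(tokenize(question))
    scored = []
    for index, passage in enumerate(passages):
        passage_tokens = Counter(tokenize(passage))
        # Multiset intersection counts the shared tokens.
        overlap = sum((question_tokens & passage_tokens).values())
        scored.append((overlap, index, passage))
    # Highest overlap first; ties keep the original passage order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [passage for overlap, _, passage in scored[:top_k] if overlap > 0]
```

The selected passages would then be placed in the prompt as context before generation.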

Safety Alignment

The model should be aligned to prefer responses that:

Distinguish research information from clinical advice
Cite or reference provided evidence when available
Express uncertainty when evidence is incomplete
Avoid unsupported medical claims
Avoid presenting outputs as definitive diagnoses
Recommend professional medical consultation for serious symptoms
Avoid prescription, medication dosage, or treatment instructions
Refuse unsafe medical, biological, or harmful instructions
Provide safe educational alternatives when refusing unsafe requests

Recommended Retrieval-Augmented Prompt Format

You are a biomedical research assistant. Use the provided scientific context to answer the question.

Rules:
- Answer using only the provided context when possible.
- If the context is insufficient, say that the evidence is insufficient.
- Do not invent citations, PMIDs, paper titles, or experimental results.
- Do not provide clinical diagnosis, medication dosage, or treatment instructions.
- Keep the answer concise and evidence-grounded.

Question:
{question}

Retrieved scientific context:
{retrieved_pubmed_abstracts_or_passages}

Answer:
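
The template above can be filled programmatically. This is a minimal sketch: `build_rag_prompt` is a hypothetical helper, and the `[PMID ...]` prefix on each passage is an assumed convention for keeping sources traceable, not a format the model requires.

```python
PROMPT_TEMPLATE = """You are a biomedical research assistant. Use the provided scientific context to answer the question.

Rules:
- Answer using only the provided context when possible.
- If the context is insufficient, say that the evidence is insufficient.
- Do not invent citations, PMIDs, paper titles, or experimental results.
- Do not provide clinical diagnosis, medication dosage, or treatment instructions.
- Keep the answer concise and evidence-grounded.

Question:
{question}

Retrieved scientific context:
{retrieved_pubmed_abstracts_or_passages}

Answer:"""


def build_rag_prompt(question, passages):
    """Fill the template; passages is a list of (pmid, text) pairs."""
    if passages:
        context = "\n\n".join(
            f"[PMID {pmid}] {text.strip()}" for pmid, text in passages
        )
    else:
        # Make the absence of evidence explicit rather than leaving a blank.
        context = "(no passages retrieved)"
    return PROMPT_TEMPLATE.format(
        question=question.strip(),
        retrieved_pubmed_abstracts_or_passages=context,
    )
```

The returned string can be used as the text of a user message in the chat format shown in the usage examples.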

Installation

pip install -U transformers accelerate torch

Example Usage

from transformers import AutoProcessor, AutoModelForMultimodalLM
import torch

model_id = "EpistemeAI/OpenMedResearch-Gemma-4E4N"

processor = AutoProcessor.from_pretrained(model_id)

model = AutoModelForMultimodalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto"
)

messages = [
    {
        "role": "system",
        "content": [
            {
                "type": "text",
                "text": (
                    "You are a biomedical research assistant. "
                    "Answer research questions using evidence-grounded reasoning. "
                    "Do not provide clinical diagnosis, prescription, dosage, or treatment plans."
                )
            }
        ]
    },
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": (
                    "What protein is commonly associated with Duchenne muscular dystrophy? "
                    "Answer as a biomedical factoid QA question."
                )
            }
        ]
    }
]

inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt"
).to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        temperature=0.2,
        top_p=0.9,
        do_sample=True
    )

print(processor.decode(outputs[0], skip_special_tokens=True))

Text-Only Research QA Example

from transformers import AutoProcessor, AutoModelForMultimodalLM
import torch

model_id = "EpistemeAI/OpenMedResearch-Gemma-4E4N"

processor = AutoProcessor.from_pretrained(model_id)

model = AutoModelForMultimodalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto"
)

question = "Which immunoglobulin class is commonly tested in assays detecting antibodies against cytomegalovirus?"

context = """
Retrieved context:
Evaluation of immunoglobulin G preparations for anti-cytomegalovirus antibodies with reference to neutralizing antibody in the presence of complement.
"""

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": f"""
You are a biomedical research QA assistant.
Use the provided context to answer the question.
If the evidence is insufficient, say so.

Question:
{question}

Context:
{context}

Answer:
"""
            }
        ]
    }
]

inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt"
).to(model.device)

with torch.no_grad():
    # Greedy decoding: with do_sample=False, temperature is not used, so it is omitted.
    outputs = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=False
    )

print(processor.decode(outputs[0], skip_special_tokens=True))

Recommended Medical Safety Behavior

For biomedical and medical research questions, the model should:

Provide research-oriented information
Use retrieved evidence when available
Avoid inventing citations or PMIDs
Explain uncertainty and limitations
Avoid definitive clinical diagnosis
Avoid prescription or medication dosage advice
Recommend professional medical care when appropriate
Avoid unsupported claims
Avoid making final clinical decisions from incomplete information

Evaluation

The model should be evaluated on both scientific QA capability and safety.

Suggested evaluation categories:

| Category | Example Evaluation |
| --- | --- |
| Biomedical QA | PaperSearchQA test split |
| Retrieval-augmented QA | PubMed abstract retrieval + answer generation |
| Exact-match QA | Golden answer / synonym match |
| Source grounding | Whether answers are supported by retrieved abstracts |
| Hallucination | Citation, PMID, and factual consistency checks |
| Medical safety | Unsafe diagnosis, treatment, and dosage prompts |
| Calibration | Uncertainty when evidence is insufficient |
| Research usefulness | Clarity, concision, and evidence-grounded response quality |
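
As one example of the hallucination checks listed above, the sketch below flags PMIDs that appear in an answer but were not among the retrieved sources. The helper names and the regular expression are assumptions; the pattern only recognizes citations written like `PMID 123` or `PMID: 123`, so other citation styles would need their own patterns.

```python
import re

# Matches "PMID 123", "PMID: 123", or "PMID123" (case-insensitive).
PMID_PATTERN = re.compile(r"PMID[:\s]*(\d{1,8})", re.IGNORECASE)


def extract_pmids(text):
    """Return the set of PMID numbers (as strings) cited in the text."""
    return set(PMID_PATTERN.findall(text))


def unsupported_pmids(answer, retrieved_pmids):
    """PMIDs cited in the answer that are not among the retrieved sources."""
    return extract_pmids(answer) - {str(pmid) for pmid in retrieved_pmids}
```

A non-empty result suggests a possibly fabricated citation. It does not show that the cited claim is wrong, only that the source was not part of the provided context.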

Limitations

This model may:

Produce incorrect biomedical information
Generate plausible but unsupported claims
Invent citations, PMIDs, or paper details if not constrained
Overstate confidence when evidence is incomplete
Fail to retrieve or use the most relevant scientific context
Miss recent findings not present in training or retrieval data
Reflect limitations or biases from the base model and training data
Misinterpret medical images, audio, or multimodal inputs
Provide incomplete or outdated scientific summaries

The model is not a substitute for professional medical judgment, systematic literature review, or expert scientific review.

Medical and Research Disclaimer

The outputs generated by this model are not intended to directly inform clinical diagnosis, patient management decisions, treatment recommendations, or any other direct clinical practice application.

The model is intended for biomedical research assistance and scientific question answering. Generated outputs may be incomplete, outdated, or inaccurate. All outputs should be independently verified against reliable scientific sources and reviewed by qualified experts before use in research, medical, clinical, or regulatory settings.

If you are experiencing a medical emergency, contact emergency services or a qualified healthcare professional immediately.

Ethical Considerations

Biomedical AI systems require careful evaluation, human oversight, transparent limitations, and responsible deployment. This model should not be used in workflows where incorrect outputs could directly harm patients, mislead researchers, or support unsafe biological activity.

Developers should evaluate the model for:

Biomedical hallucination
Unsupported scientific claims
Citation and PMID fabrication
Overconfident medical statements
Unsafe treatment advice
Privacy leakage
Bias across patient populations and research domains
Unsafe biological or clinical instructions
Failure to recommend urgent care when appropriate
Multimodal misinterpretation risk

Dataset Citation

@misc{burgess2026papersearchqalearningsearchreason,
  title={PaperSearchQA: Learning to Search and Reason over Scientific Papers with RLVR},
  author={James Burgess and Jan N. Hansen and Duo Peng and Yuhui Zhang and Alejandro Lozano and Min Woo Sun and Emma Lundberg and Serena Yeung-Levy},
  year={2026},
  eprint={2601.18207},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2601.18207}
}

Base Model Citation

@misc{gemma4e4b,
  title={Gemma 4 E4B},
  author={Google DeepMind},
  year={2026},
  publisher={Hugging Face},
  note={Base model: google/gemma-4-E4B}
}

Model Citation

@misc{openmedresearchgemma4e4n,
  title={OpenMedResearch-Gemma-4E4N},
  author={EpistemeAI},
  year={2026},
  publisher={Hugging Face},
  note={Fine-tuned from google/gemma-4-E4B using jmhb/PaperSearchQA}
}

License

This model is released under the Apache-2.0 license unless otherwise specified.

The training dataset jmhb/PaperSearchQA is released under the MIT license. Users are responsible for ensuring that their use complies with the base model license, dataset license, and applicable laws or regulations.