gemma4-12b-bioinfo

gemma4-12b-bioinfo is a fine-tuned Gemma 4 12B instruction model for bioinformatics, genomics, and computational biology question answering.

The LoRA adapter was merged into the base model, so this repository is intended for direct use with Hugging Face transformers.

For GGUF files optimized for local inference, use yashm/gemma4-12b-bioinfo-GGUF.

Intended Use

This model is intended for research, education, and computational biology assistance. It is not a medical device and should not be used for clinical diagnosis, treatment decisions, or professional medical advice.

Model Details

  • Base model: google/gemma-4-12B-it
  • Fine-tuning method: QLoRA / SFT, merged into full model
  • Domain: bioinformatics, genomics, transcriptomics, proteomics, sequence analysis, biological databases, and common bioinformatics tools
  • Primary format: Hugging Face transformers
  • GGUF format: available in yashm/gemma4-12b-bioinfo-GGUF
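To make "merged into full model" concrete: in standard LoRA, each adapted weight matrix W receives a low-rank update B·A scaled by alpha / rank, and merging folds that update into W so no separate adapter is needed at inference time. The sketch below illustrates only the arithmetic, in pure Python on toy matrices. The function names `matmul` and `merge_lora`, the rank, and the `alpha` value are illustrative assumptions, not the settings used to train this model.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_b, lora_a, alpha, rank):
    """Return W + (alpha / rank) * (B @ A), the standard LoRA merge."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy 2x2 weight with a rank-1 adapter (all values are made up).
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]  # shape (2, rank)
A = [[0.5, 0.5]]    # shape (rank, 2)

merged = merge_lora(W, B, A, alpha=2, rank=1)
print(merged)
```

After merging, the result is a single dense matrix of the same shape as W, which is why the merged checkpoint loads like any ordinary transformers model.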

Quick Start: Transformers

This example loads Gemma 4 with AutoModelForImageTextToText rather than AutoModelForCausalLM.

import torch
from transformers import AutoTokenizer, AutoModelForImageTextToText

repo_id = "yashm/gemma4-12b-bioinfo"

tokenizer = AutoTokenizer.from_pretrained(
    repo_id,
    trust_remote_code=True,
)

model = AutoModelForImageTextToText.from_pretrained(
    repo_id,
    device_map="auto",
    dtype=torch.bfloat16,
    attn_implementation="eager",
    trust_remote_code=True,
)

system_prompt = (
    "You are an expert bioinformatics assistant with deep knowledge of "
    "genomics, proteomics, transcriptomics, sequence analysis, biological "
    "databases, and bioinformatics tools. Provide accurate, concise, and "
    "scientifically rigorous answers."
)

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "Explain the difference between local and global sequence alignment."},
]

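# Render the chat messages into one prompt string using the model's chat template.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

# Tokenize the prompt and move the tensors to the model's device.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)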

with torch.inference_mode():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=512,
        do_sample=True,
        temperature=0.2,
        top_p=0.9,
        repetition_penalty=1.1,
        pad_token_id=tokenizer.pad_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )

# Drop the echoed prompt tokens so only the newly generated answer is decoded.
new_tokens = output_ids[0, inputs["input_ids"].shape[-1]:]
answer = tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
print(answer)
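If you plan to ask several questions, it may help to wrap the message construction from the example above in a small helper. The following is a minimal sketch in plain Python; `build_messages` and `DEFAULT_SYSTEM_PROMPT` are hypothetical names, and the shortened system prompt is an assumption made for brevity (the full prompt from the Quick Start can be passed instead).

```python
# Shortened stand-in for the Quick Start system prompt (an assumption for brevity).
DEFAULT_SYSTEM_PROMPT = (
    "You are an expert bioinformatics assistant. Provide accurate, concise, "
    "and scientifically rigorous answers."
)


def build_messages(question, system_prompt=DEFAULT_SYSTEM_PROMPT):
    """Return a system + user message list in the format used by the Quick Start."""
    question = question.strip()
    if not question:
        raise ValueError("question must not be empty")
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question},
    ]


questions = [
    "Explain the difference between local and global sequence alignment.",
    "What is the difference between FASTA and FASTQ files?",
]

# One message list per question; each can be passed to tokenizer.apply_chat_template.
batches = [build_messages(q) for q in questions]
print(batches[1][1]["content"])
```

Each element of `batches` has the same shape as `messages` in the Quick Start, so the rest of that pipeline can be reused unchanged.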

Recommended Generation Settings

  • temperature=0.2 for factual bioinformatics answers
  • top_p=0.9
  • repetition_penalty=1.1
  • max_new_tokens=512
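One way to keep these settings consistent across calls is to store them once and override individual values when needed. This is a sketch in plain Python, not part of the transformers API: `RECOMMENDED_GENERATION_KWARGS` and `generation_kwargs` are hypothetical names, `do_sample=True` is taken from the Quick Start code, and the 1024-token override is only an illustration.

```python
# Recommended settings from this model card, collected in one place.
RECOMMENDED_GENERATION_KWARGS = {
    "max_new_tokens": 512,
    "do_sample": True,
    "temperature": 0.2,
    "top_p": 0.9,
    "repetition_penalty": 1.1,
}


def generation_kwargs(**overrides):
    """Return a copy of the recommended settings with optional overrides applied."""
    unknown = set(overrides) - set(RECOMMENDED_GENERATION_KWARGS)
    if unknown:
        raise ValueError(f"unexpected generation settings: {sorted(unknown)}")
    return {**RECOMMENDED_GENERATION_KWARGS, **overrides}


# Example: allow a longer answer while keeping the other settings unchanged.
long_answer_kwargs = generation_kwargs(max_new_tokens=1024)
print(long_answer_kwargs)
```

The returned dictionary can be unpacked into the generate call from the Quick Start, for example `model.generate(**inputs, **generation_kwargs())`, alongside the `pad_token_id` and `eos_token_id` arguments shown there.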

Limitations

The model may produce incorrect or incomplete biological interpretations. Always verify outputs against trusted scientific literature, databases, and domain experts.

Citation

@misc{gemma4-12b-bioinfo_2026,
  author       = {yashm},
  title        = {gemma4-12b-bioinfo: Fine-Tuned Gemma 4 12B for Bioinformatics},
  year         = {2026},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/yashm/gemma4-12b-bioinfo}}
}