Libraries

How to use RehanaHasin/mistral-7b-instruct-v0.3-adjuvant-extractor with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="RehanaHasin/mistral-7b-instruct-v0.3-adjuvant-extractor")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("RehanaHasin/mistral-7b-instruct-v0.3-adjuvant-extractor")
model = AutoModelForCausalLM.from_pretrained("RehanaHasin/mistral-7b-instruct-v0.3-adjuvant-extractor")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use RehanaHasin/mistral-7b-instruct-v0.3-adjuvant-extractor with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "RehanaHasin/mistral-7b-instruct-v0.3-adjuvant-extractor"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "RehanaHasin/mistral-7b-instruct-v0.3-adjuvant-extractor",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
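
The same OpenAI-compatible endpoint can also be called from Python. The sketch below uses only the standard library and assumes the server started above is listening on localhost:8000. The helper names `build_request` and `chat` are hypothetical, and `temperature=0` with `max_tokens=200` are assumptions chosen to mirror the greedy decoding and 200-token limit used in the inference code of this card.

```python
import json
import urllib.request

MODEL_ID = "RehanaHasin/mistral-7b-instruct-v0.3-adjuvant-extractor"


def build_request(messages, base_url="http://localhost:8000"):
    """Build a POST request for the /v1/chat/completions endpoint (hypothetical helper)."""
    payload = {
        "model": MODEL_ID,
        "messages": messages,  # list of {"role": ..., "content": ...} dicts
        "temperature": 0,      # assumption: deterministic decoding
        "max_tokens": 200,     # assumption: same budget as max_new_tokens=200
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(messages, base_url="http://localhost:8000"):
    """Send the request and return the assistant text from the first choice."""
    with urllib.request.urlopen(build_request(messages, base_url)) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Usage (requires a running server):
# print(chat([{"role": "user", "content": "Title: ...\nAbstract: ..."}]))
```

Keeping request construction separate from the network call makes the payload easy to inspect before anything is sent.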


SGLang

How to use RehanaHasin/mistral-7b-instruct-v0.3-adjuvant-extractor with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "RehanaHasin/mistral-7b-instruct-v0.3-adjuvant-extractor" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "RehanaHasin/mistral-7b-instruct-v0.3-adjuvant-extractor",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "RehanaHasin/mistral-7b-instruct-v0.3-adjuvant-extractor" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "RehanaHasin/mistral-7b-instruct-v0.3-adjuvant-extractor",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner

How to use RehanaHasin/mistral-7b-instruct-v0.3-adjuvant-extractor with Docker Model Runner:
```
docker model run hf.co/RehanaHasin/mistral-7b-instruct-v0.3-adjuvant-extractor
```

mistral-7b-instruct-v0.3-adjuvant-extractor

This model is a task-specific fine-tuned version of Mistral 7B Instruct v0.3 for extracting vaccine adjuvant concepts and associated evidence snippets from infectious disease biomedical abstracts.

Model Summary

Model name: mistral-7b-instruct-v0.3-adjuvant-extractor
Base model: mistralai/Mistral-7B-Instruct-v0.3
Fine-tuning method: LoRA adapter training, merged into full model weights
Primary task: Evidence-linked adjuvant extraction from title+abstract text

Prompt Used for Inference

System prompt

You are a biomedical information extraction assistant.

User instruction template

Extract infectious-disease adjuvants from the text and provide evidence snippets.
Return ONLY valid JSON in this format:
[{"adjuvant": "<string>", "evidence": "<string>"}, ...]
Do not include any extra keys or explanation.

Input format

Title: <paper title>
Abstract: <paper abstract>

The user message consists of the instruction template, a blank line, and then the title/abstract text. The system prompt is sent as a separate system message.
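
A minimal sketch of how the pieces above can be assembled into chat messages. The prompt strings are copied from this card; the helper name `build_messages` and the whitespace stripping of the title and abstract are assumptions added for illustration.

```python
SYS_PROMPT = "You are a biomedical information extraction assistant."
PROMPT_INSTRUCTION = (
    "Extract infectious-disease adjuvants from the text and provide evidence snippets.\n"
    "Return ONLY valid JSON in this format:\n"
    '[{"adjuvant": "<string>", "evidence": "<string>"}, ...]\n'
    "Do not include any extra keys or explanation."
)


def build_messages(title: str, abstract: str) -> list[dict[str, str]]:
    """Hypothetical helper: return the system and user messages for one paper."""
    # Instruction, blank line, then the Title/Abstract block.
    user_input = (
        f"{PROMPT_INSTRUCTION}\n\n"
        f"Title: {title.strip()}\n"
        f"Abstract: {abstract.strip()}"
    )
    return [
        {"role": "system", "content": SYS_PROMPT},
        {"role": "user", "content": user_input},
    ]
```

The returned list can be passed to `tokenizer.apply_chat_template` or to an OpenAI-compatible chat endpoint.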

Target Output Format

The model is prompted to return a JSON array of objects with exactly two keys:

[
  {
    "adjuvant": "<string>",
    "evidence": "<string>"
  }
]

Expected behavior:

- Return a JSON array (can be empty: []).
- Each object must contain:
  - adjuvant: normalized or near-normalized adjuvant name
  - evidence: supporting text snippet from the same input abstract
- No extra keys and no explanatory text outside JSON.

Input/Output Example

Example Input

Title: Intranasal vaccination study using alum and MPLA adjuvants in a murine influenza model.
Abstract: Mice immunized with antigen formulated with alum showed increased IgG titers. A separate group receiving MPLA-adjuvanted vaccine demonstrated stronger IFN-gamma responses and reduced viral load after challenge.

Expected Output

[
  {
    "adjuvant": "alum",
    "evidence": "Mice immunized with antigen formulated with alum showed increased IgG titers."
  },
  {
    "adjuvant": "MPLA",
    "evidence": "A separate group receiving MPLA-adjuvanted vaccine demonstrated stronger IFN-gamma responses and reduced viral load after challenge."
  }
]

Notes on Output Validity

- Output must be valid JSON.
- Output must be a JSON array (use [] if no supported adjuvant is found).
- Each item should include only adjuvant and evidence.
- Evidence text should come from the provided input abstract.
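
These rules can be checked mechanically once the output has been parsed. The sketch below is one possible checker, not part of the released code: the function name `validate_prediction` is hypothetical, and treating "evidence should come from the input" as a case-insensitive, whitespace-normalized substring match is an assumption that may be stricter than intended.

```python
def _normalize(text: str) -> str:
    """Collapse whitespace and ignore case before comparing snippets."""
    return " ".join(text.split()).casefold()


def validate_prediction(prediction, source_text: str) -> list[str]:
    """Return a list of problems; an empty list means every check passed."""
    if not isinstance(prediction, list):
        return ["top-level value is not a JSON array"]
    problems = []
    for i, item in enumerate(prediction):
        if not isinstance(item, dict):
            problems.append(f"item {i}: not a JSON object")
        elif set(item) != {"adjuvant", "evidence"}:
            problems.append(f"item {i}: unexpected keys {sorted(item)}")
        elif not all(isinstance(v, str) and v.strip() for v in item.values()):
            problems.append(f"item {i}: values must be non-empty strings")
        elif _normalize(item["evidence"]) not in _normalize(source_text):
            # Assumption: evidence must appear verbatim in the input text.
            problems.append(f"item {i}: evidence not found verbatim in input")
    return problems
```

An empty array passes, which matches the rule that [] is the correct output when no adjuvant is found.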

Working Inference Code (Validated)

import torch
import json
from transformers import AutoTokenizer, AutoModelForCausalLM

repo_id = "RehanaHasin/mistral-7b-instruct-v0.3-adjuvant-extractor"

SYS_PROMPT = "You are a biomedical information extraction assistant."
PROMPT_INSTRUCTION = (
    "Extract infectious-disease adjuvants from the text and provide evidence snippets.\n"
    "Return ONLY valid JSON in this format:\n"
    "[{\"adjuvant\": \"<string>\", \"evidence\": \"<string>\"}, ...]\n"
    "Do not include any extra keys or explanation."
)

title = "Protective immune response against Streptococcus pyogenes in mice after intranasal vaccination with the fibronectin-binding protein SfbI."
abstract = (
    "Despite the significant impact on human health caused by Streptococcus pyogenes, "
    "there is currently no vaccine available. Intranasal immunization of mice with either "
    "SfbI alone or coupled to cholera toxin B subunit (CTB) triggered efficient SfbI-specific responses."
)

user_input = f"{PROMPT_INSTRUCTION}\n\nTitle: {title}\nAbstract: {abstract}"

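# Load the tokenizer and the merged float16 weights.
# `dtype` is the argument name in recent transformers releases;
# older releases expect `torch_dtype` instead.
tokenizer = AutoTokenizer.from_pretrained(repo_id, use_fast=True)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    dtype=torch.float16,
    device_map="auto",
)
model.eval()

# Render the system and user turns with the tokenizer's chat template.
chat = tokenizer.apply_chat_template(
    [
        {"role": "system", "content": SYS_PROMPT},
        {"role": "user", "content": user_input},
    ],
    tokenize=False,
    add_generation_prompt=True,
)

# Tokenize (prompts longer than 1024 tokens are truncated) and move the
# tensors to the device that holds the input embeddings.
inputs = tokenizer(chat, return_tensors="pt", truncation=True, max_length=1024)
inputs = {k: v.to(model.get_input_embeddings().weight.device) for k, v in inputs.items()}

# Greedy decoding, up to 200 new tokens.
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=200,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id,
    )

# Decode only the newly generated tokens, skipping the prompt.
prediction = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:],
    skip_special_tokens=True
).strip()

# Parse and pretty-print; json.loads raises json.JSONDecodeError on invalid JSON.
print(json.dumps(json.loads(prediction), indent=2, ensure_ascii=False))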

Intended Use

This model is intended for research workflows in biomedical literature mining, especially:

- infectious disease vaccine literature curation
- vaccine adjuvant concept extraction
- evidence-linked information extraction for downstream manual review

This model is not intended for clinical decision-making.

Training Data and Split Context

The model was trained on a curated infectious disease adjuvant corpus derived from VIOLIN ecosystem resources.

Corpus size used in workflow: 298 abstracts
Fixed split framework used across models:
- 256 train
- 13 validation
- 29 test
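
As a quick arithmetic check, the three split sizes add up to the stated corpus size, and their shares can be computed directly. The variable names below are illustrative only.

```python
split = {"train": 256, "validation": 13, "test": 29}

total = sum(split.values())
# Share of each split as a percentage of the whole corpus, to one decimal.
shares = {name: round(100 * count / total, 1) for name, count in split.items()}

print(total)   # 298
print(shares)  # {'train': 85.9, 'validation': 4.4, 'test': 9.7}
```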

Training Configuration (Fixed Manuscript Setting)

LoRA rank (r): 8
Learning rate: 2e-4
Epochs: 5
Quantization during fine-tuning: 4-bit NF4 with double quantization
Compute dtype: float16
Per-device batch size and gradient accumulation were configured for stable updates across model families.
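
For orientation, the settings listed above could be expressed with the `transformers` and `peft` configuration classes roughly as follows. This is a sketch under stated assumptions, not the training script used for this model: `lora_alpha`, dropout, and target modules are not given in this card and are left at the `peft` defaults, and batch size and gradient accumulation are omitted because their values are not stated.

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig

# 4-bit NF4 quantization with double quantization and float16 compute.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA rank 8. Assumption: every other LoRA option stays at the peft default.
lora_config = LoraConfig(r=8, task_type="CAUSAL_LM")

# Optimizer schedule values from the list above.
LEARNING_RATE = 2e-4
NUM_EPOCHS = 5
```

The repository linked under Reproducibility Resources is the authoritative source for the exact training arguments.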

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

repo_id = "RehanaHasin/mistral-7b-instruct-v0.3-adjuvant-extractor"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.float16,  # recent transformers releases name this argument `dtype`
    device_map="auto",  # requires the accelerate package
)

Prompting Recommendation

Use prompts that explicitly request structured JSON output containing only the adjuvant and evidence keys, and that forbid extra commentary, to reduce parsing errors.

Limitations

- Evaluated on a focused infectious-disease adjuvant corpus; broader-domain generalization is not guaranteed.
- Performance depends on abstract quality and terminology variation.
- Structured output may still require post-processing and manual validation.
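
One form of post-processing is recovering the JSON array when the model wraps it in extra text. The sketch below is a hypothetical helper (`parse_model_output` is not part of the released code): it first tries to parse the whole string, then falls back to the span between the first "[" and the last "]", and returns None when nothing usable is found so the record can be routed to manual review.

```python
import json


def parse_model_output(raw: str):
    """Return the parsed JSON array, or None if no array can be recovered."""
    text = raw.strip()
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        # Fallback: keep only the outermost [...] span, if there is one.
        start, end = text.find("["), text.rfind("]")
        if start == -1 or end <= start:
            return None
        try:
            parsed = json.loads(text[start:end + 1])
        except json.JSONDecodeError:
            return None
    # The card specifies an array; anything else is treated as a failure.
    return parsed if isinstance(parsed, list) else None
```

Returning None rather than an empty list keeps "the model found no adjuvant" distinguishable from "the output could not be parsed".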

Ethical and Safety Notes

- Outputs can contain extraction errors or unsupported predictions.
- Human review is required before downstream knowledge integration.
- Not for diagnosis, treatment, or direct patient-care decisions.

Reproducibility Resources

Code, notebooks, and workflow details are available at:
https://github.com/hurlab/Infectious-Disease-Adjuvant-LLM-Fine-tuning

Citation

If you use this model, please cite the associated manuscript and project repository.

Contact

For questions, please contact hasin.rehana@und.edu.

Model size: 7B params
Tensor type: F16 (Safetensors)

Model tree for RehanaHasin/mistral-7b-instruct-v0.3-adjuvant-extractor

Base model: mistralai/Mistral-7B-v0.3
Finetuned: mistralai/Mistral-7B-Instruct-v0.3
Finetuned (494): this model