Pharma TinyLlama Instruction LoRA Adapter
This repository contains a LoRA adapter trained for instruction fine-tuning on pharma-domain instruction-response data.
This adapter was trained on top of the Stage 1 merged model:
- Base model for this stage: ssuvetha/pharma-tinyllama-non-instruction-merged
In other words, this adapter starts from a model that was already domain-adapted on raw pharma text and teaches it to respond in instruction/response format.
Model Type
- Stage: 2
- Training type: Instruction fine-tuning
- Adapter type: LoRA
- Training method: QLoRA-style fine-tuning
- Task: Instruction-following pharma response generation
What this stage adds
Stage 1 taught the model pharma language and domain style.
Stage 2 teaches the model how to respond to prompts like:
- Explain the mechanism of action of Metformin.
- Why do atorvastatin and ezetimibe work well together?
- Summarize the role of lipid nanoparticles in mRNA vaccines.
This stage improves:
- instruction following
- response formatting
- pharma-domain Q&A style outputs
- domain-aware assistant behavior
Training data format
The training data is formatted like:
### Instruction:
Explain the mechanism of action of Metformin.
### Response:
Metformin primarily activates AMPK...
Optional input fields may be formatted as:
### Instruction:
Summarize the following finding.
### Input:
<extra context here>
### Response:
...
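The template above can be produced with a small helper. This is a minimal sketch, not the author's actual preprocessing code: the function name `format_example`, its argument names, and the single-newline separators are assumptions. Only the `### Instruction:`, `### Input:`, and `### Response:` headers come from the format shown above.

```python
from typing import Optional


def format_example(instruction: str, response: str, input_text: Optional[str] = None) -> str:
    """Render one training example in the instruction/response layout.

    Hypothetical helper: the header strings come from the model card, while
    the function name and exact whitespace are assumptions.
    """
    parts = ["### Instruction:", instruction.strip()]
    # The "### Input:" section is optional and only emitted when context exists.
    if input_text is not None and input_text.strip():
        parts += ["### Input:", input_text.strip()]
    parts += ["### Response:", response.strip()]
    return "\n".join(parts)
```

For example, `format_example("Summarize the following finding.", "...", "<extra context here>")` yields the three-section variant, while omitting the third argument yields the two-section one.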
Intended use
This adapter is intended for:
- pharma-domain instruction tuning experiments
- educational chatbot research
- Stage 2 in a multi-stage fine-tuning pipeline
- domain-specific assistant prototyping
Not intended use
This model is not intended for:
- medical diagnosis
- treatment recommendations
- clinical deployment
- emergency or safety-critical use
Training pipeline summary
The high-level Stage 2 pipeline was:
- Load the Stage 1 merged model
- Load pharma instruction dataset
- Format examples into instruction-response text
- Tokenize and pad to fixed length
- Add a fresh LoRA adapter
- Fine-tune on instruction data
- Save and upload adapter
- Merge for Stage 3 preference tuning
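The "tokenize and pad to fixed length" step can be illustrated without any library. The sketch below works on plain lists of token ids. The function name, the right-side padding, and the pad id of 0 are illustrative assumptions, not settings taken from the original notebook.

```python
from typing import Dict, List


def pad_to_fixed_length(
    token_ids: List[int], max_length: int = 512, pad_id: int = 0
) -> Dict[str, List[int]]:
    """Truncate or right-pad a token id list to exactly max_length.

    Illustrative only: pad_id=0 and right-side padding are assumptions.
    """
    kept = token_ids[:max_length]   # truncate sequences that are too long
    n_pad = max_length - len(kept)  # number of padding positions needed
    return {
        "input_ids": kept + [pad_id] * n_pad,
        # 1 marks real tokens, 0 marks padding the model should ignore
        "attention_mask": [1] * len(kept) + [0] * n_pad,
    }
```

With a Hugging Face tokenizer the same effect is usually obtained by calling it with `padding="max_length"`, `truncation=True`, and `max_length=512`.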
Training configuration summary
- Base model: ssuvetha/pharma-tinyllama-non-instruction-merged
- Max length: 512
- LoRA rank (r): 16
- LoRA alpha: 32
- LoRA dropout: 0.05
- Learning rate: 1e-4
- Batch size per device: 1
- Gradient accumulation steps: 8
- Max steps: 5
- Quantization: 4-bit NF4
- Hardware: Google Colab T4 GPU
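A few derived quantities follow directly from the settings above. The short calculation below is only arithmetic on the listed values. The variable names are illustrative, a single GPU is assumed, and the `alpha / r` scaling refers to the standard LoRA formulation rather than anything specific to this project.

```python
# Values copied from the configuration summary above.
per_device_batch_size = 1
gradient_accumulation_steps = 8
max_steps = 5
lora_r = 16
lora_alpha = 32

# Sequences contributing to each optimizer update (single-GPU assumption).
effective_batch_size = per_device_batch_size * gradient_accumulation_steps

# Upper bound on training sequences processed over the whole run.
sequences_seen = effective_batch_size * max_steps

# Standard LoRA scales the low-rank update by alpha / r.
lora_scaling = lora_alpha / lora_r

print(effective_batch_size, sequences_seen, lora_scaling)  # 8 40 2.0
```

With 5 optimizer steps the run processes at most 40 sequences, which is a very short training run.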
How to use
Load this adapter on top of the merged Stage 1 model.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

base_model_name = "ssuvetha/pharma-tinyllama-non-instruction-merged"
adapter_name = "ssuvetha/pharma-tinyllama-instruction-lora-adapter"

# 4-bit NF4 quantization, matching the training configuration
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

tokenizer = AutoTokenizer.from_pretrained(base_model_name)
# Fall back to EOS as the padding token if none is defined
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Load the Stage 1 merged model in 4-bit
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach the Stage 2 LoRA adapter and switch to inference mode
model = PeftModel.from_pretrained(base_model, adapter_name)
model.eval()
Example inference
prompt = """### Instruction:
Explain the primary mechanism of action of metformin.
### Response:
"""
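# Tokenize the prompt and move the tensors to the model's device
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample a response without tracking gradients
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=150,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
        repetition_penalty=1.1,
        pad_token_id=tokenizer.eos_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )

# The decoded text contains the prompt followed by the generated response
print(tokenizer.decode(outputs[0], skip_special_tokens=True))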
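Because `generate` returns the prompt tokens followed by the continuation, the decoded string printed above includes the prompt. A small post-processing helper can isolate the answer. This is a sketch: `extract_response` is a hypothetical name, and cutting at the next `###` header is an assumption about how to handle the model running on into a new section.

```python
def extract_response(decoded: str, marker: str = "### Response:") -> str:
    """Return only the text generated after the first response marker."""
    # Keep everything after the first marker; if absent, keep the whole string.
    _, found, tail = decoded.partition(marker)
    answer = tail if found else decoded
    # Assumption: if the model starts a new "###" section, drop it.
    answer = answer.split("###", 1)[0]
    return answer.strip()
```

In the example above this would be used as `extract_response(tokenizer.decode(outputs[0], skip_special_tokens=True))`.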
Prompt format
Use this adapter with instruction-style prompts:
### Instruction:
<your question>
### Response:
Optional input variant:
### Instruction:
<your task>
### Input:
<extra context>
### Response:
Limitations
- trained on a small instruction dataset
- may hallucinate scientific details
- may produce plausible but incorrect medical content
- not safety aligned for clinical use
- not a substitute for licensed medical expertise
Project pipeline context
This adapter is part of a staged pharma fine-tuning project:
- Stage 1: non-instruction domain adaptation
- Stage 2: instruction fine-tuning
- Stage 3: preference tuning with DPO
This repository contains the Stage 2 adapter only.
Citation
If you use this model, please cite:
- TinyLlama
- PEFT / LoRA / QLoRA
- your project repository or notebook