Pharma TinyLlama Non-Instruction LoRA Adapter

This repository contains a LoRA adapter trained by non-instruction fine-tuning (domain-adaptive continued pretraining) on pharma-domain text.

The adapter was trained on top of:

Base model: TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T

On its own, this is not an instruction-following chat model.
It was trained on raw pharma text so that the base model better captures:

pharmaceutical terminology
drug names
biomedical writing style
scientific sentence patterns
domain-specific vocabulary

Model Type

Stage: 1
Training type: Non-instruction fine-tuning / continued pretraining
Adapter type: LoRA
Training method: QLoRA-style fine-tuning (LoRA adapter trained on a 4-bit quantized base model)
Task: Causal language modeling / next-token prediction
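
For reference, the causal language modeling task listed above corresponds to the standard next-token objective. The notation here (tokens $x_t$, block length $T$, frozen base weights $\theta$, LoRA weights $\phi$) is introduced for illustration and is not taken from the training code:

```latex
\mathcal{L}(\phi) = -\frac{1}{T} \sum_{t=1}^{T} \log p_{\theta,\phi}\left(x_t \mid x_{<t}\right)
```

In QLoRA-style training, $\theta$ stays frozen in quantized form and only $\phi$ receives gradient updates.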

What the model learned

This adapter was trained on raw pharma-domain text extracted from PDF-based source material.

Because the training was done in causal LM format, the model learns to continue domain text such as:

Metformin is one of the most widely prescribed oral antihyperglycemic agents...

This training improves domain familiarity, but the adapter was not explicitly trained to:

answer user questions
follow instructions
chat in assistant format
rank preferred responses

Intended use

This adapter is intended for:

domain adaptation experiments
Stage 1 in a multi-stage fine-tuning pipeline
continued pretraining demonstrations
pharmaceutical language modeling research
a starting point before instruction tuning

Out-of-scope use

This adapter should not be treated as:

a medical advice system
a clinically validated model
a final instruction-tuned assistant
a diagnosis or treatment recommendation engine

Training pipeline summary

The high-level Stage 1 pipeline was:

Extract pharma text from PDF
Clean and normalize the text
Split into paragraph records
Convert to Hugging Face Dataset
Tokenize text
Pack token sequences into fixed-length blocks
Load TinyLlama base model in 4-bit
Add LoRA adapter
Train the adapter with causal LM objective
Save and upload adapter
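
The cleaning, paragraph-splitting, and packing steps above can be sketched in plain Python. This is a minimal illustration, not the project's actual preprocessing code: the function names, the cleaning rules, the `min_chars` threshold, and the choice to drop the trailing partial block are all assumptions. Token IDs are passed in as plain lists so the sketch runs without a tokenizer.

```python
import re
from typing import Iterable, List


def clean_text(raw: str) -> str:
    """Normalize whitespace in PDF-extracted text (assumed cleaning rules)."""
    text = raw.replace("\r\n", "\n").replace("\r", "\n")
    # Collapse runs of spaces and tabs, then trim spaces around newlines.
    text = re.sub(r"[ \t]+", " ", text)
    text = re.sub(r" *\n *", "\n", text)
    # Rejoin words hyphenated across a line break. This is a crude rule:
    # it also joins genuine hyphenated compounds that happen to wrap.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Treat three or more newlines as a single paragraph break.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


def split_paragraphs(text: str, min_chars: int = 20) -> List[str]:
    """Split on blank lines and unwrap line breaks inside each paragraph."""
    paragraphs = []
    for chunk in text.split("\n\n"):
        paragraph = " ".join(line for line in chunk.split("\n") if line)
        if len(paragraph) >= min_chars:  # skip very short fragments
            paragraphs.append(paragraph)
    return paragraphs


def pack_blocks(token_lists: Iterable[List[int]], block_size: int = 512) -> List[List[int]]:
    """Concatenate token sequences and cut them into full fixed-length blocks."""
    flat = [tok for tokens in token_lists for tok in tokens]
    usable = (len(flat) // block_size) * block_size  # drop the partial tail
    return [flat[i:i + block_size] for i in range(0, usable, block_size)]


raw = (
    "Metformin is a first-line oral antihyper-\nglycemic agent.\n\n\n\n"
    "It lowers   hepatic glucose output."
)
paragraphs = split_paragraphs(clean_text(raw))
blocks = pack_blocks([[1, 2, 3], [4, 5], [6, 7, 8, 9]], block_size=4)
print(paragraphs)
print(blocks)
```

With the block size of 512 used for this adapter, each packed block becomes one training example whose labels are the same token IDs, as is usual for causal LM training.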

Training configuration summary

Base model: TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T
Block size: 512
LoRA rank (r): 16
LoRA alpha: 32
LoRA dropout: 0.05
Learning rate: 2e-4
Batch size per device: 1
Gradient accumulation steps: 8
Epochs: 3
Quantization: 4-bit NF4
Hardware: Google Colab T4 GPU
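
A few derived quantities follow directly from the configuration above. The sketch below works them out; the variable names are illustrative, a single GPU is assumed (consistent with the Colab T4 listed), and `num_blocks` is a made-up corpus size used only to show the arithmetic, since this card does not state the real one.

```python
import math

# Values taken from the configuration summary.
block_size = 512
per_device_batch_size = 1
gradient_accumulation_steps = 8
lora_r = 16
lora_alpha = 32
epochs = 3

num_devices = 1  # assumption: one Colab T4

# Sequences contributing to each optimizer update.
effective_batch_size = per_device_batch_size * gradient_accumulation_steps * num_devices

# Tokens seen per optimizer update when every block is fully packed.
tokens_per_update = effective_batch_size * block_size

# Standard LoRA scales the low-rank update by alpha / r.
lora_scaling = lora_alpha / lora_r

# Hypothetical corpus size, for illustration only.
num_blocks = 1000
updates_per_epoch = math.ceil(num_blocks / effective_batch_size)
total_updates = updates_per_epoch * epochs

print(effective_batch_size, tokens_per_update, lora_scaling, total_updates)
# 8 4096 2.0 375
```

Gradient accumulation is what lets a per-device batch size of 1 behave like a batch of 8 sequences within the memory limits of a T4.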

How to use

Load the adapter on top of the TinyLlama base model.

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

base_model_name = "TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T"
adapter_name = "ssuvetha/pharma-tinyllama-non-instruction-lora-adapter"

# 4-bit NF4 quantization, matching the training configuration.
# float16 compute suits GPUs without bfloat16 support, such as the T4.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

# Fall back to EOS if the tokenizer defines no pad token.
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Load the quantized base model first.
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach the LoRA adapter weights and switch to inference mode.
model = PeftModel.from_pretrained(base_model, adapter_name)
model.eval()

Example inference

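# Continuation-style prompt: the model completes the sentence.
prompt = "Metformin is one of the most widely prescribed oral antihyperglycemic agents"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sampling is enabled, so the output varies from run to run.
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=120,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
        repetition_penalty=1.1,
        pad_token_id=tokenizer.eos_token_id,
    )

# The decoded text contains the prompt followed by the generated continuation.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))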

Prompting note

This adapter works best with continuation-style prompts, not instruction prompts.

Good prompt style:

Metformin is one of the most widely prescribed oral antihyperglycemic agents

Less suitable prompt style:

Explain the mechanism of action of Metformin.

For instruction-style behavior, use the Stage 2 instruction adapter instead.
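
One way to work with the continuation-only behavior is to restate a question as the opening words of a passage the model can continue. The helper below is a hypothetical illustration: its name and templates are not part of this repository, and the continuations it produces remain subject to the limitations listed in this card.

```python
# Hypothetical templates: each turns a topic into the start of a sentence
# that a continuation-style model can complete.
CONTINUATION_TEMPLATES = {
    "mechanism": "The mechanism of action of {drug} is",
    "indication": "{drug} is indicated for",
    "adverse_effects": "The most commonly reported adverse effects of {drug} include",
}


def as_continuation_prompt(drug: str, topic: str) -> str:
    """Build a continuation-style prompt for a drug and a supported topic."""
    try:
        template = CONTINUATION_TEMPLATES[topic]
    except KeyError:
        raise ValueError(f"unsupported topic: {topic!r}") from None
    return template.format(drug=drug.strip())


prompt = as_continuation_prompt("Metformin", "mechanism")
print(prompt)  # The mechanism of action of Metformin is
```

The resulting string can be passed to the tokenizer in place of an instruction such as "Explain the mechanism of action of Metformin."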

Limitations

trained on a small domain corpus
may overfit surface wording
may hallucinate facts
not evaluated for clinical safety
not suitable for real-world medical decision making

Project pipeline context

This adapter is part of a staged pharma fine-tuning project:

Stage 1: non-instruction domain adaptation
Stage 2: instruction fine-tuning
Stage 3: preference tuning with DPO

This repository contains the Stage 1 adapter only.

Citation

If you use this model, please cite:

TinyLlama base model
the PEFT library and the LoRA / QLoRA methods
this project's repository or notebook
