Pharma TinyLlama Non-Instruction LoRA Adapter

This repository contains a LoRA adapter trained by non-instruction fine-tuning (domain-adaptive continued pretraining) on pharma-domain text.

The adapter was trained on top of:

  • Base model: TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T

On its own, this is not an instruction-following chat model.
It was trained on raw pharma text so that the base model better captures:

  • pharmaceutical terminology
  • drug names
  • biomedical writing style
  • scientific sentence patterns
  • domain-specific vocabulary

Model Type

  • Stage: 1
  • Training type: Non-instruction fine-tuning / continued pretraining
  • Adapter type: LoRA
  • Training method: QLoRA-style fine-tuning
  • Task: Causal language modeling / next-token prediction

What the model learned

This adapter was trained on raw pharma-domain text extracted from PDF-based source material.

Because training used a causal language modeling objective, the model learns to continue domain text such as:

Metformin is one of the most widely prescribed oral antihyperglycemic agents...

Training improves domain familiarity, but the adapter was not explicitly trained to:

  • answer user questions
  • follow instructions
  • chat in assistant format
  • rank preferred responses
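To make the causal LM objective concrete, here is a minimal sketch of how one sentence turns into next-token prediction examples. Whitespace-separated words stand in for real tokenizer output, and the helper name `next_token_pairs` is hypothetical; the actual training code is not shown in this card.

```python
# Illustrative only: whitespace "tokens" stand in for real tokenizer output.
def next_token_pairs(tokens):
    """Return (context, target) pairs: each prefix predicts the token that follows it."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


text = "Metformin is one of the most widely prescribed"
tokens = text.split()
pairs = next_token_pairs(tokens)

# Show the first few training examples implied by the sentence.
for context, target in pairs[:3]:
    print(" ".join(context), "->", target)
```

Every example asks only "what comes next?", which is why the adapter picks up domain wording but not question answering or instruction following.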

Intended use

This adapter is intended for:

  • domain adaptation experiments
  • Stage 1 in a multi-stage fine-tuning pipeline
  • continued pretraining demonstrations
  • pharmaceutical language modeling research
  • a starting point before instruction tuning

Not intended use

This adapter should not be treated as:

  • a medical advice system
  • a clinically validated model
  • a final instruction-tuned assistant
  • a diagnosis or treatment recommendation engine

Training pipeline summary

The high-level Stage 1 pipeline was:

  1. Extract pharma text from PDF
  2. Clean and normalize the text
  3. Split into paragraph records
  4. Convert to Hugging Face Dataset
  5. Tokenize text
  6. Pack token sequences into fixed-length blocks
  7. Load TinyLlama base model in 4-bit
  8. Add LoRA adapter
  9. Train the adapter with causal LM objective
  10. Save and upload adapter
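Step 6 (packing) is the least obvious part of the pipeline, so here is a minimal sketch of one common way to do it. It assumes an EOS token is appended after each record and that the trailing remainder shorter than one block is dropped; the card does not say whether the original notebook did either. The function name `pack_into_blocks` and the toy token ids are hypothetical.

```python
from itertools import chain


def pack_into_blocks(tokenized_records, block_size, eos_id):
    """Concatenate token-id lists (EOS-separated) and cut them into fixed-length blocks.

    Assumption: the final partial block is dropped rather than padded.
    """
    stream = list(chain.from_iterable(ids + [eos_id] for ids in tokenized_records))
    usable = (len(stream) // block_size) * block_size
    return [stream[i:i + block_size] for i in range(0, usable, block_size)]


# Toy example with block_size=4; the card reports a block size of 512 for the real run.
records = [[11, 12, 13], [21, 22], [31, 32, 33, 34]]
blocks = pack_into_blocks(records, block_size=4, eos_id=0)
print(blocks)
```

Packing avoids spending compute on padding: short paragraphs are merged into full-length training sequences, at the cost of some blocks spanning paragraph boundaries.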

Training configuration summary

  • Base model: TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T
  • Block size: 512
  • LoRA rank (r): 16
  • LoRA alpha: 32
  • LoRA dropout: 0.05
  • Learning rate: 2e-4
  • Batch size per device: 1
  • Gradient accumulation steps: 8
  • Epochs: 3
  • Quantization: 4-bit NF4
  • Hardware: Google Colab T4 GPU
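A few derived quantities follow directly from the values listed above. The short sketch below works them out, assuming a single GPU (the card mentions one Colab T4) and the standard LoRA scaling of alpha divided by rank.

```python
# Values taken from the training configuration summary.
per_device_batch_size = 1
gradient_accumulation_steps = 8
block_size = 512
lora_r = 16
lora_alpha = 32

# Assumption: training ran on a single GPU.
num_devices = 1

# Sequences contributing to each optimizer step.
effective_batch_size = per_device_batch_size * gradient_accumulation_steps * num_devices

# Tokens per optimizer step, since every packed block has exactly block_size tokens.
tokens_per_optimizer_step = effective_batch_size * block_size

# Standard LoRA scaling applied to the low-rank update: alpha / r.
lora_scaling = lora_alpha / lora_r

print(effective_batch_size, tokens_per_optimizer_step, lora_scaling)
```

Gradient accumulation is what makes an effective batch of 8 sequences fit in T4 memory while only one sequence is processed at a time.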

How to use

Load the adapter on top of the TinyLlama base model.

# Requires torch, transformers, peft, accelerate, and bitsandbytes.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

base_model_name = "TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T"
adapter_name = "ssuvetha/pharma-tinyllama-non-instruction-lora-adapter"

# 4-bit NF4 quantization, matching the training configuration.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

tokenizer = AutoTokenizer.from_pretrained(base_model_name)
# Fall back to the EOS token if the tokenizer defines no pad token.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Load the quantized base model and place it on the available device(s).
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach the LoRA adapter weights and switch to inference mode.
model = PeftModel.from_pretrained(base_model, adapter_name)
model.eval()

Example inference

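# Continues from the loading code above (uses torch, tokenizer, and model).
prompt = "Metformin is one of the most widely prescribed oral antihyperglycemic agents"

# Tokenize the prompt and move the tensors to the model's device.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample a continuation without tracking gradients.
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=120,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
        repetition_penalty=1.1,
        pad_token_id=tokenizer.eos_token_id,
    )

# Decode the prompt plus the generated continuation.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))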

Prompting note

This adapter works best with continuation-style prompts, not instruction prompts.

Good prompt style:

  • Metformin is one of the most widely prescribed oral antihyperglycemic agents

Less suitable prompt style:

  • Explain the mechanism of action of Metformin.

For instruction-style behavior, use the Stage 2 instruction adapter instead.


Limitations

  • trained on a small domain corpus
  • may overfit surface wording
  • may hallucinate facts
  • not evaluated for clinical safety
  • not suitable for real-world medical decision making

Project pipeline context

This adapter is part of a staged pharma fine-tuning project:

  • Stage 1: non-instruction domain adaptation
  • Stage 2: instruction fine-tuning
  • Stage 3: preference tuning with DPO

This repository contains the Stage 1 adapter only.


Citation

If you use this model, please cite:

  • TinyLlama base model
  • PEFT / LoRA / QLoRA libraries
  • this project's repository or notebook