⚖️🐉 Indian Legal Qwen 2.5 — 1.5B

🟢 This is the fully merged model, ready for out-of-the-box inference with no extra setup. For lightweight adapter loading, see the Adapter repo (GSMS-B/Indian-Legal-Qwen2.5-1.5B-Adapter); for CPU/Ollama usage, see the GGUF repo (GSMS-B/Indian-Legal-Qwen2.5-1.5B-GGUF).

📖 Model Description

Indian Legal Qwen 2.5 — 1.5B is a domain-adapted version of unsloth/Qwen2.5-1.5B-Instruct-bnb-4bit, fine-tuned using QLoRA on a structured question-answer dataset covering all 1,059 sections of India's three landmark 2023 criminal justice reform acts:

| Act | Full Name | Replaces | Sections |
|---|---|---|---|
| 📕 BNS 2023 | Bharatiya Nyaya Sanhita | IPC 1860 | 358 |
| 📗 BNSS 2023 | Bharatiya Nagarik Suraksha Sanhita | CrPC 1973 | 531 |
| 📘 BSA 2023 | Bharatiya Sakshya Adhiniyam | Indian Evidence Act 1872 | 170 |

The model was trained on 6,354 instruction-format QA pairs (6 question types per section, covering definitions, scenarios, legal elements, exceptions, and consequences), giving it broad, structured coverage of India's reformed criminal law framework. At 1.5B parameters, it is the most lightweight model in the family and is well suited to fast, low-resource deployment.
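
The dataset size follows directly from the section counts listed above. A quick arithmetic check, using only the figures stated in this card:

```python
# Section counts per act, as listed in the table above.
SECTIONS = {"BNS": 358, "BNSS": 531, "BSA": 170}
QUESTION_TYPES_PER_SECTION = 6

total_sections = sum(SECTIONS.values())
total_pairs = total_sections * QUESTION_TYPES_PER_SECTION

print(total_sections)  # 1059
print(total_pairs)     # 6354
```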

🔗 Model Family — Qwen 2.5 1.5B

| Variant | Repo | Best For |
|---|---|---|
| 🟢 Merged (this repo) | `GSMS-B/Indian-Legal-Qwen2.5-1.5B` | Out-of-the-box inference, Gradio / API deployment |
| 🔵 LoRA Adapter | `GSMS-B/Indian-Legal-Qwen2.5-1.5B-Adapter` | Lightweight loading on top of base model |
| 🟡 GGUF (Quantized) | `GSMS-B/Indian-Legal-Qwen2.5-1.5B-GGUF` | CPU inference via Ollama / llama.cpp |
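
For the LoRA Adapter variant, loading typically means fetching a base checkpoint first and attaching the adapter weights on top. The following is a minimal sketch using the PEFT library. It assumes the adapter can be applied to the unquantized `Qwen/Qwen2.5-1.5B-Instruct` checkpoint; the adapter was trained against the 4-bit Unsloth build, so that base may be the safer choice if `bitsandbytes` is available. The helper name `load_adapter_model` is illustrative and not part of any library.

```python
ADAPTER_ID = "GSMS-B/Indian-Legal-Qwen2.5-1.5B-Adapter"  # from the table above
BASE_ID = "Qwen/Qwen2.5-1.5B-Instruct"  # assumption: unquantized base checkpoint


def load_adapter_model(base_id: str = BASE_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model, then attach the LoRA adapter on top of it."""
    # Heavy dependencies are imported lazily, only when loading is requested.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(
        base_id, torch_dtype=torch.float16, device_map="auto"
    )
    model = PeftModel.from_pretrained(base, adapter_id)  # wraps base with LoRA weights
    model.eval()
    return model, tokenizer


# Usage (downloads weights, so it is left commented out here):
# model, tokenizer = load_adapter_model()
```

The returned model can be used with the same `generate` call as the merged variant.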

🚀 Quick Start

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

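# Load the merged weights in half precision; device_map="auto" places them
# on the available GPU (or CPU) and requires the accelerate package.
model_id  = "GSMS-B/Indian-Legal-Qwen2.5-1.5B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model     = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto"
)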

SYSTEM = "You are an expert legal assistant specializing in Indian criminal law — BNS, BNSS, and BSA 2023."

def ask(question):
    # Build a system + user conversation and render it with the model's chat template.
    messages = [
        {"role": "system", "content": SYSTEM},
        {"role": "user",   "content": question}
    ]
    text   = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    # Low-temperature sampling keeps answers close to deterministic.
    with torch.no_grad():
        out = model.generate(**inputs, max_new_tokens=300, temperature=0.1,
                             do_sample=True, pad_token_id=tokenizer.eos_token_id)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

print(ask("What is a Zero FIR under BNSS 2023?"))
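
The `apply_chat_template` call in the Quick Start turns the message list into a single ChatML-formatted prompt string. To make that layout visible, here is a dependency-free sketch that approximates it for a plain system + user conversation. The function name `render_chatml` is illustrative; in real use the tokenizer's own template remains the authority and should be preferred.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Render role/content messages in the ChatML layout."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    ]
    if add_generation_prompt:
        # An open assistant turn tells the model to start answering.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


example = [
    {"role": "system", "content": "S"},
    {"role": "user", "content": "Q"},
]
print(render_chatml(example))
```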

💻 Run locally with Ollama (GGUF)

ollama run hf.co/GSMS-B/Indian-Legal-Qwen2.5-1.5B-GGUF
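
Once the GGUF model is available in Ollama, it can also be queried programmatically through the local server's `/api/chat` endpoint. This is a minimal sketch that assumes Ollama is running on its default port (11434). The helper names `build_payload` and `ask_ollama` are illustrative, and the system prompt is passed in so that the same one used in the Quick Start can be reused.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # assumption: default local endpoint
MODEL = "hf.co/GSMS-B/Indian-Legal-Qwen2.5-1.5B-GGUF"


def build_payload(question, system_prompt, model=MODEL):
    """Build a non-streaming chat request body for Ollama."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question},
        ],
        "stream": False,  # return one JSON object instead of a token stream
        "options": {"temperature": 0.1},
    }


def ask_ollama(question, system_prompt, url=OLLAMA_URL):
    """Send the question to a running Ollama server and return the reply text."""
    data = json.dumps(build_payload(question, system_prompt)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["message"]["content"]


# Usage (requires a running Ollama server, so it is left commented out):
# print(ask_ollama("What is a Zero FIR under BNSS 2023?", "<system prompt>"))
```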

🎯 Recommended Use Cases

⚠️ Important Note: This model has been domain-adapted on structured QA data and works best as a component in a larger pipeline rather than a standalone answer engine. Direct usage without retrieval context may produce incomplete or imprecise answers on complex legal queries.

✅ Where this model excels

| Use Case | 💡 How to Use |
|---|---|
| 🔍 RAG Pipeline | Pair with a BM25 or vector retriever over BNS/BNSS/BSA texts; feed retrieved sections as context for grounded, citation-backed answers |
| 🤖 Legal Chatbot Backend | Use as the generation backbone of a legal assistant app with a ChromaDB / FAISS document store |
| 📚 Legal Education Tool | Build interactive Q&A apps for law students and practitioners learning the 2023 criminal justice reforms |
| 🔎 Section Lookup Assistant | Combine with a section index to surface the exact BNS / BNSS / BSA provision relevant to a given situation |
| ⚡ Low-Resource / Fast Inference | Smallest model in the family; ideal where latency or compute budget is tight |
| 🧪 Further Fine-tuning | Use as a starting point for more specialised adaptation (e.g., only BNSS procedure, only BSA evidence rules) |
| 📝 Structured Legal Summarization | Summarize individual sections when the section text is supplied as input context |
| ⚖️ Comparative Law Analysis | Highlight differences between old acts (IPC/CrPC/IEA) and their 2023 replacements |
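
The RAG pattern described above can be sketched without any external dependencies: rank candidate passages with BM25, then place the top hits in the user message as context. Everything here is illustrative. The corpus entries are placeholders and not statute text, the ids are made up, and `bm25_rank` and `build_messages` are hypothetical helpers. A real pipeline would index the actual BNS/BNSS/BSA sections and would likely use an existing retriever library or vector store.

```python
import math
import re
from collections import Counter

# Placeholder corpus: ids and texts are stand-ins, NOT actual statute text.
CORPUS = {
    "doc-a": "placeholder passage about theft of movable property",
    "doc-b": "placeholder passage about registering a first information report",
    "doc-c": "placeholder passage about admissibility of electronic records as evidence",
}


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_rank(query, corpus, k1=1.5, b=0.75):
    """Return document ids sorted by descending BM25 score for the query."""
    docs = {doc_id: tokenize(text) for doc_id, text in corpus.items()}
    n_docs = len(docs)
    avg_len = sum(len(tokens) for tokens in docs.values()) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in docs.values() for term in set(tokens))
    query_terms = set(tokenize(query))
    scores = {}
    for doc_id, tokens in docs.items():
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[doc_id] = score
    return sorted(scores, key=scores.get, reverse=True)


def build_messages(question, corpus, system_prompt, top_k=2):
    """Build chat messages that ground the question in retrieved passages."""
    top_ids = bm25_rank(question, corpus)[:top_k]
    context = "\n\n".join(f"[{doc_id}] {corpus[doc_id]}" for doc_id in top_ids)
    user = (
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer using only the context and cite the ids in brackets."
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user},
    ]


print(bm25_rank("electronic records evidence", CORPUS)[0])  # doc-c
```

The resulting message list can be passed to `tokenizer.apply_chat_template` in place of the two-message list used in the Quick Start.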

❌ Not recommended for

- Standalone legal advice without a retrieval component
- High-stakes legal decisions without qualified human review
- Jurisdictions or acts outside BNS / BNSS / BSA 2023

🏋️ Training Details

| Property | Value |
|---|---|
| 🤖 Base model | `unsloth/Qwen2.5-1.5B-Instruct-bnb-4bit` |
| 🔧 Fine-tuning method | QLoRA |
| 🎛️ LoRA rank | 64 |
| 🎛️ LoRA alpha | 128 |
| 🧩 Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| 📊 Training data | 6,354 QA pairs (1,059 sections × 6 question types) |
| 🔁 Epochs | 3 |
| 📦 Batch size (per device) | 4 |
| 📈 Learning rate | 2e-4 |
| ⚙️ Optimizer | adamw_8bit |
| 💻 Hardware | Google Colab T4 GPU |
| 🛠️ Framework | Unsloth + TRL SFTTrainer |
| 💬 Prompt format | ChatML |
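
The hyperparameters above map onto a LoRA configuration roughly as follows. This is a sketch using PEFT's `LoraConfig` as a stand-in; the actual run used Unsloth, whose setup code is not shown in this card. The step count assumes a single GPU and no gradient accumulation, neither of which is stated here, so treat it as an estimate. Helper names are illustrative.

```python
import math

# Values copied from the Training Details table.
HPARAMS = {
    "r": 64,
    "lora_alpha": 128,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    "num_train_epochs": 3,
    "per_device_train_batch_size": 4,
    "learning_rate": 2e-4,
    "optim": "adamw_8bit",
}
NUM_EXAMPLES = 6354

# Standard LoRA scales the low-rank update by alpha / r.
lora_scale = HPARAMS["lora_alpha"] / HPARAMS["r"]


def optimizer_steps(num_examples, batch_size, epochs, grad_accum=1, num_devices=1):
    """Estimate total optimizer steps (grad_accum and num_devices are assumptions)."""
    per_epoch = math.ceil(num_examples / (batch_size * grad_accum * num_devices))
    return per_epoch * epochs


def make_lora_config():
    """Build an equivalent PEFT LoraConfig (requires the peft package)."""
    from peft import LoraConfig

    return LoraConfig(
        r=HPARAMS["r"],
        lora_alpha=HPARAMS["lora_alpha"],
        target_modules=HPARAMS["target_modules"],
        task_type="CAUSAL_LM",
    )


print(lora_scale)  # 2.0
print(optimizer_steps(NUM_EXAMPLES, 4, 3))  # 4767
```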

📊 Training Dataset

| 📂 Dataset | 🔗 Link |
|---|---|
| Indian Legal QA — BNS + BNSS + BSA 2023 | `GSMS-B/Indian-Legal-QA-BNS-BNSS-BSA` |

6 question types per section: definitional_topic · definitional_section · scenario · elements · exceptions · consequence
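
For further fine-tuning, each QA pair would typically be converted into a chat-style message list before tokenization. The sketch below shows one way to do that. The field names `question`, `answer`, and `question_type` are hypothetical, since the dataset schema is not described in this card; check the dataset viewer for the real column names before using it.

```python
# The six question types listed above.
QUESTION_TYPES = (
    "definitional_topic",
    "definitional_section",
    "scenario",
    "elements",
    "exceptions",
    "consequence",
)


def to_messages(record, system_prompt):
    """Convert one QA record into system/user/assistant chat messages.

    Assumes hypothetical 'question' and 'answer' fields.
    """
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": record["question"]},
        {"role": "assistant", "content": record["answer"]},
    ]


def load_qa_dataset():
    """Fetch the dataset from the Hub (requires the datasets package)."""
    from datasets import load_dataset

    return load_dataset("GSMS-B/Indian-Legal-QA-BNS-BNSS-BSA")


# Placeholder record for illustration only, not taken from the dataset.
sample = {
    "question": "<question text>",
    "answer": "<answer text>",
    "question_type": "scenario",
}
print([m["role"] for m in to_messages(sample, "<system prompt>")])
```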

👤 Author

GSMS-B (Bugatha Ganasyam Mani Sankar) on 🤗 Hugging Face

⚠️ Disclaimer

This model is intended for research and educational purposes only. It does not constitute legal advice. Outputs should not be relied upon for any legal decision without review by a qualified legal professional. The model's responses reflect patterns in training data and may contain errors or omissions.

⚡ Fine-tuned using Unsloth for training efficiency.
