⚖️🐉 Indian Legal Qwen 2.5 — 1.5B


🟢 This is the fully merged model, ready for out-of-the-box inference with no extra setup. For lightweight adapter loading, see the LoRA Adapter repo; for CPU/Ollama usage, see the GGUF repo. Both are listed under Model Family below.


📖 Model Description

Indian Legal Qwen 2.5 — 1.5B is a domain-adapted version of unsloth/Qwen2.5-1.5B-Instruct-bnb-4bit, fine-tuned using QLoRA on a structured question-answer dataset covering all 1,059 sections of India's three landmark 2023 criminal justice reform acts:

| Act | Full Name | Replaces | Sections |
|---|---|---|---|
| 📕 BNS 2023 | Bharatiya Nyaya Sanhita | IPC 1860 | 358 |
| 📗 BNSS 2023 | Bharatiya Nagarik Suraksha Sanhita | CrPC 1973 | 531 |
| 📘 BSA 2023 | Bharatiya Sakshya Adhiniyam | Indian Evidence Act 1872 | 170 |

The model was trained on 6,354 instruction-format QA pairs (6 question types per section, covering definitions, scenarios, legal elements, exceptions, and consequences), which gives it broad, structured coverage of India's reformed criminal law framework. At 1.5B parameters, it is the most lightweight model in the family and is well suited to fast, low-resource deployment.
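The dataset size follows directly from the section counts in the table above. A small sanity check, using only the figures stated in this card:

```python
# Section counts per act, as listed in the table above.
SECTIONS_PER_ACT = {"BNS": 358, "BNSS": 531, "BSA": 170}
QUESTION_TYPES_PER_SECTION = 6

total_sections = sum(SECTIONS_PER_ACT.values())
total_qa_pairs = total_sections * QUESTION_TYPES_PER_SECTION

print(f"sections: {total_sections}, QA pairs: {total_qa_pairs}")
```

Both totals agree with the figures quoted in this card (1,059 sections and 6,354 QA pairs).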


🔗 Model Family — Qwen 2.5 1.5B

| Variant | Repo | Best For |
|---|---|---|
| 🟢 Merged (this repo) | GSMS-B/Indian-Legal-Qwen2.5-1.5B | Out-of-the-box inference, Gradio / API deployment |
| 🔵 LoRA Adapter | GSMS-B/Indian-Legal-Qwen2.5-1.5B-Adapter | Lightweight loading on top of base model |
| 🟡 GGUF (Quantized) | GSMS-B/Indian-Legal-Qwen2.5-1.5B-GGUF | CPU inference via Ollama / llama.cpp |
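For the LoRA Adapter variant, a minimal loading sketch might look like the following. It assumes the adapter repo holds a standard PEFT adapter, that `peft` and `bitsandbytes` are installed, and that a CUDA GPU is available for the 4-bit base model; none of this is confirmed by the card itself. The helper name `load_adapter_model` is hypothetical.

```python
BASE_ID = "unsloth/Qwen2.5-1.5B-Instruct-bnb-4bit"       # base model named in this card
ADAPTER_ID = "GSMS-B/Indian-Legal-Qwen2.5-1.5B-Adapter"  # adapter repo named in this card

def load_adapter_model(base_id=BASE_ID, adapter_id=ADAPTER_ID):
    """Load the base model, then attach the LoRA adapter on top of it.

    Assumption: the adapter repo is in standard PEFT format.
    Imports are local so the heavy libraries load only when this is called.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
    model = PeftModel.from_pretrained(base, adapter_id)
    return model, tokenizer
```

Calling `load_adapter_model()` downloads both repos, so it is best run once and the result reused. The merged repo avoids this extra step entirely.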

🚀 Quick Start

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id  = "GSMS-B/Indian-Legal-Qwen2.5-1.5B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model     = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto"
)

SYSTEM = "You are an expert legal assistant specializing in Indian criminal law — BNS, BNSS, and BSA 2023."

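def ask(question):
    """Return the model's answer to a single question as plain text."""
    messages = [
        {"role": "system", "content": SYSTEM},
        {"role": "user",   "content": question}
    ]
    # Render the conversation with the model's chat template and open the assistant turn.
    text   = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    with torch.no_grad():
        # A low temperature keeps sampling close to greedy decoding.
        out = model.generate(**inputs, max_new_tokens=300, temperature=0.1,
                             do_sample=True, pad_token_id=tokenizer.eos_token_id)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)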

print(ask("What is a Zero FIR under BNSS 2023?"))

💻 Run locally with Ollama (GGUF)

ollama run hf.co/GSMS-B/Indian-Legal-Qwen2.5-1.5B-GGUF

🎯 Recommended Use Cases

⚠️ Important Note: This model has been domain-adapted on structured QA data and works best as a component in a larger pipeline rather than a standalone answer engine. Direct usage without retrieval context may produce incomplete or imprecise answers on complex legal queries.

✅ Where this model excels

| Use Case | 💡 How to Use |
|---|---|
| 🔍 RAG Pipeline | Pair with a BM25 or vector retriever over BNS/BNSS/BSA texts; feed retrieved sections as context for grounded, citation-backed answers |
| 🤖 Legal Chatbot Backend | Use as the generation backbone of a legal assistant app with a ChromaDB / FAISS document store |
| 📚 Legal Education Tool | Build interactive Q&A apps for law students and practitioners learning the 2023 criminal justice reforms |
| 🔎 Section Lookup Assistant | Combine with a section index to surface the exact BNS / BNSS / BSA provision relevant to a given situation |
| Low-Resource / Fast Inference | Smallest model in the family; ideal where latency or compute budget is tight |
| 🧪 Further Fine-tuning | Use as a starting point for more specialised adaptation (e.g., only BNSS procedure, only BSA evidence rules) |
| 📝 Structured Legal Summarization | Summarize individual sections when the section text is supplied as input context |
| ⚖️ Comparative Law Analysis | Highlight differences between old acts (IPC/CrPC/IEA) and their 2023 replacements |
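The RAG pairing in the first row can be sketched in pure Python. This is an illustrative BM25 ranker, not the author's pipeline: the corpus entries are hypothetical placeholders (not real statutory text), and the helper names `bm25_rank` and `build_messages` are assumptions made for this example.

```python
import math
import re
from collections import Counter

# Hypothetical placeholder corpus; replace with real BNS/BNSS/BSA section texts.
SECTIONS = {
    "BNS-placeholder-1": "placeholder text describing theft of movable property",
    "BNSS-placeholder-2": "placeholder text describing registration of a first information report",
    "BSA-placeholder-3": "placeholder text describing admissibility of electronic records",
}

def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Return (doc_id, score) pairs sorted by descending Okapi BM25 score."""
    tokens = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    n_docs = len(tokens)
    avg_len = sum(len(t) for t in tokens.values()) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for t in tokens.values() for term in set(t))
    scores = {}
    for doc_id, toks in tokens.items():
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[doc_id] = score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

def build_messages(question, docs, top_k=2):
    """Build chat messages that ground the question in the top-ranked sections."""
    ranked = bm25_rank(question, docs)[:top_k]
    context = "\n\n".join(f"[{doc_id}] {docs[doc_id]}"
                          for doc_id, score in ranked if score > 0)
    system = ("You are an expert legal assistant specializing in Indian criminal law. "
              "Answer only from the provided sections and cite their identifiers.")
    user = f"Sections:\n{context}\n\nQuestion: {question}"
    return [{"role": "system", "content": system},
            {"role": "user", "content": user}]
```

The returned message list can be passed to `tokenizer.apply_chat_template` in place of the `messages` list built in the Quick Start `ask` function. For a real corpus, a dedicated BM25 library or a vector store would likely scale better than this in-memory ranker.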

❌ Not recommended for

  • Standalone legal advice without a retrieval component
  • High-stakes legal decisions without qualified human review
  • Jurisdictions or acts outside BNS / BNSS / BSA 2023

🏋️ Training Details

| Property | Value |
|---|---|
| 🤖 Base model | unsloth/Qwen2.5-1.5B-Instruct-bnb-4bit |
| 🔧 Fine-tuning method | QLoRA |
| 🎛️ LoRA rank | 64 |
| 🎛️ LoRA alpha | 128 |
| 🧩 Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| 📊 Training data | 6,354 QA pairs (1,059 sections × 6 question types) |
| 🔁 Epochs | 3 |
| 📦 Batch size (per device) | 4 |
| 📈 Learning rate | 2e-4 |
| ⚙️ Optimizer | adamw_8bit |
| 💻 Hardware | Google Colab T4 GPU |
| 🛠️ Framework | Unsloth + TRL SFTTrainer |
| 💬 Prompt format | ChatML |
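To give a feel for what rank 64 and alpha 128 imply, the sketch below computes the LoRA scaling factor and a rough count of trainable adapter parameters. The layer dimensions are assumptions taken from the public Qwen2.5-1.5B configuration (hidden size 1536, MLP size 8960, key/value projection width 256, 28 layers), not from this card, and the count assumes the usual two low-rank matrices per target module with no bias terms.

```python
RANK, ALPHA = 64, 128  # from the training table above

# Assumed Qwen2.5-1.5B dimensions (not stated in this card).
HIDDEN, INTERMEDIATE, KV_DIM, LAYERS = 1536, 8960, 256, 28

# (input_dim, output_dim) of each targeted linear layer.
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

def lora_params(d_in, d_out, rank=RANK):
    """Parameters in one adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)

scaling = ALPHA / RANK  # factor applied to the low-rank update
per_layer = sum(lora_params(d_in, d_out) for d_in, d_out in TARGET_SHAPES.values())
total_trainable = per_layer * LAYERS

print(f"scaling = {scaling}, trainable LoRA params = {total_trainable:,}")
```

Under these assumptions the adapter holds roughly 74M trainable parameters, a small fraction of the 1.5B base weights, which is what makes training on a single T4 practical.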

📊 Training Dataset

| 📂 Dataset | 🔗 Link |
|---|---|
| Indian Legal QA — BNS + BNSS + BSA 2023 | GSMS-B/Indian-Legal-QA-BNS-BNSS-BSA |

6 question types per section: definitional_topic · definitional_section · scenario · elements · exceptions · consequence
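As an illustration of how one QA record could be rendered in the ChatML prompt format listed under Training Details, here is a minimal sketch. The record field names (`question_type`, `question`, `answer`) and the system prompt are hypothetical; the actual dataset schema may differ.

```python
# The six question types listed above.
QUESTION_TYPES = (
    "definitional_topic", "definitional_section", "scenario",
    "elements", "exceptions", "consequence",
)

# Hypothetical system prompt for illustration only.
SYSTEM_PROMPT = "You are an expert legal assistant specializing in Indian criminal law."

def to_chatml(record, system=SYSTEM_PROMPT):
    """Render one QA record as a ChatML training string."""
    if record["question_type"] not in QUESTION_TYPES:
        raise ValueError(f"unknown question type: {record['question_type']}")
    turns = [
        ("system", system),
        ("user", record["question"]),
        ("assistant", record["answer"]),
    ]
    # Each ChatML turn is wrapped in <|im_start|>role ... <|im_end|> markers.
    return "".join(f"<|im_start|>{role}\n{content}<|im_end|>\n"
                   for role, content in turns)

# Placeholder record; the field names are assumptions, not the real schema.
example = {
    "question_type": "scenario",
    "question": "<question text>",
    "answer": "<answer text>",
}
print(to_chatml(example))
```

At inference time the tokenizer's `apply_chat_template` produces this layout automatically, so manual formatting like this is mainly useful when preparing training data.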


👤 Author

GSMS-B — Bugatha Ganasyam Mani Sankar 🤗 Hugging Face Profile


⚠️ Disclaimer

This model is intended for research and educational purposes only. It does not constitute legal advice. Outputs should not be relied upon for any legal decision without review by a qualified legal professional. The model's responses reflect patterns in training data and may contain errors or omissions.


⚡ Fine-tuned using Unsloth for training efficiency.
