🏛️ Nyaya-7B — Indian Legal Judgment Parser
Nyaya-7B is a domain-adapted, instruction-finetuned version of Mistral-7B-Instruct-v0.3, trained on 10,000+ Indian Supreme Court and High Court judgments to extract structured legal information into clean, validated JSON — at zero API cost, fully offline.
"Nyaya" (न्याय) means justice in Sanskrit and Hindi.
🎯 What it does
Given raw Indian court judgment text, Nyaya-7B extracts a full structured JSON covering:
| Field | Description |
|---|---|
| `case_name` | Petitioner v. Respondent |
| `citation` | AIR / SCC / SCR citation |
| `court` | Full court name (Supreme Court, High Court, etc.) |
| `year` | Year of judgment |
| `petitioner` / `respondent` | Party names |
| `subject_matter` | Criminal / Civil / Constitutional / Tax / ... |
| `statutes_cited` | List of Acts + Sections + descriptions |
| `precedents_cited` | AIR/SCC citations with case names |
| `legal_issues` | Issues framed by the court |
| `holding` | Court's decision and reasoning (1–3 sentences) |
| `outcome` | dismissed / allowed / disposed / remanded / modified |
📊 Benchmark Results
Evaluated on a 50-case held-out test set of Indian SC/HC judgments, benchmarked head-to-head against Gemini 2.5 Flash:
| Metric | Gemini 2.5 Flash | Nyaya-7B | Winner |
|---|---|---|---|
| Statute F1 | 0.227 | 0.425 | 🏆 Nyaya-7B (+87%) |
| Outcome Accuracy | 0.20 | 0.64 | 🏆 Nyaya-7B (+220%) |
| Hallucination Rate ↓ | 0.775 | 0.427 | 🏆 Nyaya-7B (−45%) |
| JSON Validity | 0.98 | 0.86 | Gemini Flash |
| Field Coverage | 0.893 | 0.855 | Gemini Flash |
| Cost per Judgment | ~$0.0001 | $0.00 | 🏆 Nyaya-7B |
Nyaya-7B achieves 3.2× higher outcome classification accuracy and 45% lower hallucination rate than Gemini 2.5 Flash, while running entirely offline at zero cost.
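The percentages in the Winner column can be reproduced from the raw scores. The sketch below copies three rows of the table and computes the signed relative change and the ratio to the baseline; the helper name `relative_change` is illustrative and not part of the project.

```python
# Scores copied from the benchmark table: (Gemini 2.5 Flash, Nyaya-7B).
scores = {
    "statute_f1": (0.227, 0.425),
    "outcome_accuracy": (0.20, 0.64),
    "hallucination_rate": (0.775, 0.427),  # lower is better
}


def relative_change(baseline: float, candidate: float) -> float:
    """Signed change of `candidate` relative to `baseline`, as a fraction."""
    return (candidate - baseline) / baseline


for name, (gemini, nyaya) in scores.items():
    change = relative_change(gemini, nyaya)
    ratio = nyaya / gemini
    print(f"{name}: {change:+.0%} relative change, {ratio:.2f}x the baseline")
```

For outcome accuracy the ratio is 0.64 / 0.20 = 3.2, which is where the "3.2×" figure comes from; the same row expressed as a relative change is +220%.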
🚀 Quick Start
Installation
pip install transformers torch accelerate bitsandbytes
Load & Run (GPU, 4-bit quantized)
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline, BitsAndBytesConfig
import json
import torch
MODEL_ID = "MrRoyaleAce/nyaya-7b"  # Hugging Face repo ID of this model
# Load weights in 4-bit NF4 to reduce GPU memory use
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
tokenizer.pad_token = tokenizer.eos_token  # reuse EOS as the padding token
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",
)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
Extract structured data from a judgment
judgment_text = """
IN THE SUPREME COURT OF INDIA
Criminal Appeal No. 1234 of 2022
State of Punjab ...Appellant
Versus
Gurpreet Singh ...Respondent
JUDGMENT
The appellant challenges the High Court's order acquitting the respondent
of charges under Section 302 IPC read with Section 34 IPC...
"""
messages = [
    {"role": "system", "content": "You are Nyaya, a specialized Indian legal extraction model."},
    {"role": "user", "content": f"Extract structured data from this judgment and return JSON:\n\n{judgment_text}"},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
raw_output = pipe(
    prompt,
    max_new_tokens=512,
    do_sample=False,
    return_full_text=False,
    pad_token_id=tokenizer.eos_token_id,
)[0]["generated_text"]
result = json.loads(raw_output.strip())
print(result)
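Because the reported JSON validity is 0.86, a bare `json.loads` call will sometimes raise. A minimal sketch of a more tolerant parser follows, assuming the failure mode is extra text around an otherwise valid object; `extract_first_json_object` is a hypothetical helper, not part of the project, and it cannot recover output that was cut off mid-object.

```python
import json


def extract_first_json_object(text: str):
    """Return the first parseable JSON object found in `text`, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at index `start`
            # and ignores anything that follows it.
            obj, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        start = text.find("{", start + 1)
    return None


sample = 'Here is the result: {"outcome": "dismissed", "year": 2022} Hope this helps.'
print(extract_first_json_object(sample))
print(extract_first_json_object('{"outcome": "dis'))  # truncated output
```

A `None` result can be used as a signal to retry generation or to route the judgment to manual review.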
Expected Output
{
  "case_name": "State of Punjab v. Gurpreet Singh",
  "citation": null,
  "court": "Supreme Court of India",
  "year": 2022,
  "petitioner": "State of Punjab",
  "respondent": "Gurpreet Singh",
  "subject_matter": "Criminal",
  "statutes_cited": [
    {"act": "Indian Penal Code", "section": "302", "description": "Punishment for murder"},
    {"act": "Indian Penal Code", "section": "34", "description": "Acts done by several persons in furtherance of common intention"}
  ],
  "precedents_cited": [],
  "legal_issues": [
    "Whether the High Court was justified in acquitting the respondent under Section 302 IPC?"
  ],
  "holding": "The Supreme Court examined the evidence and found the High Court's reasoning sound...",
  "outcome": "dismissed"
}
CPU Inference (no GPU required)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float32,
    device_map="cpu",
    low_cpu_mem_usage=True,
)
# Note: CPU inference is significantly slower (~5–15 min per judgment)
🔧 Training Details
| Parameter | Value |
|---|---|
| Base model | mistralai/Mistral-7B-Instruct-v0.3 |
| Fine-tuning method | QLoRA (Quantized Low-Rank Adaptation) |
| Quantization | 4-bit NF4 with double quantization |
| LoRA rank | 16 |
| LoRA alpha | 32 |
| LoRA dropout | 0.05 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Training samples | ~10,000 Indian SC/HC judgment pairs |
| Epochs | 3 |
| Effective batch size | 16 (batch 2 × grad_accum 8) |
| Learning rate | 2e-4 |
| LR scheduler | Cosine |
| Optimizer | paged_adamw_8bit |
| Max sequence length | 2048 tokens |
| Hardware | Kaggle T4 × 2 (32 GB VRAM total) |
| Training time | ~8 hours |
| Compute dtype | float16 |
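As a rough illustration of what these LoRA settings imply, the sketch below counts the trainable adapter parameters for rank 16 across the seven target modules. The layer dimensions are assumptions taken from the publicly documented Mistral-7B architecture (hidden size 4096, MLP size 14336, 8 key/value heads of dimension 128, 32 layers), not from this model card.

```python
# Assumed Mistral-7B dimensions (from the public base-model config,
# not stated in this model card).
HIDDEN = 4096         # hidden_size
INTERMEDIATE = 14336  # intermediate_size of the MLP
KV_DIM = 8 * 128      # num_key_value_heads * head_dim
NUM_LAYERS = 32

RANK = 16   # LoRA rank from the table
ALPHA = 32  # LoRA alpha from the table

# (in_features, out_features) of each target module in one decoder layer.
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features: int, out_features: int, rank: int) -> int:
    """LoRA adds A (rank x in_features) and B (out_features x rank)."""
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, RANK) for i, o in TARGET_SHAPES.values())
trainable = per_layer * NUM_LAYERS
scaling = ALPHA / RANK   # factor applied to the low-rank update
effective_batch = 2 * 8  # per-device batch size x gradient accumulation steps

print(f"trainable LoRA parameters: {trainable:,}")
print(f"LoRA scaling (alpha / r): {scaling}")
print(f"effective batch size: {effective_batch}")
```

Under these assumptions the adapters hold about 42 million trainable parameters, a small fraction of the roughly 7 billion frozen base weights.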
Training Infrastructure
- Fine-tuned on Kaggle T4 × 2 GPU notebooks
- Monitored with Weights & Biases (JSON validity rate logged per epoch)
- Adapter merged into base model weights with `merge_and_unload()` for zero inference overhead
📚 Training Data
The model was trained on ~2,000 Indian court judgment pairs curated and labeled from:
- ILSum — Indian Legal Summarization dataset (Supreme Court judgments)
- InLegalNLP — Indian Legal NLP benchmark corpus
Labels were auto-generated using Gemini 2.5 Flash as a labelling oracle on the raw judgment texts, following the canonical extraction schema, then validated for JSON structure and field completeness.
⚠️ Limitations
- Not for legal advice: This model extracts structured information only. It does not provide legal opinions or advice. Always consult a qualified lawyer for legal matters.
- Pre-1950 judgments: May perform poorly on archaic legal language from older judgments.
- Hindi/regional language text: Primarily trained on English-language judgments; performance degrades on mixed-language or vernacular text.
- Scanned/handwritten PDFs: Model accepts only clean text input — OCR preprocessing is required for scanned documents.
- Citation hallucination: Lower than the Gemini 2.5 Flash baseline (42.7% vs 77.5% on the held-out test set), but the model still generates plausible-but-incorrect section numbers. Always validate critical citations against primary sources.
- Novel statutes: Statutes not well-represented in training data (e.g., recent 2023–24 acts) may have lower extraction accuracy.
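One cheap first pass at the citation check mentioned above is to confirm that every extracted section number actually occurs in the source text. The sketch below is a heuristic under that assumption; `ungrounded_sections` is a hypothetical helper, and a match only shows the number appears somewhere in the judgment, not that the statute is attributed correctly. Sub-section forms such as "302(1)" are not handled.

```python
import re


def ungrounded_sections(statutes_cited, judgment_text):
    """Return cited section numbers that never appear in the judgment text."""
    # Collect number-like tokens from the text, e.g. "302", "34", "498A".
    tokens = {
        token.upper()
        for token in re.findall(r"\b\d+[A-Za-z]*\b", judgment_text)
    }
    missing = []
    for entry in statutes_cited:
        section = str(entry.get("section", "")).strip().upper()
        if section and section not in tokens:
            missing.append(section)
    return missing


text = "charges under Section 302 IPC read with Section 34 IPC"
cited = [
    {"act": "Indian Penal Code", "section": "302"},
    {"act": "Indian Penal Code", "section": "34"},
    {"act": "Indian Penal Code", "section": "307"},  # not in the text
]
print(ungrounded_sections(cited, text))
```

Sections flagged this way are candidates for manual verification against the primary source.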
✅ Intended Use
- ⚖️ Legal research and document processing automation
- 🤖 Paralegal workflow tools and legal analytics dashboards
- 📖 Academic research on Indian legal NLP
- 🔍 Building legal search and knowledge graph systems
- 📊 Bulk digitization of case records
❌ Out-of-Scope Use
- Providing legal advice to individuals
- Making or influencing judicial decisions
- Use in actual legal proceedings without qualified human review
- Any high-stakes decision-making without validation
📄 Output Schema
{
  "case_name": str,                 # "Petitioner v. Respondent"
  "citation": str | None,           # "AIR 1997 SC 3986" or null
  "court": str,                     # Full court name
  "year": int | None,               # 4-digit year
  "petitioner": str,
  "respondent": str,
  "subject_matter": str | None,     # Criminal | Civil | Constitutional | ...
  "statutes_cited": [{"act": str, "section": str, "description": str}],
  "precedents_cited": [{"citation": str, "case_name": str | None}],
  "legal_issues": [str],
  "holding": str,                   # 1-3 sentence summary
  "outcome": str                    # dismissed | allowed | disposed | remanded | modified
}
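The schema above can be checked mechanically after parsing. The following is a minimal sketch using only the standard library; `validate_extraction` is a hypothetical helper, and treating the five listed `outcome` values as a closed set is an assumption based on the schema comment. It checks top-level fields only, not the items inside the lists.

```python
ALLOWED_OUTCOMES = {"dismissed", "allowed", "disposed", "remanded", "modified"}

# field name -> (expected type, whether null is allowed)
FIELD_TYPES = {
    "case_name": (str, False),
    "citation": (str, True),
    "court": (str, False),
    "year": (int, True),
    "petitioner": (str, False),
    "respondent": (str, False),
    "subject_matter": (str, True),
    "statutes_cited": (list, False),
    "precedents_cited": (list, False),
    "legal_issues": (list, False),
    "holding": (str, False),
    "outcome": (str, False),
}


def validate_extraction(record: dict) -> list[str]:
    """Return a list of schema problems; an empty list means the record passes."""
    errors = []
    for field, (expected, nullable) in FIELD_TYPES.items():
        if field not in record:
            errors.append(f"missing field: {field}")
            continue
        value = record[field]
        if value is None:
            if not nullable:
                errors.append(f"{field} must not be null")
        elif isinstance(value, bool) or not isinstance(value, expected):
            # bool is a subclass of int in Python, so reject it explicitly.
            errors.append(f"{field} should be {expected.__name__}")
    outcome = record.get("outcome")
    if isinstance(outcome, str) and outcome not in ALLOWED_OUTCOMES:
        errors.append(f"unexpected outcome: {outcome}")
    return errors


example = {
    "case_name": "State of Punjab v. Gurpreet Singh",
    "citation": None,
    "court": "Supreme Court of India",
    "year": 2022,
    "petitioner": "State of Punjab",
    "respondent": "Gurpreet Singh",
    "subject_matter": "Criminal",
    "statutes_cited": [],
    "precedents_cited": [],
    "legal_issues": [],
    "holding": "Appeal dismissed.",
    "outcome": "dismissed",
}
print(validate_extraction(example))
```

Returning a list of messages rather than raising on the first problem makes it easier to log every issue for a record during bulk processing.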
🔗 Related Resources
- 🐙 GitHub: nyaya-legal-ai — Full project source code including FastAPI backend, LangGraph pipeline, and Streamlit UI
- 📊 Base Model: mistralai/Mistral-7B-Instruct-v0.3
- 📁 Dataset (ILSum): d0r1h/ILSum
📝 Citation
If you use Nyaya-7B in your research or applications, please cite:
@misc{nyaya7b2026,
title = {Nyaya-7B: A QLoRA Fine-tuned LLM for Indian Legal Judgment Parsing},
author = {Shubham Suman},
year = {2026},
url = {https://huggingface.co/mrroyaleace/nyaya-7b},
note = {Fine-tuned from mistralai/Mistral-7B-Instruct-v0.3 on 2,000+ Indian SC/HC judgments}
}
Built with ❤️ for the Indian legal research community. Nyaya-7B is open-source and free to use under the Apache 2.0 license.