🏛️ Nyaya-7B — Indian Legal Judgment Parser

Nyaya-7B is a domain-adapted, instruction-finetuned version of Mistral-7B-Instruct-v0.3, trained on 10,000+ Indian Supreme Court and High Court judgments to extract structured legal information into clean, validated JSON — at zero API cost, fully offline.

"Nyaya" (न्याय) means justice in Sanskrit and Hindi.

🎯 What it does

Given raw Indian court judgment text, Nyaya-7B extracts a full structured JSON covering:

| Field | Description |
| --- | --- |
| `case_name` | Petitioner v. Respondent |
| `citation` | AIR / SCC / SCR citation |
| `court` | Full court name (Supreme Court, High Court, etc.) |
| `year` | Year of judgment |
| `petitioner` / `respondent` | Party names |
| `subject_matter` | Criminal / Civil / Constitutional / Tax / ... |
| `statutes_cited` | List of Acts + Sections + descriptions |
| `precedents_cited` | AIR/SCC citations with case names |
| `legal_issues` | Issues framed by the court |
| `holding` | Court's decision and reasoning (1–3 sentences) |
| `outcome` | dismissed / allowed / disposed / remanded / modified |

📊 Benchmark Results

Evaluated on a 50-case held-out test set of Indian SC/HC judgments, benchmarked head-to-head against Gemini 2.5 Flash:

| Metric | Gemini 2.5 Flash | Nyaya-7B | Winner |
| --- | --- | --- | --- |
| Statute F1 | 0.227 | 0.425 | 🏆 Nyaya-7B (+87%) |
| Outcome Accuracy | 0.20 | 0.64 | 🏆 Nyaya-7B (+220%) |
| Hallucination Rate ↓ | 0.775 | 0.427 | 🏆 Nyaya-7B (−45%) |
| JSON Validity | 0.98 | 0.86 | Gemini Flash |
| Field Coverage | 0.893 | 0.855 | Gemini Flash |
| Cost per Judgment | ~$0.0001 | $0.00 | 🏆 Nyaya-7B |

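On this test set, Nyaya-7B reaches 3.2× the outcome classification accuracy of Gemini 2.5 Flash (0.64 vs 0.20) and a 45% lower hallucination rate (0.427 vs 0.775), while running entirely offline at zero API cost. Gemini 2.5 Flash remains ahead on JSON validity and field coverage.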
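The percentages in the Winner column follow from the raw scores as relative changes against the Gemini 2.5 Flash value. A minimal sketch of that arithmetic (the helper name `relative_change` is illustrative and not part of the project):

```python
def relative_change(baseline: float, candidate: float) -> float:
    """Return the change from baseline to candidate as a fraction of baseline."""
    return (candidate - baseline) / baseline


# (Gemini 2.5 Flash, Nyaya-7B) scores copied from the benchmark table.
scores = {
    "Statute F1": (0.227, 0.425),
    "Outcome Accuracy": (0.20, 0.64),
    "Hallucination Rate": (0.775, 0.427),
}

for metric, (gemini, nyaya) in scores.items():
    print(f"{metric}: {relative_change(gemini, nyaya):+.0%}")
# Statute F1: +87%
# Outcome Accuracy: +220%
# Hallucination Rate: -45%
```

A +220% change is the same statement as "3.2× the baseline", since 0.64 / 0.20 = 3.2.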

🚀 Quick Start

Installation

pip install transformers torch accelerate bitsandbytes

Load & Run (GPU, 4-bit quantized)

import json

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, pipeline

MODEL_ID = "MrRoyaleAce/nyaya-7b"  # Hugging Face repo ID

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",
)

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

Extract structured data from a judgment

judgment_text = """
IN THE SUPREME COURT OF INDIA
Criminal Appeal No. 1234 of 2022

State of Punjab                         ...Appellant
Versus
Gurpreet Singh                          ...Respondent

JUDGMENT

The appellant challenges the High Court's order acquitting the respondent 
of charges under Section 302 IPC read with Section 34 IPC...
"""

messages = [
    {"role": "system", "content": "You are Nyaya, a specialized Indian legal extraction model."},
    {"role": "user", "content": f"Extract structured data from this judgment and return JSON:\n\n{judgment_text}"}
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

raw_output = pipe(
    prompt,
    max_new_tokens=512,
    do_sample=False,
    return_full_text=False,
    pad_token_id=tokenizer.eos_token_id,
)[0]["generated_text"]

result = json.loads(raw_output.strip())
print(result)

Expected Output

{
  "case_name": "State of Punjab v. Gurpreet Singh",
  "citation": null,
  "court": "Supreme Court of India",
  "year": 2022,
  "petitioner": "State of Punjab",
  "respondent": "Gurpreet Singh",
  "subject_matter": "Criminal",
  "statutes_cited": [
    {"act": "Indian Penal Code", "section": "302", "description": "Punishment for murder"},
    {"act": "Indian Penal Code", "section": "34",  "description": "Acts done by several persons in furtherance of common intention"}
  ],
  "precedents_cited": [],
  "legal_issues": [
    "Whether the High Court was justified in acquitting the respondent under Section 302 IPC?"
  ],
  "holding": "The Supreme Court examined the evidence and found the High Court's reasoning sound...",
  "outcome": "dismissed"
}
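The benchmark reports a JSON validity of 0.86, so a bare `json.loads` call on the model output can raise for some judgments. One way to handle that is sketched below, on the assumption that most failures come from extra text around a single JSON object. The helper `parse_model_json` is hypothetical and not part of the project.

```python
import json
from typing import Optional


def parse_model_json(raw_output: str) -> Optional[dict]:
    """Best-effort parse of a model response that should contain one JSON object."""
    text = raw_output.strip()
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        # Fall back to the outermost {...} span, dropping leading or trailing chatter.
        start, end = text.find("{"), text.rfind("}")
        if start == -1 or end <= start:
            return None
        try:
            parsed = json.loads(text[start:end + 1])
        except json.JSONDecodeError:
            return None
    # The schema is a JSON object, so reject arrays, strings, and numbers.
    return parsed if isinstance(parsed, dict) else None
```

A `None` result can then be logged or retried instead of crashing a batch job partway through.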

CPU Inference (no GPU required)

model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float32,
    device_map="cpu",
    low_cpu_mem_usage=True,
)
# Note: CPU inference is significantly slower (~5–15 min per judgment)
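To gauge whether a machine can hold the weights, a back-of-the-envelope estimate is parameter count times bytes per parameter. The sketch below assumes a round 7 billion parameters, as the model name suggests, and ignores activations, the KV cache, and quantization overhead, so real usage will be higher.

```python
BYTES_PER_PARAM = {"float32": 4.0, "float16": 2.0, "nf4 (4-bit)": 0.5}

NUM_PARAMS = 7e9  # assumption: nominal parameter count


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate size of the weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


for dtype, nbytes in BYTES_PER_PARAM.items():
    print(f"{dtype}: ~{weight_memory_gb(NUM_PARAMS, nbytes):.1f} GB")
# float32: ~28.0 GB
# float16: ~14.0 GB
# nf4 (4-bit): ~3.5 GB
```

By this estimate the float32 CPU path needs roughly 28 GB of RAM for weights alone, while the 4-bit GPU path fits comfortably on a single 16 GB card.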

🔧 Training Details

| Parameter | Value |
| --- | --- |
| Base model | mistralai/Mistral-7B-Instruct-v0.3 |
| Fine-tuning method | QLoRA (Quantized Low-Rank Adaptation) |
| Quantization | 4-bit NF4 with double quantization |
| LoRA rank | 16 |
| LoRA alpha | 32 |
| LoRA dropout | 0.05 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Training samples | ~10,000 Indian SC/HC judgment pairs |
| Epochs | 3 |
| Effective batch size | 16 (batch 2 × grad_accum 8) |
| Learning rate | 2e-4 |
| LR scheduler | Cosine |
| Optimizer | paged_adamw_8bit |
| Max sequence length | 2048 tokens |
| Hardware | Kaggle T4 × 2 (32 GB VRAM total) |
| Training time | ~8 hours |
| Compute dtype | float16 |
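The hyperparameters in the table map fairly directly onto PEFT and Transformers configuration objects. The following is a reconstruction from the table, not the author's training script. The `output_dir` value and the `bias` and `task_type` settings are assumptions, and the libraries are imported lazily so the constants can be inspected without them installed.

```python
# Values copied from the Training Details table unless marked as assumptions.
LORA_KWARGS = dict(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    bias="none",            # assumption: not stated in the table
    task_type="CAUSAL_LM",  # assumption: usual choice for a decoder-only model
)

TRAINING_KWARGS = dict(
    output_dir="nyaya-7b-qlora",  # hypothetical path
    num_train_epochs=3,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    optim="paged_adamw_8bit",
    fp16=True,
)


def effective_batch_size(kwargs: dict) -> int:
    """Per-device batch size times gradient accumulation steps (single process)."""
    return kwargs["per_device_train_batch_size"] * kwargs["gradient_accumulation_steps"]


def build_configs():
    """Create the config objects; needs torch, transformers, peft, and bitsandbytes."""
    import torch
    from peft import LoraConfig
    from transformers import BitsAndBytesConfig, TrainingArguments

    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=True,
        bnb_4bit_compute_dtype=torch.float16,
    )
    return quant_config, LoraConfig(**LORA_KWARGS), TrainingArguments(**TRAINING_KWARGS)
```

The 2048-token maximum sequence length is not set here; it would normally be applied at tokenization time or through the trainer in use.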

Training Infrastructure

- Fine-tuned on Kaggle T4 × 2 GPU notebooks
- Monitored with Weights & Biases (JSON validity rate logged per epoch)
- Adapter merged into base model weights with `merge_and_unload()` for zero inference overhead
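Merging folds the LoRA update into the base weights so that inference needs no PEFT wrapper. A sketch of that step using PEFT's `PeftModel.from_pretrained` and `merge_and_unload()` follows. The adapter and output directories are hypothetical, and loading the base model in float16 before merging is an assumption rather than something stated here.

```python
def merge_adapter(base_id: str, adapter_dir: str, output_dir: str) -> None:
    """Merge a LoRA adapter into its base model and save standalone weights."""
    # Heavy dependencies are imported here so the module loads without them.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)
    model = PeftModel.from_pretrained(base, adapter_dir)
    merged = model.merge_and_unload()  # base model with the LoRA deltas applied

    merged.save_pretrained(output_dir, safe_serialization=True)
    AutoTokenizer.from_pretrained(base_id).save_pretrained(output_dir)


# Example call (paths are placeholders):
# merge_adapter("mistralai/Mistral-7B-Instruct-v0.3", "checkpoints/nyaya-lora", "nyaya-7b-merged")
```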

📚 Training Data

The model was trained on ~2,000 Indian court judgment pairs curated and labeled from:

- ILSum: Indian Legal Summarization dataset (Supreme Court judgments)
- InLegalNLP: Indian Legal NLP benchmark corpus

Labels were auto-generated using Gemini 2.5 Flash as a labelling oracle on the raw judgment texts, following the canonical extraction schema, then validated for JSON structure and field completeness.
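The completeness check is not described in detail. One plausible version, sketched here as an assumption, counts how many of the twelve schema fields are present and non-null. Empty lists count as filled because a judgment may legitimately cite no precedents.

```python
SCHEMA_FIELDS = (
    "case_name", "citation", "court", "year", "petitioner", "respondent",
    "subject_matter", "statutes_cited", "precedents_cited", "legal_issues",
    "holding", "outcome",
)


def field_coverage(record: dict) -> float:
    """Fraction of schema fields present with a non-null, non-blank value."""
    def filled(value) -> bool:
        if value is None:
            return False
        if isinstance(value, str):
            return bool(value.strip())
        return True  # numbers and lists (even empty ones) count as filled

    return sum(filled(record.get(name)) for name in SCHEMA_FIELDS) / len(SCHEMA_FIELDS)
```

A labeling pipeline could drop or re-label records whose coverage falls below a chosen threshold; the threshold itself would be a project decision.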

⚠️ Limitations

- Not for legal advice: This model extracts structured information only. It does not provide legal opinions or advice. Always consult a qualified lawyer for legal matters.
- Pre-1950 judgments: May perform poorly on the archaic legal language of older judgments.
- Hindi/regional language text: Trained primarily on English-language judgments; performance degrades on mixed-language or vernacular text.
- Scanned/handwritten PDFs: The model accepts only clean text input; scanned documents require OCR preprocessing first.
- Citation hallucination: Lower than Gemini 2.5 Flash on the 50-case test set (42.7% vs 77.5%) but still substantial. The model can generate plausible-but-incorrect section numbers, so always validate critical citations against primary sources.
- Novel statutes: Statutes not well represented in the training data (e.g., recent 2023–24 Acts) may have lower extraction accuracy.
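Part of the citation validation recommended above can be automated with a cheap grounding check that flags any extracted section number never appearing in the judgment text. This is a heuristic sketch, not the project's hallucination metric. The function name and the token-boundary matching rule are assumptions, and the check cannot catch a real section attributed to the wrong Act.

```python
import re


def ungrounded_sections(judgment_text: str, statutes_cited: list) -> list:
    """Return the statute entries whose section number is absent from the source text."""
    flagged = []
    for entry in statutes_cited:
        section = str(entry.get("section", "")).strip()
        # Match the section as a standalone token so "34" does not match inside "1234".
        pattern = rf"(?<![0-9A-Za-z]){re.escape(section)}(?![0-9A-Za-z])"
        if not section or not re.search(pattern, judgment_text):
            flagged.append(entry)
    return flagged
```

Flagged entries are candidates for manual review against the primary source rather than automatic deletion.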

✅ Intended Use

⚖️ Legal research and document processing automation
🤖 Paralegal workflow tools and legal analytics dashboards
📖 Academic research on Indian legal NLP
🔍 Building legal search and knowledge graph systems
📊 Bulk digitization of case records

❌ Out-of-Scope Use

- Providing legal advice to individuals
- Making or influencing judicial decisions
- Use in actual legal proceedings without qualified human review
- Any high-stakes decision-making without validation

📄 Output Schema

{
  "case_name":         str,                          # "Petitioner v. Respondent"
  "citation":          str | None,                   # "AIR 1997 SC 3986" or null
  "court":             str,                          # Full court name
  "year":              int | None,                   # 4-digit year
  "petitioner":        str,
  "respondent":        str,
  "subject_matter":    str | None,                   # Criminal | Civil | Constitutional | ...
  "statutes_cited":    [{"act": str, "section": str, "description": str}],
  "precedents_cited":  [{"citation": str, "case_name": str | None}],
  "legal_issues":      [str],
  "holding":           str,                          # 1-3 sentence summary
  "outcome":           str                           # dismissed | allowed | disposed | remanded | modified
}
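The schema can be enforced at runtime with a small validator. The sketch below uses only the standard library. The function name `validate_record` and the choice to return a list of error strings are illustrative, and it checks only top-level types plus the `outcome` label set, not the shape of nested list items.

```python
ALLOWED_OUTCOMES = {"dismissed", "allowed", "disposed", "remanded", "modified"}

# Field name -> Python types accepted after json.loads.
FIELD_TYPES = {
    "case_name": (str,),
    "citation": (str, type(None)),
    "court": (str,),
    "year": (int, type(None)),
    "petitioner": (str,),
    "respondent": (str,),
    "subject_matter": (str, type(None)),
    "statutes_cited": (list,),
    "precedents_cited": (list,),
    "legal_issues": (list,),
    "holding": (str,),
    "outcome": (str,),
}


def validate_record(record: dict) -> list:
    """Return human-readable schema violations; an empty list means the record passed."""
    errors = []
    for name, types in FIELD_TYPES.items():
        if name not in record:
            errors.append(f"missing field: {name}")
        # bool is a subclass of int, so reject it explicitly (e.g. "year": true).
        elif isinstance(record[name], bool) or not isinstance(record[name], types):
            errors.append(f"wrong type for {name}: {type(record[name]).__name__}")
    outcome = record.get("outcome")
    if isinstance(outcome, str) and outcome not in ALLOWED_OUTCOMES:
        errors.append(f"unexpected outcome: {outcome}")
    return errors
```

Returning all violations at once, rather than raising on the first, makes it easier to log why a given judgment failed validation.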

🔗 Related Resources

🐙 GitHub: nyaya-legal-ai — Full project source code including FastAPI backend, LangGraph pipeline, and Streamlit UI
📊 Base Model: mistralai/Mistral-7B-Instruct-v0.3
📁 Dataset (ILSum): d0r1h/ILSum

📝 Citation

If you use Nyaya-7B in your research or applications, please cite:

@misc{nyaya7b2026,
  title   = {Nyaya-7B: A QLoRA Fine-tuned LLM for Indian Legal Judgment Parsing},
  author  = {Shubham Suman},
  year    = {2026},
  url     = {https://huggingface.co/mrroyaleace/nyaya-7b},
  note    = {Fine-tuned from mistralai/Mistral-7B-Instruct-v0.3 on 2,000+ Indian SC/HC judgments}
}

Built with ❤️ for the Indian legal research community. Nyaya-7B is open-source and free to use under the Apache 2.0 license.
