⚖️ JurisSim

A Neuro-Symbolic Legal Auditor & Formal Verification Agent

📖 Overview

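JurisSim-32B v3.1 is a neuro-symbolic legislative stress-tester designed for the AMD Instinct MI300X. It translates natural-language legal clauses into Z3 constraints (SMT-LIB) so that a solver can search for adversarial loopholes. A loophole reported by the solver is a concrete, checkable counterexample for the formalized clause, although the result is only as faithful as the translation itself.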

🚀 Version 3.1: Dual-Agent Swarm

We have moved beyond a single model to a Multi-Agent Feedback Loop:

The Auditor (Qwen3-32B): Translates legalese into symbolic logic.
The Skeptic (Qwen2.5-7B): A dedicated linter agent that dry-runs the generated Z3 code and performs real-time self-correction on syntax errors.
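
The dry-run and repair cycle between the two agents can be sketched roughly as follows. This is a minimal illustration, not the project's actual loop: `dry_run`, `lint_loop`, and `toy_repair` are hypothetical names, the repair callable stands in for the Skeptic model, and `compile()` only checks Python syntax rather than executing the Z3 program.

```python
from typing import Callable, Optional, Tuple


def dry_run(source: str) -> Optional[str]:
    """Return None if `source` is syntactically valid Python, else an error message."""
    try:
        compile(source, "<generated>", "exec")
    except SyntaxError as exc:
        return f"line {exc.lineno}: {exc.msg}"
    return None


def lint_loop(
    source: str,
    repair: Callable[[str, str], str],
    max_rounds: int = 3,
) -> Tuple[str, bool]:
    """Alternate dry runs and repairs until the code compiles or rounds run out."""
    for _ in range(max_rounds):
        error = dry_run(source)
        if error is None:
            return source, True
        # In the real system this call would go to the linter model (assumption).
        source = repair(source, error)
    return source, dry_run(source) is None


def toy_repair(source: str, error: str) -> str:
    """Hypothetical stand-in for the Skeptic: close one unbalanced parenthesis."""
    return source + ")"


fixed, ok = lint_loop("x = max(1, 2", toy_repair)
```

Capping the number of rounds keeps a model that cannot fix its own output from looping forever; the caller can then surface the last error instead.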

🛠 Features

Universal Logic Bridge: Pre-trained on abstract logical archetypes (Thresholds, Temporal Loops, Circularity).
Architecture v2.0 Sandbox: Automatic injection of physical invariants (Budget sums, Time-forward flow).
Agentic Linter: Automated repair of Z3 code hallucinations.
MI300X Optimized: Fully utilizes ROCm 6.2 and 192GB VRAM for high-fidelity reasoning.
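
As a rough illustration of invariant injection, the sketch below adds budget and time-ordering assertions to an SMT-LIB script before the solver is asked for a result. The function name, the variable names (`t_start`, `t_end`, `spent_a`, `spent_b`, `budget`), and the insertion strategy are all assumptions made for this example; the sandbox's real mechanism is not documented here.

```python
from typing import List

# Hypothetical invariants: time moves forward, and spending cannot exceed the budget.
INVARIANTS = [
    "(<= t_start t_end)",
    "(<= (+ spent_a spent_b) budget)",
]


def inject_invariants(script: str, invariants: List[str]) -> str:
    """Insert one (assert ...) per invariant just before the first (check-sat)."""
    lines = script.strip().splitlines()
    asserts = [f"(assert {inv})" for inv in invariants]
    for i, line in enumerate(lines):
        if line.strip() == "(check-sat)":
            return "\n".join(lines[:i] + asserts + lines[i:])
    # No (check-sat) found: append the invariants at the end.
    return "\n".join(lines + asserts)


script = """
(declare-const t_start Int)
(declare-const t_end Int)
(declare-const spent_a Int)
(declare-const spent_b Int)
(declare-const budget Int)
(assert (> spent_a 0))
(check-sat)
"""

sandboxed = inject_invariants(script, INVARIANTS)
```

Placing the assertions before `(check-sat)` means the solver can only return models that respect them, which rules out "loopholes" that depend on time running backwards or money appearing from nowhere.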

📦 Deployment & Portability

To switch to another device (e.g., another MI300X or a high-VRAM workstation):

Clone the Repository:

```
git clone https://github.com/Mark-Joseph-42/JurisSim.git
cd JurisSim
```

Download the Weights: The merged 32B weights are available at markjoseph2003/JurisSim-32B-v3.
```
huggingface-cli download markjoseph2003/JurisSim-32B-v3 --local-dir jurissim-merged-v1
```
Run the Swarm:
```
python app.py
```

⚖️ Logical Archetypes Covered

Threshold Splitting: Identifying bypassed numeric limits.
Temporal Paradoxes: Detecting impossible timelines.
Jurisdictional Null-Zones: Set-theory conflicts between overlapping authorities.
Circular Deadlocks: Structural contradictions in statute drafting.
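
To make the first archetype concrete, here is a small threshold-splitting sketch. It assumes a made-up clause ("any single transfer of 10,000 or more must be reported") and asks whether a larger total can be moved in two transfers that each stay under the limit. `smtlib_split_query` builds the SMT-LIB text a solver such as Z3 could check, and `find_split` is a brute-force stand-in for the solver so the example runs with the standard library alone. Both function names and the clause are illustrative assumptions.

```python
from typing import Optional, Tuple


def smtlib_split_query(total: int, limit: int) -> str:
    """SMT-LIB script that is satisfiable exactly when a two-way split evades the limit."""
    return "\n".join([
        "(declare-const t1 Int)",
        "(declare-const t2 Int)",
        "(assert (> t1 0))",
        "(assert (> t2 0))",
        f"(assert (= (+ t1 t2) {total}))",
        f"(assert (< t1 {limit}))",
        f"(assert (< t2 {limit}))",
        "(check-sat)",
    ])


def find_split(total: int, limit: int) -> Optional[Tuple[int, int]]:
    """Brute-force search for two positive transfers, each below `limit`, summing to `total`."""
    for t1 in range(1, total):
        t2 = total - t1
        if t1 < limit and t2 < limit:
            return t1, t2
    return None


query = smtlib_split_query(15000, 10000)
witness = find_split(15000, 10000)   # a pair means the clause can be bypassed
no_witness = find_split(20000, 10000)  # None: two transfers under 10,000 cannot reach 20,000
```

A `sat` answer to the generated script corresponds to `find_split` returning a pair: the model values for `t1` and `t2` are the loophole.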

📚 Technical Documentation & Architecture

1. Hardware & Environment Stack

Compute: AMD Instinct™ MI300X Accelerator (192GB VRAM)
Software Stack: ROCm 6.2, PyTorch 2.5.1
Base Model: Qwen/Qwen3-32B
Training Method: QLoRA (Quantized Low-Rank Adaptation) using bitsandbytes

2. The Fine-Tuning Pipeline & ROCm Optimizations

Training a 32-billion-parameter model on a single MI300X GPU required careful configuration to avoid numerical overflow and out-of-memory (OOM) crashes.

The PyTorch SDPA ROCm Bug: During initial training runs, PyTorch's default sdpa (Scaled Dot-Product Attention) implementation overflowed on ROCm 6.2 when sequences containing padding tokens were combined with gradient checkpointing, producing NaN gradients.
The "Ultra-Stable" Workaround: Because compiling flash_attention_2 for ROCm from source is time-prohibitive, we settled on a stable configuration that fits within the 192GB of VRAM without triggering the bug:
1. Attention: Reverted to PyTorch's native eager attention implementation to avoid the sdpa overflow.
2. Memory Compression: Set per_device_train_batch_size=1 and gradient_accumulation_steps=8, giving an effective batch size of 8.
3. Checkpointing: Enabled gradient_checkpointing=True so that the eager attention matrices do not exhaust the 192GB of VRAM during the backward pass.
4. Evaluation Safety: Set per_device_eval_batch_size=1 so that the validation phase, which runs without checkpointing, does not run out of memory.
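
The four settings above can be collected in one place. The sketch below uses plain dictionaries so that it runs without any dependencies; the keys mirror the Hugging Face `TrainingArguments` field names and the `attn_implementation` keyword of `from_pretrained`, but how the project's own training script passes them is an assumption. `effective_batch_size` is a helper written for this example.

```python
# Settings taken from the workaround described above.
training_config = {
    "per_device_train_batch_size": 1,   # one sequence per forward pass
    "gradient_accumulation_steps": 8,   # accumulate 8 micro-batches per optimizer step
    "gradient_checkpointing": True,     # recompute activations in the backward pass
    "per_device_eval_batch_size": 1,    # evaluation has no checkpointing, keep it small
}

# Passed when loading the model, not to the trainer (assumed wiring).
model_load_config = {
    "attn_implementation": "eager",     # avoid the sdpa path that produced NaNs
}


def effective_batch_size(cfg: dict, num_devices: int = 1) -> int:
    """Sequences contributing to each optimizer step."""
    return (
        cfg["per_device_train_batch_size"]
        * cfg["gradient_accumulation_steps"]
        * num_devices
    )


batch = effective_batch_size(training_config)
```

Gradient accumulation trades wall-clock time for memory: the optimizer still sees gradients averaged over 8 sequences, but only one sequence's activations are resident at a time.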

3. Training Results

The model successfully converged after 3 Epochs (1,713 Steps).

Final Train Loss: 1.756
Final Validation Loss: 1.676 (below the training loss, so there is no sign of overfitting on the validation set).
Token Prediction Accuracy: 62.61% (token-level accuracy on Legal English → Python Z3 logic translation).
Gradient Stability: grad_norm stayed between roughly 0.3 and 0.8 throughout the run.
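
For readers who prefer perplexity to raw loss, the reported numbers can be converted as below. This assumes the losses are mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated explicitly above; `perplexity` is a helper defined for this example.

```python
import math

# Figures reported in the training results above.
final_train_loss = 1.756
final_val_loss = 1.676
total_steps = 1713
epochs = 3


def perplexity(loss: float) -> float:
    """Perplexity for a mean cross-entropy loss measured in nats."""
    return math.exp(loss)


train_ppl = perplexity(final_train_loss)
val_ppl = perplexity(final_val_loss)

# Optimizer steps per epoch, assuming every epoch has the same number of steps.
steps_per_epoch = total_steps // epochs

print(f"train perplexity: {train_ppl:.2f}")
print(f"validation perplexity: {val_ppl:.2f}")
print(f"steps per epoch: {steps_per_epoch}")
```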

🚀 Quickstart & Installation

1. Clone the Repository

```
git clone https://github.com/Mark-Joseph-42/JurisSim.git
cd JurisSim
```

2. Install Dependencies

Ensure you are running on an AMD machine with ROCm installed.

```
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
# Ensure bitsandbytes is configured for ROCm
pip install https://github.com/bitsandbytes-foundation/bitsandbytes/releases/download/v0.44.1/bitsandbytes-0.44.1-py3-none-manylinux_2_24_x86_64.whl
```

3. Download the Fine-Tuned Model

The LoRA adapters are publicly available on Hugging Face:

```
huggingface-cli download markjoseph2003/JurisSim-32B-LoRA --local-dir ./jurissim-lora
```

4. Run the Agent (Coming Soon!)

(The Agentic Execution loop and Frontend UI are currently being finalized. Instructions for launching the UI will be placed here shortly).

🛠️ Hugging Face Space

The interactive demo of JurisSim will be deployed as a Hugging Face Space for the hackathon judging. The link will be provided upon completion of the UI.

Built with ❤️ for the AMD Developer Hackathon
