Antahkarana-7B
A lifelong-learning language model that learns new domains without forgetting old ones, and abstains instead of hallucinating when it is unsure — built on Mistral-7B with an AI architecture derived from the 2,500-year-old Vedic model of mind (the antaḥkaraṇa, "inner instrument").
Author: Deepak Soni · License: Apache-2.0 · Base model: Mistral-7B-v0.1
This is a standalone, full-weights 7B model: load it with a plain from_pretrained — no adapter, no PEFT.
The continual-learning architecture was trained in via LoRA and then merged into the base weights.
📦 Model family
| Model | Description |
|---|---|
| antahkarana-v1 | the original architecture + v1 vision models — the most stable continual learner (only positive backward transfer) |
| antahkarana-v2 | accuracy-recovering v2 (36.5M) — matches SOTA accuracy at ~3× less forgetting |
| antahkarana-7B | the architecture scaled to a 7B language model |
At a glance — what makes this different
Standard fine-tuning suffers catastrophic forgetting: teach a model a new task and it loses the old one. Antahkarana-7B is trained with a small set of cognitive "faculties," each derived from a Vedic concept and implemented as a concrete mechanism:
| Faculty (Vedic) | Mechanism (ML) | What it does |
|---|---|---|
| saṃskāra | Fisher-importance consolidation + decay over LoRA | protects what mattered for old domains → don't forget |
| vijñāna-smṛti | dark-knowledge / exemplar replay | rehearses past domains while learning new ones |
| pramāṇa | calibrated-confidence gate | abstains ("I'm not sure") instead of hallucinating |
| manas / buddhi | two decorrelated views, cross-teaching | safe self-learning from unlabeled data (research track) |
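The saṃskāra row describes Fisher-importance consolidation with decay. The sketch below illustrates that general idea as an EWC-style quadratic penalty. It assumes a diagonal Fisher estimate built from squared gradients and an exponential decay of older importance. The function names and the `decay` and `strength` values are hypothetical, and this is not the author's implementation.

```python
def update_importance(omega, grads, decay=0.9):
    """Decay old importance and add the new squared-gradient (diagonal Fisher) estimate."""
    return [decay * o + g * g for o, g in zip(omega, grads)]


def consolidation_penalty(params, anchor, omega, strength=1.0):
    """Quadratic penalty that grows when important parameters drift from their anchor."""
    return 0.5 * strength * sum(
        o * (p - a) ** 2 for p, a, o in zip(params, anchor, omega)
    )


# Toy example with three adapter parameters after finishing a first domain.
anchor = [0.5, -1.0, 2.0]  # values saved at the end of the old domain (illustrative)
omega = update_importance([0.0, 0.0, 0.0], grads=[2.0, 0.1, 0.0])

# Moving the first (important) parameter costs far more than moving the third.
drift_important = consolidation_penalty([1.5, -1.0, 2.0], anchor, omega)
drift_unimportant = consolidation_penalty([0.5, -1.0, 3.0], anchor, omega)
print(drift_important, drift_unimportant)
```

In a training loop this penalty would be added to the task loss for the new domain, so parameters that mattered for earlier domains resist change while unimportant ones stay free to move.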
How it works
The borrowed mind (Mistral-7B) stays frozen as the stable core (śruti); a small trainable instrument (chitta = LoRA, ~0.2% of params) learns new domains, guided by the faculties, and the pramāṇa gate decides whether to answer or abstain.
Measured outcome (continual instruction-tuning, 4 domains, 3 seeds): ~3.8× less forgetting than naive LoRA, with higher and far more stable accuracy.
The journey: from a 2,500-year-old architecture to a 7B model
This model is the production endpoint of a multi-stage research-to-engineering program.
1. The architecture. The Vedic tradition describes the mind as an antaḥkaraṇa — an "inner instrument" of distinct faculties (chitta/memory, manas/perception, buddhi/discernment, ahaṃkāra/identity, plus pramāṇa/valid knowledge and the guṇa dynamics). Each faculty was mapped to a concrete, testable ML mechanism.
2. Research validation (vision, 36–52M params). The mechanisms were first proven on continual-learning image benchmarks (Split-CIFAR-100, Split-Tiny-ImageNet) against the field's standard methods (EWC, ER, DER++): the architecture was the most stable learner tested and the only one with positive backward transfer, with a clean ablation showing each Vedic-derived component adds value.
3. Scaling on a frozen modern backbone (E1–E2). On a frozen ViT-B/16, the consolidation works in adapter space, matching the SOTA (DER++) on accuracy while forgetting less, and extends to the harder class-incremental setting with label-free novelty detection (avidyā).
4. Self-learning and memory (E-S, śruti/smṛti/nidrā). The model learns from unlabeled data via decorrelated co-training and reaches near-supervised accuracy from ~2% labels; a complementary study showed an external "smṛti" memory + periodic "sleep" consolidation retains knowledge ~2.4× better than holding it in weights.
5. The 7B model (E7). The architecture was ported to language: frozen Mistral-7B + LoRA + saṃskāra + vijñāna-smṛti + pramāṇa, continually instruction-tuned across four domains with checkpointing, then merged into the standalone 7B model published here.
Results
Continual instruction-tuning — naive LoRA vs Antaḥkaraṇa-LoRA (3-seed mean ± std)
Four text-classification domains learned in sequence (AG News → DBpedia → Emotion → SST-2), each with its own label space, so forgetting is meaningful.
| Metric | naive LoRA | Antaḥkaraṇa (this model) |
|---|---|---|
| Final accuracy ↑ | 0.849 ± .029 | 0.882 ± .003 |
| Forgetting ↓ | 0.053 ± .032 | 0.014 ± .009 (~3.8× less) |
| Confidence on known domains | 0.841 | 0.954 |
| Known − unknown confidence gap ↑ | 0.467 | 0.494 |
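The table reports final accuracy and forgetting without spelling out the formulas. The sketch below assumes the common continual-learning definitions: final accuracy is the mean accuracy over all domains after the last one is learned, and forgetting is the mean drop from each earlier domain's best accuracy to its final accuracy. The accuracy matrix holds made-up illustrative numbers, not results from this model.

```python
def final_accuracy(acc):
    """Mean accuracy over all tasks after training on the last task."""
    last = acc[-1]
    return sum(last) / len(last)


def forgetting(acc):
    """Mean drop from each earlier task's best accuracy to its final accuracy."""
    n_tasks = len(acc)
    drops = []
    for task in range(n_tasks - 1):  # the last task has had no chance to be forgotten
        best = max(acc[step][task] for step in range(task, n_tasks - 1))
        drops.append(best - acc[-1][task])
    return sum(drops) / len(drops)


# acc[i][j] = accuracy on task j after training on task i (illustrative values only).
acc = [
    [0.90, 0.00, 0.00],
    [0.88, 0.92, 0.00],
    [0.85, 0.90, 0.80],
]
print(round(final_accuracy(acc), 3), round(forgetting(acc), 3))
```

The "~3.8× less" figure in the table is the ratio of the two forgetting means, 0.053 / 0.014.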
Live deployment test (this merged model)
- General language preserved — correct world-knowledge answers (e.g. capital of Japan → Tokyo; a fluent one-sentence definition of photosynthesis).
- Continual retention: 8/8 correct across all four domains, including the first one learned — no catastrophic forgetting, demonstrated live.
- pramāṇa abstention — on a factually neutral input (no sentiment to extract), confidence drops to 0.53 and the model abstains rather than guessing; on clear inputs it stays 0.97–0.99 and answers.
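The abstention behavior in the last bullet can be pictured as a threshold on the model's top class probability. This is a minimal sketch under assumptions: the `threshold` of 0.7 and the toy logits are hypothetical, and any calibration step the real gate applies (the card calls it a calibrated-confidence gate) is left out.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def answer_or_abstain(logits, labels, threshold=0.7):
    """Return (label, confidence), or (None, confidence) when confidence is too low."""
    probs = softmax(logits)
    confidence = max(probs)
    if confidence < threshold:
        return None, confidence  # abstain instead of guessing
    return labels[probs.index(confidence)], confidence


labels = ["negative", "positive"]
print(answer_or_abstain([0.0, 4.0], labels))   # clear input: answers "positive"
print(answer_or_abstain([0.0, 0.12], labels))  # near-tie: abstains (label is None)
```

With these toy logits the clear input lands near 0.98 confidence and the near-tie near 0.53, mirroring the two regimes reported above.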
Why this is an innovation in today's AI
Most of modern AI is static: a model is trained once, frozen, and shipped. Teaching it something new means expensive retraining — and naive fine-tuning overwrites old knowledge (catastrophic forgetting). The field's strongest continual-learning methods buy stability only by trading away accuracy, or vice-versa.
Antaḥkaraṇa breaks that trade-off. Benchmarked against the standard methods (EWC, LwF, ER, DER++), it is the only method that lands in the "ideal corner" of high accuracy and very low forgetting, matching the SOTA's accuracy while forgetting ~3× less.
That combination is what makes a model genuinely lifelong: it can keep learning in deployment without expensive retraining and without losing what it already knew — while the pramāṇa gate lets it say "I don't know" instead of hallucinating. A static, occasionally-confident model becomes a living, honest one. That is the shift the architecture is reaching for.
Potential — and where it needs to adapt
What this architecture could unlock:
- Lifelong enterprise models — absorb new products, policies, and data continuously, without retraining the base or forgetting prior knowledge.
- Trustworthy / high-stakes AI — calibrated abstention (pramāṇa) for medical, legal, and financial settings where "I'm not sure" is safer than a confident guess.
- Label-efficient & self-learning — learns from unlabeled data (co-training), reaching near-supervised accuracy from as little as ~2% labels — cutting annotation cost dramatically.
- Personal / on-device AI — a tiny adapter (~160 MB) + external memory personalizes a frozen base to a user, privacy-preserving, with no full retraining.
- Agentic memory — the śruti (stable core) / smṛti (external memory) / nidrā (sleep-consolidation) design gives agents that accumulate experience over time.
Where it still needs to adapt (honest roadmap):
- Beyond classification — the LLM evaluation here is classification framed as generation; it needs extension to open-ended instruction-following and longer, more realistic domain streams.
- Sharper pramāṇa — the abstention gate works but is over-confident on adversarial nonsense; it needs stronger calibration (e.g. conformal / ensemble methods) at scale.
- Scale & breadth — validated on 4 domains and 7B; next is longer continual streams, established continual-LLM benchmarks, and larger models (13B–70B).
- Self-learning + memory at LLM scale — co-training and the smṛti/nidrā memory are proven in vision and small setups; integrating them into the LLM continual loop is the next build.
- Conditional compute — a guṇa-driven mixture-of-experts / early-exit layer (efficiency) is designed but not yet implemented.
Usage
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the merged full-weights checkpoint; no adapter or PEFT is needed.
tok = AutoTokenizer.from_pretrained("deepakdsoni/antahkarana-7B")
# `dtype` is the newer keyword; older transformers releases expect `torch_dtype`.
model = AutoModelForCausalLM.from_pretrained(
    "deepakdsoni/antahkarana-7B", dtype=torch.bfloat16, device_map="auto")

# Classification is framed as generation: the label is produced after "Answer:".
prompt = ("Classify the sentiment of this movie review (negative, positive).\n"
          "Text: a heartfelt, beautifully acted triumph.\nAnswer:")
out = model.generate(**tok(prompt, return_tensors="pt").to(model.device),
                     max_new_tokens=4, pad_token_id=tok.eos_token_id)
print(tok.decode(out[0], skip_special_tokens=True))
```
Unquantized bf16 inference needs a GPU with roughly 15 GB of memory; with 4-bit quantization (bitsandbytes) the model runs in roughly 5 GB.
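The memory figures can be sanity-checked with simple arithmetic on the weights alone. The sketch assumes a parameter count of about 7.24 billion for Mistral-7B-v0.1 (an approximation, not a number stated in this card) and ignores activations, the KV cache, and quantization metadata.

```python
def weight_memory_gb(n_params, bits_per_param):
    """Memory for the weights alone, in decimal gigabytes."""
    return n_params * bits_per_param / 8 / 1e9


N_PARAMS = 7.24e9  # assumed approximate parameter count for Mistral-7B-v0.1

print(f"bf16 : {weight_memory_gb(N_PARAMS, 16):.1f} GB")
print(f"4-bit: {weight_memory_gb(N_PARAMS, 4):.1f} GB")
```

The gap between these weight-only estimates and the quoted ~15 GB and ~5 GB is plausibly runtime overhead such as activations and the KV cache.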
Training details
| Setting | Value |
|---|---|
| Base model | mistralai/Mistral-7B-v0.1 (frozen) |
| Adapter | LoRA (r=16, α=32) on q/k/v/o_proj; ~13.6M trainable (0.19%) |
| Method | saṃskāra (Fisher Ω + decay) on LoRA · vijñāna-smṛti exemplar replay · pramāṇa confidence gate |
| Curriculum | 4 classification domains in sequence, per-task checkpointing (resumable) |
| Merge | LoRA folded into base via merge_and_unload → standalone full-weights 7B |
| Precision | bfloat16 |
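The ~13.6M trainable-parameter figure can be reproduced from the LoRA rank and the attention projection shapes. The shapes below are assumptions taken from the public Mistral-7B-v0.1 configuration (hidden size 4096, 32 layers, 8 key/value heads of dimension 128), and the 7.24B base count is an approximation.

```python
def lora_params(in_features, out_features, rank):
    """A LoRA pair adds A (rank x in) and B (out x rank) matrices."""
    return rank * (in_features + out_features)


HIDDEN = 4096  # model width (assumed from the base model config)
KV_DIM = 1024  # 8 key/value heads x 128 dims (grouped-query attention)
LAYERS = 32
RANK = 16      # r from the training details

per_layer = (
    lora_params(HIDDEN, HIDDEN, RANK)    # q_proj
    + lora_params(HIDDEN, KV_DIM, RANK)  # k_proj
    + lora_params(HIDDEN, KV_DIM, RANK)  # v_proj
    + lora_params(HIDDEN, HIDDEN, RANK)  # o_proj
)
total = per_layer * LAYERS
print(total)                    # 13631488
print(f"{total / 7.24e9:.2%}")  # share of an assumed ~7.24B base parameters
```

Under these assumptions the count comes to 13,631,488 parameters, about 0.19% of the base model, which agrees with the figures in the training details.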
To continue lifelong learning (adding new domains with saṃskāra protection), use the LoRA adapter and resume workflow rather than this merged checkpoint, because merging flattens the LoRA structure.
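What "flattening" means can be shown on a toy linear layer. Merging replaces the frozen weight `W` plus the low-rank pair `B`, `A` with one dense matrix `W + (α/r)·B·A`. The sketch uses made-up 2×3 matrices and a toy rank of 2, with α chosen so the scale α/r equals 2 as it does for r=16, α=32. It illustrates the standard LoRA merge rule, not the author's code.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


ALPHA, RANK = 4, 2  # toy values; alpha / r = 2
scale = ALPHA / RANK

W = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]]  # frozen base weight (2 x 3)
A = [[0.5, 0.0, 0.0], [0.0, 0.25, 0.0]]  # LoRA down-projection (r x 3)
B = [[1.0, 0.0], [0.0, 2.0]]             # LoRA up-projection (2 x r)
x = [1.0, 2.0, 3.0]

# Adapter form: base output plus the scaled low-rank update.
adapter_out = [w + scale * d for w, d in zip(matvec(W, x), matvec(B, matvec(A, x)))]

# Merged form: fold the update into a single dense matrix.
BA = matmul(B, A)
W_merged = [[w + scale * d for w, d in zip(wr, dr)] for wr, dr in zip(W, BA)]
merged_out = matvec(W_merged, x)

print(adapter_out, merged_out)
```

Both forms give the same output, but `A` and `B` cannot be recovered from `W_merged`. Any per-parameter importance tracked over the LoRA matrices then has nothing left to attach to, which is presumably why continued training should start from the adapter.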
Limitations & honest notes
- Continual evaluation is on classification framed as generation (clean, measurable), not open-ended instruction following — a natural next extension.
- The pramāṇa gate is not perfect: it abstains well on genuinely under-determined input but can still be over-confident on adversarial nonsense; the robust evidence is the calibration AUROC and the in-distribution-vs-unfamiliar confidence gap across many examples.
- The model inherits the capabilities, biases, and knowledge cutoff of Mistral-7B-v0.1.
License & attribution
Released under the Apache License 2.0. This is a derivative work of Mistral-7B-v0.1 (© Mistral AI,
Apache-2.0) — see the NOTICE file. The base 7B weights were used as a frozen foundation and were not trained
from scratch. The Antaḥkaraṇa architecture, continual training, and merging are the contribution of the author.
Built on the Upaniṣads, Sāṃkhya, Yoga, Nyāya, and modern ML (PyTorch · Transformers · PEFT).
Citation
```bibtex
@misc{antahkarana7b2026,
  title  = {Antahkarana-7B: Lifelong Learning with a Vedic-Derived Cognitive Architecture},
  author = {Deepak Soni},
  year   = {2026},
  note   = {Built on Mistral-7B-v0.1 (Apache-2.0)},
  url    = {https://huggingface.co/deepakdsoni/antahkarana-7B}
}
```