BRAHMASTRA v0.3 — LoRA Adapter Only (1.1 GB)

LoRA adapter weights for Krishnapadala55/brahmastra-0.3. Apply on top of the base model unsloth/DeepSeek-R1-Distill-Qwen-32B for the same behavior as the merged model — but with a 60× smaller download.

Why download just the adapter?

1.1 GB vs 65 GB merged — much faster download
Compose with other adapters (mix & match)
Continue fine-tuning from this checkpoint
Run on top of any quantized version of the base

Quick Start (PEFT)

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_name = "unsloth/DeepSeek-R1-Distill-Qwen-32B"
adapter   = "Krishnapadala55/brahmastra-0.3-lora"

base = AutoModelForCausalLM.from_pretrained(base_name, torch_dtype=torch.float16, device_map="auto")
model = PeftModel.from_pretrained(base, adapter)
tokenizer = AutoTokenizer.from_pretrained(adapter)

# Optional: merge for faster inference (uses more VRAM)
model = model.merge_and_unload()

Quick Start with Unsloth (matches training recipe)

from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name      = "Krishnapadala55/brahmastra-0.3-lora",
    max_seq_length  = 4096,
    load_in_4bit    = True,
)
FastLanguageModel.for_inference(model)

Configuration

Field	Value
Adapter type	LoRA
Rank (r)	32
Alpha	64
Target modules	q/k/v/o/gate/up/down projections
Trainable params	268,435,456 (0.81% of base)
Bias	none
Dropout	0.0
Task type	CAUSAL_LM

Files

File	Size	Description
`adapter_model.safetensors`	1.1 GB	LoRA weights
`adapter_config.json`	1.2 KB	PEFT config
`tokenizer.json`	11 MB	Tokenizer
`tokenizer_config.json`	469 B	Tokenizer config
`chat_template.jinja`	2.5 KB	Chat template (Qwen-2.5)

Benchmarks

See parent repo for full benchmark suite vs v0.2: Krishnapadala55/brahmastra-0.3

License

Apache 2.0 — same as base.

Downloads last month: 17

Model tree for Krishnapadala55/brahmastra-0.3-lora

Base model

deepseek-ai/DeepSeek-R1-Distill-Qwen-32B

Finetuned

unsloth/DeepSeek-R1-Distill-Qwen-32B

Adapter

(2)

this model