SpaceLLM v1 — LoRA Adapter for Space Domain QA

SpaceLLM v1 is a parameter-efficient LoRA adapter fine-tuned on top of openai/gpt-oss-20b for space-domain question answering. LoRA is applied to the lm_head only; the full transformer backbone remains frozen, which keeps the adapter lightweight while steering the model's output distribution toward space-mission knowledge.


Model Details

Model Description

  • Developed by: AdityaPS
  • Model type: LoRA adapter (PEFT) over a causal language model
  • Base model: openai/gpt-oss-20b (21B params, BF16/MXFP4)
  • Language(s): English
  • License: Apache 2.0 (inherited from base model)
  • Fine-tuned from: openai/gpt-oss-20b
  • PEFT version: 0.19.1
  • Fine-tuning strategy: LoRA on lm_head only — backbone fully frozen (BF16, NOT QLoRA)

Model Sources


Uses

Direct Use

Load alongside openai/gpt-oss-20b for space-domain conversational question answering. The model expects inputs formatted using the harmony response format (gpt-oss-20b's required chat template) — passing raw text without the template will degrade output quality.

Downstream Use

Can be plugged into RAG pipelines, mission-planning assistants, or educational tools focused on space science, satellite operations, and related domains.

Out-of-Scope Use

  • General-purpose chat without space-domain context
  • Tasks requiring multi-modal input (images, structured data)
  • Deployment without the base model (openai/gpt-oss-20b must be loaded alongside the adapter)

How to Get Started with the Model

from transformers import AutoModelForCausalLM, AutoTokenizer, Mxfp4Config
from peft import PeftModel

# Load base model (requires ~44 GB VRAM in BF16, or use MXFP4 for lower memory)
base_model = AutoModelForCausalLM.from_pretrained(
    "openai/gpt-oss-20b",
    quantization_config=Mxfp4Config(dequantize=True),  # dequantizes to BF16
    device_map="auto",
    trust_remote_code=True,
)

# Load LoRA adapter on top
model = PeftModel.from_pretrained(base_model, "AdityaPS/SpaceLLM_v1")
tokenizer = AutoTokenizer.from_pretrained("AdityaPS/SpaceLLM_v1")

# Inference — must use harmony chat template
messages = [
    {"role": "system", "content": "You are a space domain expert assistant."},
    {"role": "user",   "content": "What is the purpose of a Sun-synchronous orbit?"},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))

Note: openai/gpt-oss-20b uses the harmony response format. Always use tokenizer.apply_chat_template() — do not pass raw text directly.


Training Details

Training Data

Fine-tuned on an internal space-domain QA dataset (DatasetA_core_QA_v2) consisting of multi-turn conversational records with system, user, and assistant turns. Records are tagged with metadata fields including organization, difficulty, aspect, and chain_id for multi-hop reasoning chains.

Split        Records
Train        ~4,800
Validation   not reported
Test         5,291

Training Procedure

Key Design Choices

  • LoRA applied to lm_head only — the full MoE transformer backbone is frozen.
  • Critical fix: lm_head.weight is physically untied from embed_tokens.weight via detach().clone() before get_peft_model() is called. Without this, autograd sees lm_head and embed_tokens as the same tensor, cutting gradients to lora_A.
  • Device-aware CE loss injected to handle MoE multi-GPU sharding where lm_head may land on a different device from the labels.
  • Model loaded in MXFP4 and dequantized to BF16 before LoRA application.
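
The untying step described above can be illustrated on a toy model. This is a minimal sketch and not the author's training code: TinyLM and untie_lm_head are hypothetical names, the layer sizes are arbitrary, and the real pipeline would run the equivalent step on gpt-oss-20b before calling get_peft_model().

```python
import torch
import torch.nn as nn


class TinyLM(nn.Module):
    """Hypothetical toy model whose lm_head shares storage with embed_tokens."""

    def __init__(self, vocab_size: int = 16, hidden_size: int = 8):
        super().__init__()
        self.embed_tokens = nn.Embedding(vocab_size, hidden_size)
        self.lm_head = nn.Linear(hidden_size, vocab_size, bias=False)
        # Tie the weights: both modules now point at the same Parameter.
        self.lm_head.weight = self.embed_tokens.weight


def untie_lm_head(model: nn.Module) -> nn.Module:
    """Give lm_head its own copy of the weight if it is tied to embed_tokens."""
    tied = model.lm_head.weight.data_ptr() == model.embed_tokens.weight.data_ptr()
    if tied:
        # detach().clone() allocates new storage with the same values,
        # so the two layers no longer share one tensor.
        model.lm_head.weight = nn.Parameter(
            model.embed_tokens.weight.detach().clone()
        )
    return model


toy = TinyLM()
was_tied = toy.lm_head.weight is toy.embed_tokens.weight
untie_lm_head(toy)
now_tied = toy.lm_head.weight.data_ptr() == toy.embed_tokens.weight.data_ptr()
print(f"tied before: {was_tied}, tied after: {now_tied}")
```

After untying, the two weights hold equal values but live in separate storage, so an adapter wrapped around lm_head no longer touches the embedding matrix.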

Training Hyperparameters

Hyperparameter             Value
Training regime            BF16 mixed precision
LoRA rank (r)              32
LoRA alpha                 128
LoRA dropout               0.1
Target modules             lm_head
Learning rate              2e-4
LR scheduler               cosine with restarts
Optimizer                  adamw_torch_fused
Batch size                 1
Gradient accumulation      32 (effective batch = 32)
Max grad norm              0.3
Weight decay               0.01
Warmup steps               200
Max sequence length        2,048
Epochs                     5
Early stopping patience    8 eval steps
Vocab size (padded)        200,064
Hardware                   Multi-GPU (cuda:1, cuda:2)
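
A few derived quantities follow from the values above. The sketch below computes the standard LoRA scaling factor (alpha / r), the effective batch size, and an approximate upper bound on optimizer steps. It assumes exactly 4,800 training records (the card says ~4,800), one training sequence per record, and warmup counted in optimizer steps; none of these assumptions is confirmed by the card, and early stopping may end training sooner.

```python
import math

# Values taken from the hyperparameter table.
LORA_RANK = 32
LORA_ALPHA = 128
PER_DEVICE_BATCH = 1
GRAD_ACCUM_STEPS = 32
EPOCHS = 5
WARMUP_STEPS = 200

# Assumption: "~4,800" training records is approximated as 4,800.
APPROX_TRAIN_RECORDS = 4_800


def lora_scaling(alpha: int, rank: int) -> float:
    """Standard LoRA scaling applied to the low-rank update: alpha / r."""
    return alpha / rank


def max_optimizer_steps(records: int, batch: int, epochs: int) -> int:
    """Upper bound on optimizer steps if every epoch runs to completion."""
    return math.ceil(records / batch) * epochs


scaling = lora_scaling(LORA_ALPHA, LORA_RANK)
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS
total_steps = max_optimizer_steps(APPROX_TRAIN_RECORDS, effective_batch, EPOCHS)
warmup_fraction = WARMUP_STEPS / total_steps

print(f"LoRA scaling (alpha / r): {scaling}")
print(f"Effective batch size:     {effective_batch}")
print(f"Max optimizer steps:      {total_steps}")
print(f"Warmup fraction:          {warmup_fraction:.2%}")
```

Under these assumptions, warmup covers roughly a quarter of the maximum step budget.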

Evaluation

Testing Data

Evaluation was run on the held-out test split of DatasetA_core_QA_v2 (5,291 records, covering diverse space organizations and difficulty levels).

Metrics

  • Loss — mean cross-entropy loss on the assistant response tokens
  • Exact Match (EM) — generated answer matches reference exactly (case-insensitive)
  • Token F1 — word-overlap F1 between generated and reference answers
  • BERTScore — semantic similarity using roberta-large
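
For reference, Exact Match and Token F1 as described above can be sketched in a few lines. The card does not specify its normalization or tokenization, so this version assumes lowercasing plus whitespace splitting only; exact_match and token_f1 are illustrative names, not the evaluation script used for this model.

```python
from collections import Counter


def exact_match(prediction: str, reference: str) -> bool:
    """Case-insensitive exact match after trimming surrounding whitespace."""
    return prediction.strip().lower() == reference.strip().lower()


def token_f1(prediction: str, reference: str) -> float:
    """Word-overlap F1 over lowercased, whitespace-separated tokens."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a perfect match; one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


em = exact_match("Low Earth Orbit", "low earth orbit")
f1 = token_f1("a polar sun-synchronous orbit", "a sun-synchronous orbit")
print(f"EM: {em}, Token F1: {f1:.4f}")
```

A stricter or looser normalization (for example, stripping punctuation or articles) would change both scores, so reported numbers are only comparable under the same convention.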

Results

BERTScore (roberta-large)

Metric       Score
Precision    0.8736
Recall       0.8857
F1           0.8795

The BERTScore F1 of 0.8795 indicates strong semantic alignment between the model's generated answers and the reference answers across the full test set.


Environmental Impact

Carbon emissions estimated using the Machine Learning Impact calculator (Lacoste et al., 2019).

  • Hardware type: NVIDIA multi-GPU (cuda:1, cuda:2)
  • Hours used: ~6.6 hours (396.58 min inference; training time not reported)
  • Cloud provider: Not applicable (on-premise)
  • Compute region: Not reported
  • Carbon emitted: Not measured

Technical Specifications

Model Architecture and Objective

  • Architecture: Mixture-of-Experts (MoE) causal language model (gpt-oss-20b) with a LoRA adapter injected at the lm_head projection layer
  • Active parameters during inference: 3.6B (out of 21B total)
  • LoRA parameters: r × (hidden_size + vocab_size), with r = 32 (two low-rank matrices, lora_A and lora_B, applied to a single linear layer)
  • Objective: Next-token prediction with cross-entropy loss, masked so that only assistant response tokens contribute to the loss
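
As a cross-check on the adapter size: standard LoRA on a single linear layer adds two matrices, lora_A of shape (r, in_features) and lora_B of shape (out_features, r), for a total of r × (in_features + out_features) trainable parameters. The sketch below applies this to lm_head. The hidden size of 2,880 is an assumption about the base model and should be read from its config; the vocabulary size is the padded value listed in this card.

```python
def lora_param_count(in_features: int, out_features: int, rank: int) -> int:
    """Trainable parameters added by LoRA to one linear layer."""
    lora_a = rank * in_features    # lora_A: (rank, in_features)
    lora_b = out_features * rank   # lora_B: (out_features, rank)
    return lora_a + lora_b


HIDDEN_SIZE = 2_880     # assumption: verify against the base model's config
VOCAB_SIZE = 200_064    # padded vocab size reported in this card
RANK = 32               # LoRA rank reported in this card

n_params = lora_param_count(HIDDEN_SIZE, VOCAB_SIZE, RANK)
print(f"LoRA trainable parameters: {n_params:,}")
```

Under these assumptions the adapter holds about 6.5M trainable parameters, dominated by lora_B because the vocabulary dimension is far larger than the hidden dimension.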

Compute Infrastructure

  • Training hardware: 2× NVIDIA GPUs (indices 1 and 2), dispatched via accelerate.dispatch_model
  • Framework: PyTorch + HuggingFace Transformers + PEFT 0.19.1 + Accelerate


Model Card Authors

AdityaPS

Model Card Contact

[Open an issue or discussion on the HuggingFace repository]

Framework versions

  • PEFT 0.19.1