Instructions to use MainStack/marvy-1-14B-lora with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use MainStack/marvy-1-14B-lora with PEFT:
Task type is invalid.
- MLX
How to use MainStack/marvy-1-14B-lora with MLX:
# Make sure mlx-lm is installed # pip install --upgrade mlx-lm # if on a CUDA device, also pip install mlx[cuda] # Generate text with mlx-lm from mlx_lm import load, generate model, tokenizer = load("MainStack/marvy-1-14B-lora") prompt = "Once upon a time in" text = generate(model, tokenizer, prompt=prompt, verbose=True) - Notebooks
- Google Colab
- Kaggle
- Local Apps
- LM Studio
- MLX LM
How to use MainStack/marvy-1-14B-lora with MLX LM:
Generate or start a chat session
# Install MLX LM uv tool install mlx-lm # Generate some text mlx_lm.generate --model "MainStack/marvy-1-14B-lora" --prompt "Once upon a time"
marvy-1-14B-lora
LoRA adapter for marvy-1-14B โ the first open model for the full ServiceNow delivery lifecycle. Compose on top of Qwen2.5-14B-Instruct.
This is the adapter-only release (~175 MB). Apply it on
Qwen/Qwen2.5-14B-Instruct
to specialize the base for end-to-end ServiceNow delivery work. For ready-to-run
weights see the merged model
MainStack/marvy-1-14B or the
quantized MainStack/marvy-1-14B-GGUF.
Released under Apache-2.0. Built with Qwen โ see
NOTICE.
๐ Full usage (all runtimes + OpenCode wiring): USAGE.md ยท
Validate it works: VALIDATION.md
What it does
Fine-tunes the base for business analysis, requirements, stakeholder mapping, systems inventory, Solution Design Documents, user stories with acceptance criteria, implementation planning, test-case generation, validation/critique, and end-to-end delivery chains (story โ implementation โ test).
Usage
MLX (Apple Silicon)
pip install mlx-lm
python -m mlx_lm generate \
--model Qwen/Qwen2.5-14B-Instruct \
--adapter-path . \
--system-prompt "You are a senior ServiceNow delivery consultant..." \
--prompt "Write a user story with acceptance criteria for P1 SLA escalation." \
--max-tokens 1024 --temp 0.4
PEFT (Transformers)
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
base = "Qwen/Qwen2.5-14B-Instruct"
tok = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(model, "MainStack/marvy-1-14B-lora")
Note: the adapter was trained with MLX-LM. The MLX
adapter_config.json/adapters.safetensorsare included. A PEFT-format conversion is provided for Transformers users where available; otherwise prefer the MLX path or the merged model.
Training summary
| Setting | Value |
|---|---|
| Method | LoRA SFT (rank 32, scale 20, dropout 0.0) |
| Target keys | q/k/v/o_proj, gate/up/down_proj (top 16 layers) |
| Max seq length | 8,192 |
| Effective batch | 16 (batch 1 ร grad-accum 16) |
| Best checkpoint | iter 150 (best validation loss) |
| Framework | MLX-LM 0.31.3 on Apple Silicon |
See the merged model card for full dataset, evaluation, and limitations.
License & attribution
Dual-licensed: weights Apache-2.0, MainStack contributions (cards, docs,
benchmark) CC-BY-4.0 โ see LICENSING.md. If you use
marvy-1-14B as a baseline, fine-tune it, distill from it, or evaluate against
it, please credit MainStack and link to
https://huggingface.co/MainStack/marvy-1-14B. Keep the NOTICE file intact
(required by Apache-2.0 ยง4) and cite the entry on the
merged model card.
- Downloads last month
- 31
Quantized