How to run chauben/advisorai-qwen2.5-14b-stevens with common libraries and local inference servers.

Libraries

How to use chauben/advisorai-qwen2.5-14b-stevens with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="chauben/advisorai-qwen2.5-14b-stevens")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Print the result so it is visible when run as a script
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("chauben/advisorai-qwen2.5-14b-stevens")
model = AutoModelForCausalLM.from_pretrained("chauben/advisorai-qwen2.5-14b-stevens")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use chauben/advisorai-qwen2.5-14b-stevens with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "chauben/advisorai-qwen2.5-14b-stevens"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "chauben/advisorai-qwen2.5-14b-stevens",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
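
The curl call above can also be made from Python. The following is a minimal sketch using only the standard library, assuming the vLLM server started above is listening on `localhost:8000`. `build_chat_request` and `chat` are hypothetical helper names, not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "chauben/advisorai-qwen2.5-14b-stevens"


def build_chat_request(user_message, model=MODEL_ID, max_tokens=256):
    """Build an OpenAI-style chat completion payload (illustrative helper)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": max_tokens,
    }


def chat(user_message, base_url="http://localhost:8000/v1"):
    """POST the payload to /chat/completions and return the assistant's reply text."""
    payload = json.dumps(build_chat_request(user_message)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Needs a running server, so the call is left commented out:
# print(chat("What is the capital of France?"))
```

Passing `base_url="http://localhost:30000/v1"` should work the same way against an SGLang server on its port 30000, since both expose an OpenAI-compatible endpoint.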

Use Docker

docker model run hf.co/chauben/advisorai-qwen2.5-14b-stevens

SGLang

How to use chauben/advisorai-qwen2.5-14b-stevens with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "chauben/advisorai-qwen2.5-14b-stevens" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "chauben/advisorai-qwen2.5-14b-stevens",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "chauben/advisorai-qwen2.5-14b-stevens" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "chauben/advisorai-qwen2.5-14b-stevens",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use chauben/advisorai-qwen2.5-14b-stevens with Docker Model Runner:
```
docker model run hf.co/chauben/advisorai-qwen2.5-14b-stevens
```

AdvisorAI — Qwen2.5-14B Stevens (Fine-Tuned)

Fine-tuned Qwen/Qwen2.5-14B-Instruct for AdvisorAI, an academic advising assistant for Stevens Institute of Technology. This checkpoint is the merged full model (DoRA adapter fused into the base weights).

Model Hub: chauben/advisorai-qwen2.5-14b-stevens

Model summary


| Property | Value |
| --- | --- |
| Base model | Qwen/Qwen2.5-14B-Instruct |
| Parameters | ~15B (BF16) |
| Fine-tuning | QDoRA (4-bit NF4 + DoRA r=64 + rsLoRA) + NEFTune (α=5) |
| Training | 2× NVIDIA RTX 3090, DDP, TRL `SFTTrainer` |
| Domain | Stevens academic advising (courses, faculty, programs, admissions, etc.) |
| Format | Safetensors, Qwen ChatML template |
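
The "Qwen ChatML template" entry refers to the `<|im_start|>` / `<|im_end|>` framing around each turn. The sketch below reproduces that layout in plain Python for illustration only. `render_chatml` is a hypothetical helper, and the tokenizer's `apply_chat_template` remains the authoritative renderer; the real template may behave differently, for example by inserting a default system prompt when none is supplied.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Render messages in ChatML-style framing (illustrative, not the tokenizer's template)."""
    parts = []
    for message in messages:
        parts.append(f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n")
    if add_generation_prompt:
        # An open assistant turn signals the model to start its reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = render_chatml([
    {"role": "system", "content": "You are AdvisorAI."},
    {"role": "user", "content": "Who are you?"},
])
print(prompt)
```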

What this model does

Answers student-style questions about Stevens in a helpful, markdown-formatted advising tone:

Courses and prerequisites
Programs and degree requirements
Faculty and teaching (when covered in training data)
Admissions, financial aid, campus life, and general advising

Training data

| Split | Examples |
| --- | --- |
| Train | 71,883 |
| Eval | 7,988 |
| Total | 79,871 |

Built from Stevens-related sources and LLM-assisted Q&A generation (Gemini + Qwen scoring), formatted as multi-turn chat JSONL. Approximate mix: ~95% single-turn, ~5% multi-turn; categories dominated by course and general (~65% combined).
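
For orientation, a chat-format JSONL record might look like the sketch below. A `messages` list of `role`/`content` entries is the conversational layout TRL's `SFTTrainer` accepts, but the actual schema of the AdvisorAI files is not given here, so the field names and the placeholder text are assumptions.

```python
import json

# Hypothetical single-turn record; the real dataset schema and text may differ.
record = {
    "messages": [
        {"role": "system", "content": "You are AdvisorAI, an academic advisor for Stevens Institute of Technology."},
        {"role": "user", "content": "<student question>"},
        {"role": "assistant", "content": "<markdown-formatted answer>"},
    ]
}

# JSONL stores one JSON object per line.
line = json.dumps(record, ensure_ascii=False)


def is_multi_turn(example):
    """Treat an example as multi-turn when it contains more than one user message."""
    user_turns = sum(1 for m in example["messages"] if m["role"] == "user")
    return user_turns > 1


parsed = json.loads(line)
print(is_multi_turn(parsed))
```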

Training details

| Parameter | Value |
| --- | --- |
| Epochs | 2 |
| Effective batch size | 32 |
| Learning rate | 8e-5 |
| Max seq length | 2048 |
| LoRA rank | 64 |
| LoRA alpha | 128 |
| DoRA / rsLoRA | enabled |
| NEFTune α | 5 |
| Optimizer | paged_adamw_8bit |
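
A few numbers follow directly from the values above. The sketch below works out the adapter scaling factor (rsLoRA scales the low-rank update by alpha / sqrt(r) rather than LoRA's alpha / r) and a rough optimizer-step count. The step count assumes one optimizer step per effective batch with the final partial batch kept; the actual run may have differed.

```python
import math

rank, alpha = 64, 128
train_examples, effective_batch, epochs = 71_883, 32, 2

# Standard LoRA scales the low-rank update by alpha / r.
lora_scale = alpha / rank
# rsLoRA (rank-stabilized LoRA) uses alpha / sqrt(r) instead.
rslora_scale = alpha / math.sqrt(rank)

# Assumption: one optimizer step per effective batch, last partial batch kept.
steps_per_epoch = math.ceil(train_examples / effective_batch)
total_steps = steps_per_epoch * epochs

print(lora_scale, rslora_scale, steps_per_epoch, total_steps)
```

With rank 64 the rsLoRA factor is eight times the standard LoRA factor, which is one reason the two settings are not interchangeable at a fixed alpha.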

Run name: advisorai-qwen25-14b-qdora-neftune-v1

Post-training: DoRA adapter merged into base → uploaded as this Hub checkpoint.
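
Merging an adapter of this kind is commonly done with PEFT. The following is a hedged sketch, not the script used for this checkpoint: the adapter and output directories are hypothetical, and it assumes `torch`, `peft`, and `transformers` are installed and that the base model is reloaded in BF16 (rather than 4-bit) before merging.

```python
def merge_adapter(adapter_dir, output_dir, base_id="Qwen/Qwen2.5-14B-Instruct"):
    """Fuse a trained adapter into the base weights and save a standalone checkpoint.

    adapter_dir and output_dir are hypothetical local paths.
    """
    # Imported lazily so the heavy dependencies are only needed when merging.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Reload the base model in BF16 so the merged weights are not quantized.
    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
    model = PeftModel.from_pretrained(base, adapter_dir)

    # merge_and_unload() folds the adapter into the base layers and removes the PEFT wrappers.
    merged = model.merge_and_unload()
    merged.save_pretrained(output_dir, safe_serialization=True)
    AutoTokenizer.from_pretrained(base_id).save_pretrained(output_dir)
    return output_dir
```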

Usage

Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

MODEL_ID = "chauben/advisorai-qwen2.5-14b-stevens"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

messages = [
    {
        "role": "system",
        "content": (
            "You are AdvisorAI, a knowledgeable academic advisor for "
            "Stevens Institute of Technology. Be specific — cite course codes "
            "and requirements when available. Use markdown."
        ),
    },
    {"role": "user", "content": "What are the requirements for the CS MS program at Stevens?"},
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.3,
        top_p=0.9,
        do_sample=True,
        repetition_penalty=1.05,
        pad_token_id=tokenizer.eos_token_id,
    )

print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
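
Before loading, it can help to estimate whether the weights fit in memory. This is a back-of-the-envelope sketch, assuming the ~15B parameter count from the model summary and 2 bytes per BF16 parameter. It counts weights only, not the KV cache or activations, and `weight_memory_gib` is a hypothetical helper.

```python
def weight_memory_gib(num_params, bytes_per_param=2):
    """Approximate memory for model weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3


# BF16 stores each parameter in 2 bytes.
bf16_gib = weight_memory_gib(15e9)
print(f"BF16 weights: ~{bf16_gib:.1f} GiB")
```

If a single GPU has less memory than this, `device_map="auto"` lets Accelerate spread the layers across the available devices.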


Citation:
@misc{advisorai-qwen25-14b-stevens-2026,
  title        = {AdvisorAI: Fine-Tuned Qwen2.5-14B for Stevens Institute Academic Advising},
  author       = {Nitin Chaube},
  year         = {2026},
  howpublished = {\url{https://huggingface.co/chauben/advisorai-qwen2.5-14b-stevens}},
  note         = {Fine-tuned from Qwen/Qwen2.5-14B-Instruct; QDoRA + NEFTune}
}
