glyph-sft-v1
SFT of Qwen/Qwen3-4B-Base on a custom TASK trace format (plans, todos, tool calls, satisfaction markers ⊨, response blocks). LoRA on attention + MLP, with lm_head in modules_to_save and in its own optimizer group so that termination tokens are actually learned.
Training data: JayZenith/glyph-sft-v1-data (private, 1098 traces, 80/10/10 split).
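As a rough check on the split sizes, the sketch below applies the stated 80/10/10 proportions to 1098 traces and relates the result to the 3 epochs and 330 total steps listed under Training. The rounding rule (round the two 10% parts, give the remainder to train) and the implied effective batch size are assumptions inferred from the reported numbers, not settings confirmed by the author.

```python
# Split sizes implied by 1098 traces and an 80/10/10 split.
# Assumption: the two 10% parts are rounded and train takes the remainder.
TOTAL_TRACES = 1098

def split_sizes(total, val_frac=0.10, test_frac=0.10):
    val = round(total * val_frac)
    test = round(total * test_frac)
    train = total - val - test
    return train, val, test

train_n, val_n, test_n = split_sizes(TOTAL_TRACES)
print(train_n, val_n, test_n)

# Steps per epoch from the reported schedule (3 epochs, 330 total steps).
EPOCHS, TOTAL_STEPS = 3, 330
steps_per_epoch = TOTAL_STEPS // EPOCHS

# Implied traces per optimizer step; a value close to 8 would be consistent
# with an effective batch size of 8 (an inference, not a stated setting).
implied_batch = train_n / steps_per_epoch
print(steps_per_epoch, round(implied_batch, 2))
```

Under these assumptions the test split comes out at 110 traces, which agrees with the held-out count reported for the test loss.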
Held-out test loss (110 traces, never seen in training)
| | base | sft | delta |
|---|---|---|---|
| mean loss | 1.280 | 0.972 | −0.308 |
| perplexity | 3.60 | 2.64 | 27% lower |
| SFT wins per ex. | | 110/110 | |
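The perplexity row follows from the mean loss if the loss is a natural-log, per-token cross-entropy, which is an assumption here but one that reproduces the reported 3.60 and 2.64. The sketch below recomputes both values and the relative change. Note the two ways of expressing the gap: base perplexity is about 1.36 times the SFT perplexity, while SFT perplexity is about 26.5% lower than base.

```python
import math

# Mean held-out losses taken from the table above.
base_loss, sft_loss = 1.280, 0.972

# Perplexity is exp(mean cross-entropy) when the loss uses natural logs.
base_ppl = math.exp(base_loss)
sft_ppl = math.exp(sft_loss)

# Two views of the same gap, both measured on perplexity.
reduction = 1 - sft_ppl / base_ppl   # fraction by which SFT is lower than base
ratio = base_ppl / sft_ppl           # how many times higher base is than SFT

print(f"base ppl {base_ppl:.2f}, sft ppl {sft_ppl:.2f}")
print(f"SFT is {reduction:.1%} lower; base/sft ratio is {ratio:.2f}")
```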
Format quality (5-prompt generation eval)
| | base | sft |
|---|---|---|
| valid trace | 0/5 | 4/5 |
| ends with response | 0% | 100% |
| has plan | 0% | 100% |
| no repetition | 60% | 100% |
| not truncated | 20% | 100% |
| used tools when given | 0/4 | 4/4 |
| avg score (out of 7) | 0.2 | 6.4 |
The one trace that failed validation came from a no-tool reasoning prompt: the model wrote a 5-step plan but did not emit a ⊨ N satisfaction marker for every step. This is fixable and a clean target for RL.
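To make this failure mode concrete, here is a minimal sketch of a check for missing satisfaction markers. The marker syntax (`⊨` followed by a step number), the function names, and the example trace are all hypothetical; the actual TASK trace grammar and the validator used for this model are not documented here.

```python
import re

# Hypothetical marker syntax: "⊨ 3" means plan step 3 is satisfied.
MARKER = re.compile(r"⊨\s*(\d+)")

def missing_markers(trace: str, num_steps: int) -> list[int]:
    """Return the plan step numbers that have no ⊨ marker in the trace."""
    seen = {int(n) for n in MARKER.findall(trace)}
    return [step for step in range(1, num_steps + 1) if step not in seen]

def marker_reward(trace: str, num_steps: int) -> float:
    """Fraction of plan steps that have a marker (1.0 means all present)."""
    if num_steps <= 0:
        return 0.0
    return 1 - len(missing_markers(trace, num_steps)) / num_steps

# Made-up trace with a 5-step plan where steps 3 and 5 lack a marker.
example = "step one done ⊨ 1\nstep two done ⊨ 2\nstep four done ⊨ 4"
print(missing_markers(example, 5))
print(marker_reward(example, 5))
```

For the made-up trace, `missing_markers` returns `[3, 5]`. A partial-credit score like `marker_reward` is one possible shape for an RL signal; whether the real reward is graded or all-or-nothing is not stated.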
Training
- 1× A100 80GB SXM4, ~1h32m
- LoRA rank 64, alpha 64, dropout 0.05
- targets: `q,k,v,o,gate,up,down`, `modules_to_save=["lm_head"]`
- LR = 2e-5 (both trunk and lm_head; separate optimizer groups, equal rate)
- assistant-only loss masking
- 3 epochs, 330 total steps, eval every 50 steps incl. greedy gen-eval
- trainable: 521M / 4.54B params (11.5%)
- final eval_loss: 0.958
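The trainable-parameter figure can be sanity-checked by hand. The sketch below counts rank-64 LoRA weights on the seven listed projections plus a fully trainable `lm_head`. The model dimensions are assumptions recalled from the public Qwen3-4B configuration rather than values stated in this card, so treat the result as a plausibility check only.

```python
# Assumed Qwen3-4B dimensions (not stated in this card).
HIDDEN = 2560
LAYERS = 36
Q_OUT = 32 * 128      # attention heads * head_dim
KV_OUT = 8 * 128      # key/value heads * head_dim
MLP = 9728            # MLP intermediate size
VOCAB = 151_936
RANK = 64             # LoRA rank from the list above

# (in_features, out_features) for each LoRA target in one decoder layer.
targets = {
    "q": (HIDDEN, Q_OUT), "k": (HIDDEN, KV_OUT), "v": (HIDDEN, KV_OUT),
    "o": (Q_OUT, HIDDEN),
    "gate": (HIDDEN, MLP), "up": (HIDDEN, MLP), "down": (MLP, HIDDEN),
}

# A rank-r adapter on a d_in x d_out linear layer adds r * (d_in + d_out) weights.
lora_per_layer = sum(RANK * (d_in + d_out) for d_in, d_out in targets.values())
lora_total = lora_per_layer * LAYERS

# modules_to_save keeps a fully trainable copy of lm_head.
lm_head = VOCAB * HIDDEN

trainable = lora_total + lm_head
print(f"LoRA {lora_total / 1e6:.1f}M + lm_head {lm_head / 1e6:.1f}M "
      f"= {trainable / 1e6:.1f}M trainable")
```

Under these assumptions the total lands at about 521M, in line with the reported figure, and `lm_head` accounts for roughly three quarters of the trainable weights.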
Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the tokenizer and the model weights from the Hub.
tok = AutoTokenizer.from_pretrained("JayZenith/glyph-sft-v1")
model = AutoModelForCausalLM.from_pretrained(
    "JayZenith/glyph-sft-v1", torch_dtype=torch.bfloat16, device_map="auto"
)
```
Status
This is the SFT starting point for an RL run (validator-shaped reward, prime-rl). It is not a final chat model.