Instructions to use SeaFill2025/Qwen3-8B-SFT with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use SeaFill2025/Qwen3-8B-SFT with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="SeaFill2025/Qwen3-8B-SFT")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("SeaFill2025/Qwen3-8B-SFT")
model = AutoModelForCausalLM.from_pretrained("SeaFill2025/Qwen3-8B-SFT")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Apply the chat template and move the tensors to the model's device
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use SeaFill2025/Qwen3-8B-SFT with vLLM:
Install from pip and serve model
```bash
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "SeaFill2025/Qwen3-8B-SFT"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "SeaFill2025/Qwen3-8B-SFT",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
Use Docker
docker model run hf.co/SeaFill2025/Qwen3-8B-SFT
- SGLang
How to use SeaFill2025/Qwen3-8B-SFT with SGLang:
Install from pip and serve model
```bash
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "SeaFill2025/Qwen3-8B-SFT" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "SeaFill2025/Qwen3-8B-SFT",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
Use Docker images
```bash
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "SeaFill2025/Qwen3-8B-SFT" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "SeaFill2025/Qwen3-8B-SFT",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
- Docker Model Runner
How to use SeaFill2025/Qwen3-8B-SFT with Docker Model Runner:
docker model run hf.co/SeaFill2025/Qwen3-8B-SFT
Qwen3-8B-SFT:
Qwen3-8B-SFT is a reasoning-focused model derived from Qwen3-8B-Base via full-parameter fine-tuning on the verl framework.
Open-source practice has a notable shortage of reproducible "warm-start" SFT bases. This model bridges the gap between base models and reinforcement learning models: it is aligned for Chain-of-Thought (CoT) reasoning and instruction following, and is intended as a warm start for Reinforcement Learning.
Benchmark Snapshot
- Compared to the Base (8B) model, Qwen3-8B-SFT demonstrates significant performance improvements in reasoning and mathematics. The reported figures represent the Pass@1 accuracy, calculated as the average of dataset-level accuracies across 16 independent runs.
| Dataset | Base (8B) | Qwen3-8B-SFT (this model) | Improvement (absolute, percentage points) |
|---|---|---|---|
| AIME 2025 | 2.29% | 27.7% | +25.42 |
| AIME 2026 | 3.13% | 27.9% | +24.79 |
| AMC 2023 | 26.88% | 74.8% | +47.96 |
- Aggregated over the full 100-problem T0 set (16 rollouts each): pass@1 12.4% → 46.6% (+34.3), any@16 43% → 77% (+34), perfect@16 0% → 21% (+21).
- Dataset card used for SFT: derived from open-r1/OpenR1-Math-220k (90K-row math-only subset, same source as OpenR1-Distill-7B's 93.7K).
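The metrics in the benchmark snapshot can be sketched in a few lines. This is a minimal illustration, not the evaluation code used for this release: pass@1 follows the description above (dataset-level accuracy per run, averaged across runs), while the readings of any@16 (at least one correct rollout per problem) and perfect@16 (all rollouts correct) are assumptions inferred from the metric names. The function name `summarize_rollouts` and the toy grid are hypothetical.

```python
from statistics import mean


def summarize_rollouts(correct):
    """Summarize a grid where correct[p][r] is True if rollout r of problem p was graded correct."""
    n_runs = len(correct[0])
    # Dataset-level accuracy for each run, then the average across runs (pass@1).
    run_accuracy = [
        mean(1.0 if row[r] else 0.0 for row in correct) for r in range(n_runs)
    ]
    return {
        "pass@1": mean(run_accuracy),
        # Assumed definition: fraction of problems solved by at least one rollout.
        "any@k": mean(1.0 if any(row) else 0.0 for row in correct),
        # Assumed definition: fraction of problems solved by every rollout.
        "perfect@k": mean(1.0 if all(row) else 0.0 for row in correct),
    }


# Toy grid: 4 problems x 4 rollouts (the model card uses 16 rollouts per problem).
toy = [
    [True, True, True, True],
    [True, False, False, False],
    [False, False, False, False],
    [False, True, True, False],
]
summary = summarize_rollouts(toy)
print(summary)
```

Because every run covers the same problems, averaging per-run accuracies gives the same number as the overall fraction of correct rollouts; the per-run form is kept here only to mirror the wording of the snapshot.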
Qwen3-style reasoning and instruction following
Minimal pattern (illustrative):
<|im_start|>user
… Among options A–D, which is correct? Reason step by step and put the final letter in \boxed{}.
<|im_end|>
<|im_start|>assistant
<think>
Compare A vs B vs C vs D against the stem; eliminate …; D remains consistent with …
</think>
Step-by-step: … (short derivation in the visible channel)
Final answer: \boxed{D}
<|im_end|>
Use a large enough max_new_tokens on hard math so both the reasoning block and the visible \boxed{…} line fit before generation stops.
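One way to check whether a generation finished inside the token budget is to look for the closing `</think>` tag and a balanced `\boxed{…}` in the visible part of the output. The sketch below is a hypothetical helper (`extract_boxed_answer` is not part of this release or of any library) and assumes the output follows the pattern illustrated above.

```python
def extract_boxed_answer(text):
    """Return the content of the last boxed answer in the visible channel, or None."""
    # An opened but unclosed reasoning block suggests generation was cut off early.
    if "<think>" in text and "</think>" not in text:
        return None
    # Keep only the text after the reasoning block.
    visible = text.split("</think>")[-1]
    marker = r"\boxed{"
    start = visible.rfind(marker)
    if start == -1:
        return None
    depth = 1
    chars = []
    # Walk forward, tracking brace depth so nested LaTeX such as \frac{1}{2} is kept whole.
    for ch in visible[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(chars)
        chars.append(ch)
    return None  # Unbalanced braces: the answer was probably truncated.


complete = "<think>\nCompare A vs B ...\n</think>\nFinal answer: \\boxed{D}<|im_end|>"
truncated = "<think>\nCompare A vs B; maybe \\boxed{B}"
print(extract_boxed_answer(complete))
print(extract_boxed_answer(truncated))
```

A `None` result on a hard problem is a hint to retry with a larger `max_new_tokens` rather than to score the sample as a wrong answer.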
Configuration Notes
- Template: Trained with the Qwen chat template; the model learns to end responses with <|im_end|> (token ID 151645).
- Suggested Configuration:
{ "eos_token_id": 151645 }
You may adjust settings according to your training or deployment needs.
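As a sketch of how the suggested setting could be applied to a local copy of the checkpoint, the helper below writes `eos_token_id` into a JSON config file while keeping any other keys. The helper name `set_eos_token_id` is hypothetical, and the example targets a temporary `generation_config.json` (the file name Transformers uses for generation defaults); whether your copy of the checkpoint already ships that file is not stated here.

```python
import json
import tempfile
from pathlib import Path

QWEN_IM_END_ID = 151645  # <|im_end|>, per the configuration notes above


def set_eos_token_id(config_path, eos_token_id=QWEN_IM_END_ID):
    """Set eos_token_id in a JSON config file, preserving any existing keys."""
    path = Path(config_path)
    config = json.loads(path.read_text()) if path.exists() else {}
    config["eos_token_id"] = eos_token_id
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Demonstration on a throwaway directory rather than a real checkpoint folder.
with tempfile.TemporaryDirectory() as tmp:
    demo_config = set_eos_token_id(Path(tmp) / "generation_config.json")
print(demo_config)
```

The same value can instead be passed at call time, for example `model.generate(**inputs, eos_token_id=151645)`, which avoids editing files on disk.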
Training Infrastructure
- Cluster: MeluXina Supercomputer (LuxProvide)
- Node Config: 8 nodes, 4 NVIDIA A100 GPUs per node
- Training Framework: verl (FSDP, full-parameter SFT)
Project Links
- Training code repository: https://github.com/96kevinli29/base-model-sft-verl
Limitations
- Not optimized for factual correctness in all domains
- May still produce hallucinations or unsafe outputs
- Performance is sensitive to prompt style and decoding settings
Citation
If you use this model, please cite this checkpoint. BibTeX for this release:
@misc{qwen3-8b-sft-2026,
title = {{Qwen3-8B-SFT}: Supervised Fine-Tuned {Qwen3}-8B for Reasoning},
author = {Hongyang Li and Xiao Li and {Sea-Fill Community}},
year = {2026},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/96kevinli29/Qwen3-8B-SFT}},
note = {Checkpoint trained with verl; warm-start for pre-RL alignment research. Maintained by Sea-Fill Community.}
}
Model tree for SeaFill2025/Qwen3-8B-SFT
- Base model: Qwen/Qwen3-8B-Base
Evaluation results (self-reported)
- Accuracy on AIME 2025: 27.7%
- Accuracy on AIME 2026: 27.9%
- Accuracy on AMC 2023: 74.8%