NRS NEURAL REASONING SYSTEM
SKT AI LABS
SKT NRS BOOSTED
1 Million Context • 100x High Reasoning Capacity
Base: Qwen 3.5 9B + Full NRS Neural Reasoning System
Ultra Fast • Deep Reasoning • Exceptional Coherence

SKT-NRS/NRS_QWEN_MYTHOS_1M
Qwen 3.5 9B + Full Neural Reasoning System
A custom fine-tune of Qwen 3.5 9B with the full NRS treatment. It aims for exceptional reasoning depth while keeping inference fast and supporting a 1 million token context window.
✨ Key Capabilities
- 100x High Reasoning Capacity: Dramatically improved logical thinking via NRS Boosting.
- 10x Thinking Enhancement: Step-by-step reasoning in <think> tags, refined by SKT supervised fine-tuning (SFT).
- 1 Million Token Context: Handle huge codebases and documents using YaRN scaling.
- Lightning Fast Inference: Optimized for consumer hardware (RTX 3090/4090).
- Native Tool Calling: Ready for Python execution and web search.
100x HIGH REASONING CAPACITY
Experience next-level reasoning in an efficient 9B package
Open Source • High Performance • Community Driven
REQUEST
The SKT NRS team is working to complete all requests to boost models.
🛠️ How to Use & Run
This model is optimized for both local execution and cloud notebooks (Colab).
1. 🐍 Python (Transformers)
Install the required libraries:
pip install transformers torch accelerate
Basic Inference Code:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "SKT-NRS/NRS_QWEN_MYTHOS_1M"

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Prepare the prompt using the model's chat template
messages = [
    {"role": "system", "content": "You are NRS, an advanced reasoning assistant."},
    {"role": "user", "content": "Explain quantum entanglement simply."},
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate
outputs = model.generate(
    **inputs,
    max_new_tokens=4096,
    temperature=0.6,
    top_p=0.95,
    do_sample=True,
)

# Decode only the newly generated tokens, skipping the prompt
prompt_len = inputs["input_ids"].shape[-1]
response = tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True)
print(response)
2. ☁️ Google Colab Ready
Run this on a free T4 GPU or paid A100/V100 instances.
!pip install transformers accelerate bitsandbytes

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "SKT-NRS/NRS_QWEN_MYTHOS_1M"

# Load in 4-bit for memory efficiency
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,  # compute dtype matches torch_dtype below
)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
    torch_dtype=torch.float16,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
print("Model loaded successfully! Ready for reasoning.")
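As a rough sanity check on why 4-bit loading matters on Colab, the sketch below estimates memory for the weights alone. It assumes a nominal 9 billion parameters (read off the "9B" in the model name, not an exact count) and ignores activations, the KV cache, and quantization overhead. The function name is illustrative.

```python
def weight_memory_gib(num_params: float, bits_per_param: float) -> float:
    """Approximate memory needed for model weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 2**30


NOMINAL_PARAMS = 9e9  # assumption: nominal size implied by "9B"

fp16_gib = weight_memory_gib(NOMINAL_PARAMS, 16)
int4_gib = weight_memory_gib(NOMINAL_PARAMS, 4)
print(f"16-bit weights: ~{fp16_gib:.1f} GiB")
print(f"4-bit weights:  ~{int4_gib:.1f} GiB")
```

Under these assumptions the 16-bit weights alone (about 16.8 GiB) do not fit in the 16 GB of a T4, while the 4-bit weights (about 4.2 GiB) leave room for the KV cache.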
3. 🖥️ Local Running (Ollama / LM Studio)
For the best local experience, use GGUF quantizations.
Using Ollama:
ollama create nrs-mythos -f Modelfile
ollama run nrs-mythos
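The `ollama create` command expects a Modelfile in the current directory. A minimal sketch of generating one from Python follows. The GGUF filename is a placeholder (use the name of the file you actually downloaded), the sampling values are taken from the Sampling Parameters section of this card, and the `num_ctx` default of 8192 is an assumption borrowed from the LM Studio suggestion.

```python
from pathlib import Path

# Placeholder: replace with the actual GGUF file downloaded from the "Files" section.
GGUF_FILENAME = "model.gguf"


def build_modelfile(gguf_path: str, num_ctx: int = 8192) -> str:
    """Return Modelfile text pointing Ollama at a local GGUF file."""
    lines = [
        f"FROM ./{gguf_path}",
        "PARAMETER temperature 0.6",
        "PARAMETER top_p 0.95",
        "PARAMETER top_k 20",
        "PARAMETER repeat_penalty 1.05",
        f"PARAMETER num_ctx {num_ctx}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Write the Modelfile next to the GGUF file, then run `ollama create`.
    Path("Modelfile").write_text(build_modelfile(GGUF_FILENAME))
```

Raising `num_ctx` increases memory use, so pick a value your hardware can hold.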
Using LM Studio:
- Download the .gguf file from the "Files" section.
- Drag and drop it into LM Studio.
- Set Context Length to 8192 or higher.
4. ⚡ High-Performance Serving (vLLM)
For production-grade speed and 1M context support:
pip install vllm
vllm serve SKT-NRS/NRS_QWEN_MYTHOS_1M \
    --max-model-len 1000000 \
    --gpu-memory-utilization 0.9 \
    --dtype bfloat16
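Once the server is running, it exposes an OpenAI-compatible API. The sketch below builds a chat-completions request using only the standard library. It assumes the server listens on vLLM's default `http://localhost:8000`; the helper names `build_payload` and `chat` are illustrative.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: vLLM's default host and port
MODEL_ID = "SKT-NRS/NRS_QWEN_MYTHOS_1M"


def build_payload(prompt: str, max_tokens: int = 512) -> dict:
    """Build an OpenAI-style chat-completions request body."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.6,
        "top_p": 0.95,
        "max_tokens": max_tokens,
    }


def chat(prompt: str) -> str:
    """Send the request to the running server and return the reply text."""
    request = urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the server up, `print(chat("What is the capital of France?"))` sends one request and prints the reply.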
🧠 Technical Details & Training
Base Model
- Architecture: Qwen 3.5 9B
- Context Window: 1,048,576 tokens (via YaRN RoPE Scaling)
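YaRN is parameterized by a scaling factor: the ratio of the target context length to the length the base model was trained on. The sketch below shows that arithmetic. The native length is a placeholder, since this card does not state the base model's native context length, and the dictionary layout follows the `rope_scaling` convention used by earlier Qwen configs in Transformers, which may not match this model's `config.json`.

```python
TARGET_CONTEXT = 1_048_576  # from this card
NATIVE_CONTEXT = 262_144    # placeholder assumption, not stated in this card


def yarn_rope_scaling(target_len: int, native_len: int) -> dict:
    """Build a YaRN rope_scaling entry for a given context extension."""
    if target_len < native_len:
        raise ValueError("target length must be at least the native length")
    return {
        "rope_type": "yarn",
        "factor": target_len / native_len,
        "original_max_position_embeddings": native_len,
    }


print(yarn_rope_scaling(TARGET_CONTEXT, NATIVE_CONTEXT))
```

Check the published `config.json` for the values actually used before relying on any of these numbers.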
NRS Enhancement Process
The model underwent a rigorous Neural Reasoning System (NRS) enhancement pipeline:
- Reasoning Boosting Tool: Proprietary NRS tools generated high-quality Chain-of-Thought (CoT) data.
- Supervised Fine-Tuning (SFT): Tuned on ~500k high-quality reasoning samples (coding, math, logic).
- Tool Calling Optimization: Enhanced native function calling for Python & Web Search.
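Tool calling with chat models in Transformers usually means passing JSON-schema tool descriptions to the chat template. The sketch below defines one hypothetical tool, `run_python`. The tool name, its parameter, and the assumption that this model's chat template accepts a `tools` argument are illustrative and not confirmed by this card.

```python
import json

# Hypothetical tool description in the OpenAI-style function schema.
RUN_PYTHON_TOOL = {
    "type": "function",
    "function": {
        "name": "run_python",
        "description": "Execute a Python snippet and return its stdout.",
        "parameters": {
            "type": "object",
            "properties": {
                "code": {"type": "string", "description": "Python source to run."}
            },
            "required": ["code"],
        },
    },
}

# Usage (requires a loaded tokenizer and a `messages` list):
# text = tokenizer.apply_chat_template(
#     messages, tools=[RUN_PYTHON_TOOL], add_generation_prompt=True, tokenize=False
# )
print(json.dumps(RUN_PYTHON_TOOL, indent=2))
```

Executing whatever call the model emits is left to the application, and any code it produces should be run in a sandbox.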
Sampling Parameters
- Temperature: 0.6
- Top_P: 0.95
- Top_K: 20
- Repetition Penalty: 1.05
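These values map directly onto keyword arguments of `generate` in Transformers. A minimal sketch, assuming `model` and `inputs` were created as in the Transformers example earlier in this card; the name `SAMPLING_KWARGS` is illustrative.

```python
# Recommended sampling settings from this card, as generate() keyword arguments.
SAMPLING_KWARGS = {
    "do_sample": True,  # needed for temperature, top_p, and top_k to take effect
    "temperature": 0.6,
    "top_p": 0.95,
    "top_k": 20,
    "repetition_penalty": 1.05,
}

# Usage (requires a loaded model and tokenized inputs):
# outputs = model.generate(**inputs, max_new_tokens=4096, **SAMPLING_KWARGS)
```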
⚠️ Limitations & Disclaimer
- Reasoning Mode: The model outputs <think> blocks. Parse them if needed.
- Uncensored Nature: Designed for open research. Use responsibly.
- Hallucinations: Always verify critical facts with external sources.
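For the reasoning-mode point, one way to separate the reasoning from the final answer is a small regular expression. This is a sketch that assumes each block is closed by a matching `</think>` tag; the function name is illustrative. If the chat template opens the block inside the prompt, the response may contain only the closing tag, and this helper will not catch that case.

```python
import re

# Non-greedy match so multiple <think> blocks are captured separately.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer) from a response containing <think> blocks."""
    reasoning = "\n".join(part.strip() for part in THINK_BLOCK.findall(text))
    answer = THINK_BLOCK.sub("", text).strip()
    return reasoning, answer


sample = "<think>2 + 2 is 4.</think>The answer is 4."
reasoning, answer = split_reasoning(sample)
print(reasoning)  # 2 + 2 is 4.
print(answer)     # The answer is 4.
```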
LICENSE AND TERMS
Made with ❤️ by SKT AI Labs
Pushing the boundaries of Open Source Reasoning.