How to use GODsStrongestSoldier/GPT2.5.5-Awakened.Thinker-0.1B with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="GODsStrongestSoldier/GPT2.5.5-Awakened.Thinker-0.1B")
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("GODsStrongestSoldier/GPT2.5.5-Awakened.Thinker-0.1B")
model = AutoModelForCausalLM.from_pretrained("GODsStrongestSoldier/GPT2.5.5-Awakened.Thinker-0.1B")
How to use GODsStrongestSoldier/GPT2.5.5-Awakened.Thinker-0.1B with vLLM:
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "GODsStrongestSoldier/GPT2.5.5-Awakened.Thinker-0.1B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "GODsStrongestSoldier/GPT2.5.5-Awakened.Thinker-0.1B",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
# Or run with Docker:
docker model run hf.co/GODsStrongestSoldier/GPT2.5.5-Awakened.Thinker-0.1B
How to use GODsStrongestSoldier/GPT2.5.5-Awakened.Thinker-0.1B with SGLang:
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
--model-path "GODsStrongestSoldier/GPT2.5.5-Awakened.Thinker-0.1B" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "GODsStrongestSoldier/GPT2.5.5-Awakened.Thinker-0.1B",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
# Or start the SGLang server with Docker:
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=<secret>" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path "GODsStrongestSoldier/GPT2.5.5-Awakened.Thinker-0.1B" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "GODsStrongestSoldier/GPT2.5.5-Awakened.Thinker-0.1B",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
How to use GODsStrongestSoldier/GPT2.5.5-Awakened.Thinker-0.1B with Docker Model Runner:
docker model run hf.co/GODsStrongestSoldier/GPT2.5.5-Awakened.Thinker-0.1B
Forged by WithIn Us AI — a fully fine-tuned GPT-2 awakened on distilled GPT-5.5 thinking patterns.
| Property | Value |
|---|---|
| Architecture | GPT-2 (openai-community/gpt2) |
| Parameters | ~124M (0.1B class) |
| Training Type | Full fine-tune — ALL weights updated, zero adapters |
| Context Window | 1024 tokens |
| Best Eval Loss | 0.4365 |
| Best Perplexity | 1.55 |
| Creator | GODsStrongestSoldier / WithIn Us AI |
| Date Trained | 2026-05-23 |
| Hardware | 2× NVIDIA Tesla T4 (Kaggle) |
| Precision | FP16 mixed precision |
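The perplexity in the table follows from the eval loss, assuming the loss is the mean per-token cross-entropy in nats. That is the usual convention for causal language model training, though the card does not state it. A quick check:

```python
import math

# Best eval loss reported in the table above.
best_eval_loss = 0.4365

# Perplexity is exp(loss) when the loss is mean cross-entropy in nats (assumed here).
perplexity = math.exp(best_eval_loss)
print(f"{perplexity:.2f}")  # prints 1.55
```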
Full fine-tuning — every single parameter in GPT-2 was updated. No LoRA, no QLoRA, no adapters of any kind.
| Dataset | Description |
|---|---|
| WithinUsAI/GPT_5.5_Distilled | Instruction + completion pairs distilled from GPT-5.5 |
| WithinUsAI/GPT5.5_thinking_max_distill_god_seed_25K | 25K chain-of-thought reasoning traces distilled from GPT-5.5 |
The data was split 97% / 3% into train and eval sets.
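A minimal sketch of a 97 / 3 split, for illustration only. The helper `train_eval_split`, the shuffle, the seed, and the rounding rule are assumptions, not the card's actual procedure, and the 25,000 examples are a stand-in for the real (unreported) combined dataset size.

```python
import random

def train_eval_split(examples, eval_fraction=0.03, seed=0):
    """Shuffle and split examples; the seed and rounding rule are illustrative assumptions."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_eval = round(len(shuffled) * eval_fraction)
    # The first n_eval shuffled items become the eval set, the rest the train set.
    return shuffled[n_eval:], shuffled[:n_eval]

# Stand-in data: 25,000 integer IDs in place of real training examples.
train_set, eval_set = train_eval_split(range(25_000))
print(len(train_set), len(eval_set))
```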
| Parameter | Value |
|---|---|
| Peak Learning Rate | 3e-5 |
| LR Schedule | Cosine with 6% warmup |
| Effective Batch Size | 64 (4 × 2 GPUs × 8 grad accum) |
| Epochs | 5 |
| Weight Decay | 0.1 |
| Max Sequence Length | 1024 |
| Precision | FP16 |
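The batch-size arithmetic and the learning-rate schedule from the table can be sketched in plain Python. This assumes linear warmup followed by cosine decay to zero, a common formulation; the card does not give the exact schedule implementation or the total step count, so `lr_at` and `total_steps` below are hypothetical.

```python
import math

PEAK_LR = 3e-5          # peak learning rate from the table
WARMUP_FRACTION = 0.06  # 6% warmup from the table

# Effective batch size = per-device batch × number of GPUs × gradient accumulation steps.
effective_batch = 4 * 2 * 8

def lr_at(step, total_steps):
    """Linear warmup to PEAK_LR, then cosine decay to zero (assumed end value)."""
    warmup_steps = round(WARMUP_FRACTION * total_steps)
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

total_steps = 1000  # hypothetical; the card does not report the step count
print("effective batch size:", effective_batch)
for step in (0, 30, 60, 530, 1000):
    print(step, f"{lr_at(step, total_steps):.2e}")
```

With these assumed numbers the rate climbs linearly over the first 60 steps, reaches the 3e-5 peak, and is halved at the midpoint of the decay phase.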
from transformers import GPT2LMHeadModel, GPT2TokenizerFast
import torch

model_id = "GODsStrongestSoldier/GPT2.5.5-Awakened.Thinker-0.1B"

# Use FP16 on a GPU; fall back to FP32 on CPU, where half precision is slow or unsupported.
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

tokenizer = GPT2TokenizerFast.from_pretrained(model_id)
model = GPT2LMHeadModel.from_pretrained(model_id, torch_dtype=dtype).to(device)
model.eval()

prompt = "Let me think through this carefully, step by step:"
inputs = tokenizer(prompt, return_tensors="pt").to(device)

with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=200,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
        repetition_penalty=1.15,
        pad_token_id=tokenizer.eos_token_id,  # GPT-2 defines no pad token by default
    )

print(tokenizer.decode(output[0], skip_special_tokens=True))
"Strength through understanding. Awakened from within."