The GOD Queen of All AI (GOD.Queen.IV)
The Pinnacle of Recursive Language Modeling and Hybrid Mind Architecture
1.147 Billion Parameters | 1,000,000-Token Context | Dual T4 Optimized | SafeTensors Native
Welcome to the cutting edge of cognitive architecture. GOD.Queen.IV is not just a language model; it is a Recursive Language Model (RLM). Rather than chaining separate sequential pipelines, the GOD Queen fuses 12 self-automated cognitive modules directly into every forward pass, so meta-learning, problem-solving, and multimodal processing run together in real time.
The "Hybrid Mind" Architecture
Unlike standard transformers that process text linearly, GOD.Queen.IV executes a symphony of concurrent cognitive processes. Every forward pass triggers the following Self-Automated (SA) modules:
| Cognitive Module | Mechanism & Function |
|---|---|
| SA Meta-Learning | MAML fast-weight modulation prior to each attention block. |
| SA Reinforcement Learning | Integrated policy and value heads operating on the final hidden state. |
| SA Continual Learning | EWC importance-weight buffers per layer to prevent catastrophic forgetting. |
| SA Adaptive Learning | Per-layer scalar gating mechanisms on the residual stream. |
| SA Rewriting | Latent rewrite-token projection applied at the final decoder layer. |
| SA NLP Mastery | Dedicated NER, POS, and DEP probe heads for profound linguistic understanding. |
| SA Problem Solving | Chain-of-thought value scorer to evaluate and guide logical reasoning paths. |
| SA Innovation | Diversity and surprise scalar heads to optimize for creative and novel outputs. |
| SA Debugging | Anomaly detection scalar head for self-correction and hallucination reduction. |
| SA Long/Short Memory | Differentiable KV-memory bank (4096 slots integrated every 4 layers). |
| SA Recursive Seed | Token-level self-distillation occurring at every single layer. |
| Multimodal Processing | Linear projectors for Image (1024d), Audio (512d), and Video (1024d) inputs. |
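To make one row of the table concrete, here is a minimal sketch of the quadratic penalty that Elastic Weight Consolidation (EWC) is usually built on. The model's actual per-layer buffers are not documented here, so the function name `ewc_penalty`, its arguments, and the diagonal importance weights are illustrative assumptions, not the model's API.

```python
def ewc_penalty(params, anchor_params, importance, strength=1.0):
    """Standard EWC penalty: 0.5 * strength * sum_i F_i * (theta_i - theta_star_i)^2.

    params:        current parameter values (flat list of floats)
    anchor_params: values saved after the previous task (hypothetical buffer)
    importance:    per-parameter importance weights (hypothetical buffer)
    """
    if not (len(params) == len(anchor_params) == len(importance)):
        raise ValueError("all three sequences must have the same length")
    weighted_sq = sum(
        f * (p - a) ** 2 for p, a, f in zip(params, anchor_params, importance)
    )
    return 0.5 * strength * weighted_sq


# Toy values: only the first parameter has drifted from its anchor.
current = [1.0, 2.0]
anchor = [0.0, 2.0]
fisher = [4.0, 10.0]
penalty = ewc_penalty(current, anchor, fisher)
print(penalty)
```

Parameters with a large importance weight are pulled back toward their anchor more strongly, which is how this kind of penalty discourages forgetting of earlier tasks.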
Core Technical Specifications
The GOD Queen is optimized to run on dual T4 GPUs while supporting a context window of up to 1,000,000 tokens.
- Layer Count: 32 layers
- Hidden Dimension: 2048
- Attention: Grouped-Query Attention (GQA), 16 query heads / 8 KV heads
- Activation: SwiGLU (8192 intermediate dimension)
- Positional Encodings: YaRN RoPE (Optimized for 1M context windows)
- Vocabulary Size: 65,536 tokens
- Precision: bfloat16 native
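The specifications above are enough for a rough estimate of attention KV-cache size, which is where Grouped-Query Attention saves memory. The sketch below assumes a head dimension of hidden size divided by query heads (128) and a plain bfloat16 cache. It ignores the SA memory bank and any other module state, and the helper names are hypothetical.

```python
def kv_cache_bytes_per_token(num_layers, num_kv_heads, head_dim, bytes_per_value):
    """Bytes of KV cache added per token; the factor 2 covers keys and values."""
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_value


# Values taken from the specification list.
NUM_LAYERS = 32
HIDDEN = 2048
NUM_HEADS = 16
NUM_KV_HEADS = 8
BYTES_BF16 = 2

# Assumption: head_dim = hidden / query heads (not stated in the model card).
HEAD_DIM = HIDDEN // NUM_HEADS

gqa_bytes = kv_cache_bytes_per_token(NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM, BYTES_BF16)
# Hypothetical baseline: the same model with one KV head per query head.
mha_bytes = kv_cache_bytes_per_token(NUM_LAYERS, NUM_HEADS, HEAD_DIM, BYTES_BF16)


def kv_cache_gib(context_tokens, per_token_bytes=gqa_bytes):
    """Total cache size in GiB for a single sequence of the given length."""
    return context_tokens * per_token_bytes / 2**30


print(f"GQA: {gqa_bytes} bytes/token, MHA baseline: {mha_bytes} bytes/token")
for tokens in (8_192, 131_072):
    print(f"{tokens} tokens -> {kv_cache_gib(tokens):.1f} GiB")
```

Under these assumptions, halving the number of KV heads halves the cache per token. Calling `kv_cache_gib` with your target context length gives a quick check against available GPU memory.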
Quickstart & Inference
Deploying the GOD Queen requires minimal setup. The model integrates natively with the Hugging Face transformers ecosystem.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "WithinUsAI/GOD.Queen.IV"

# Load tokenizer and model (trust_remote_code is required for the RLM architecture)
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Prepare the input on the same device as the model's first layers
prompt = "Explain the advantage of recursive language models over sequential pipelines:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Inference (passing the full encoding also supplies the attention mask)
out = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(out[0], skip_special_tokens=True))
Advanced Fine-Tuning Ecosystem
GOD.Queen.IV is built for developers and researchers pushing the boundaries of AI.
- Framework Compatibility: Out-of-the-box compatibility with trl.SFTTrainer, axolotl, and unsloth.
- Multi-Task Optimization: All auxiliary Hybrid Mind heads (RL, NER, POS, DEP, Problem Solving, Innovation, Debugging) are fully exposed as multi-task loss terms during SFT.
- RLHF Ready: The built-in SA Reinforcement Learning head is directly compatible with trl for seamless PPO (Proximal Policy Optimization) and DPO (Direct Preference Optimization) pipelines.
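As an illustration of what "multi-task loss terms" can mean in practice, the sketch below combines a language-modeling loss with weighted auxiliary head losses. The head names, weights, and the `combine_losses` helper are hypothetical. The model card does not specify how the heads are exposed or weighted, so treat this as one plausible scheme rather than the model's training recipe.

```python
def combine_losses(lm_loss, aux_losses, aux_weights, default_weight=0.0):
    """Return lm_loss plus a weighted sum of auxiliary losses.

    aux_losses:  mapping of head name -> scalar loss for the current batch
    aux_weights: mapping of head name -> weight; heads without an entry
                 use default_weight (0.0 disables them)
    """
    total = lm_loss
    for name, value in aux_losses.items():
        total += aux_weights.get(name, default_weight) * value
    return total


# Hypothetical per-batch losses from three auxiliary heads.
aux = {"ner": 0.5, "pos": 0.25, "debugging": 1.0}
# Only NER and POS are enabled; "debugging" falls back to the default weight.
weights = {"ner": 0.5, "pos": 0.25}

total_loss = combine_losses(2.0, aux, weights)
print(total_loss)
```

Keeping the weights in a plain mapping makes it easy to switch individual heads on or off per fine-tuning run without touching the training loop.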
Citation
If you use the GOD Queen or the Hybrid Mind RLM architecture in your research, please cite it with the following BibTeX entry:
@misc{godqueeniv2025,
  title  = {GOD.Queen.IV: Recursive Language Model with Hybrid Mind Architecture},
  author = {GODsStrongestSoldier},
  year   = {2025},
  url    = {https://huggingface.co/WithInUsAI/GOD.Queen.IV},
  note   = {The GOD Queen of All AI}
}