Instructions for using NovaCorp/GRPO-RPG.System-3.2-1B-Experimental with libraries and local apps.

Libraries

How to use NovaCorp/GRPO-RPG.System-3.2-1B-Experimental with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="NovaCorp/GRPO-RPG.System-3.2-1B-Experimental")

# Load model directly
# Llama 3.2 1B is a text-only causal language model, so AutoModelForCausalLM is the matching class
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("NovaCorp/GRPO-RPG.System-3.2-1B-Experimental")
model = AutoModelForCausalLM.from_pretrained("NovaCorp/GRPO-RPG.System-3.2-1B-Experimental")

Local Apps

vLLM

How to use NovaCorp/GRPO-RPG.System-3.2-1B-Experimental with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "NovaCorp/GRPO-RPG.System-3.2-1B-Experimental"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "NovaCorp/GRPO-RPG.System-3.2-1B-Experimental",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
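
The same request can also be sent from Python. The sketch below uses only the standard library. The helper names `build_completion_request` and `complete` are illustrative, and it assumes a server started as shown above is listening on `localhost:8000`. For the SGLang server, the base URL would use port 30000 instead.

```python
import json
import urllib.request

MODEL_ID = "NovaCorp/GRPO-RPG.System-3.2-1B-Experimental"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build a POST request for the OpenAI-compatible /v1/completions route."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the text of the first completion.

    Requires a running server; raises urllib.error.URLError otherwise.
    """
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With a server running, `complete("Once upon a time,")` would return the generated continuation as a string.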

Use Docker

docker model run hf.co/NovaCorp/GRPO-RPG.System-3.2-1B-Experimental

SGLang

How to use NovaCorp/GRPO-RPG.System-3.2-1B-Experimental with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "NovaCorp/GRPO-RPG.System-3.2-1B-Experimental" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "NovaCorp/GRPO-RPG.System-3.2-1B-Experimental",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "NovaCorp/GRPO-RPG.System-3.2-1B-Experimental" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "NovaCorp/GRPO-RPG.System-3.2-1B-Experimental",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use NovaCorp/GRPO-RPG.System-3.2-1B-Experimental with Docker Model Runner:
```
docker model run hf.co/NovaCorp/GRPO-RPG.System-3.2-1B-Experimental
```

GRPO RPG System 3.2 1B "Experimental"

Overview

GRPO RPG System 3.2 1B Experimental is an unstable merge variant built to test the limits of behavior transfer between a narrative-focused RPG model and a GRPO-optimized conversational model.

This version is not tuned for safety margins, polish, or predictable alignment behavior. It exists to observe what happens when two differently optimized 1B models are pushed into a tighter fusion space with minimal constraint shaping.

Expect variability. Sometimes useful. Sometimes inconsistent. Sometimes surprisingly coherent in ways that are not fully reproducible.

Architecture

Base architecture: Llama 3.2 1B
Parameters: 1B
Merge method: SLERP
Precision: BF16
RPG System influence: ~60%
GRPO influence: ~40%
Stability target: none
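
For readers unfamiliar with SLERP, the sketch below shows spherical linear interpolation on two small vectors using only the standard library. It illustrates the general technique and is not the merge tooling's actual implementation. The function name `slerp` and the toy vectors are assumptions made for the example.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between non-zero vectors v0 and v1 at fraction t."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    # Cosine of the angle between the vectors, clamped for numerical safety.
    dot = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    dot = max(-1.0, min(1.0, dot))
    omega = math.acos(dot)
    sin_omega = math.sin(omega)
    if sin_omega < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    w0 = math.sin((1 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


a = [1.0, 0.0]
b = [0.0, 1.0]
mid_slerp = slerp(0.5, a, b)
mid_lerp = [0.5 * x + 0.5 * y for x, y in zip(a, b)]

# SLERP keeps the midpoint on the unit circle (length close to 1.0),
# while linear interpolation shrinks it (length close to 0.707).
print(math.hypot(*mid_slerp))
print(math.hypot(*mid_lerp))
```

The interpolation fraction `t` moves the result along the arc from the first vector (`t = 0`) to the second (`t = 1`). In a weight merge the same idea is applied to parameter tensors rather than two-element lists.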

Intended Behavior

This model is intended for:

Experimental roleplay systems.
Stress-testing narrative consistency.
Unpredictable dialogue generation.
Breaking and evaluating conversational assumptions.
Rapid prototyping of character-driven outputs.
Edge-case prompt exploration.

It is explicitly not optimized for:

Consistency guarantees.
Safe conversational predictability.
Stable long-form coherence under all conditions.

Strengths

High variance creativity.
Strong emergent behavior in certain prompts.
Can produce unusually rich narrative branches.
More reactive to prompt structure than earlier variants.
Occasionally exhibits unexpected coherence jumps.

Known Failure Modes

Sudden tonal collapse in long contexts.
Repetition loops under weak prompting.
Character drift during extended dialogue.
Overreaction to ambiguous instructions.
Inconsistent formatting depending on prompt pressure.

These are not considered bugs; they are part of the design space being explored.
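
Some of these failure modes can be screened for automatically when evaluating outputs. As one illustration, the sketch below flags likely repetition loops by measuring the fraction of word n-grams that duplicate an earlier n-gram. The function name `repeated_ngram_ratio`, the n-gram size, and the sample strings are assumptions for the example, not values supplied by the model author.

```python
from collections import Counter


def repeated_ngram_ratio(text, n=3):
    """Return the fraction of word n-grams that duplicate an earlier n-gram."""
    words = text.lower().split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    # Every occurrence beyond the first counts as a repeat.
    repeats = sum(count - 1 for count in counts.values())
    return repeats / len(ngrams)


looping = "the door opens " * 6
varied = "the door opens onto a moonlit courtyard where a stranger waits"

print(repeated_ngram_ratio(looping))  # high ratio: the same phrase cycles
print(repeated_ngram_ratio(varied))   # no trigram repeats in this sentence
```

A ratio near zero suggests varied text, while values approaching one suggest the output has fallen into a loop. Where to place a cutoff is a judgment call that depends on the prompts being tested.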

Recommended Settings

Temperature: 1.1 – 1.4
Top-p: 0.93 – 0.99
Min-p: 0.04 – 0.08
Repetition penalty: 1.05 – 1.10
Context: 8K or higher if available
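
One way to apply these ranges is to pick a point inside each of them and pass the result as sampling arguments. The sketch below is a hypothetical helper: `sampling_settings` and its `position` argument are not part of the model or any library. It interpolates linearly between the low and high end of each recommended range. The key names are chosen to match the sampling arguments accepted by Transformers' `generate`; other runtimes may use different names.

```python
# Recommended ranges from the settings list above: (low, high).
RECOMMENDED_RANGES = {
    "temperature": (1.1, 1.4),
    "top_p": (0.93, 0.99),
    "min_p": (0.04, 0.08),
    "repetition_penalty": (1.05, 1.10),
}


def sampling_settings(position=0.5):
    """Return one value per setting, interpolated within its recommended range.

    position=0.0 selects the low end of every range, 1.0 the high end.
    """
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must be between 0.0 and 1.0")
    settings = {
        name: round(low + position * (high - low), 4)
        for name, (low, high) in RECOMMENDED_RANGES.items()
    }
    settings["do_sample"] = True  # sampling must be enabled for these to apply
    return settings


print(sampling_settings(0.5))
```

The resulting dictionary could be unpacked into a call such as `pipe(prompt, **sampling_settings())` with the pipeline created earlier, assuming a Transformers version that supports `min_p`.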

Notes on Design

This variant assumes that merging two differently tuned 1B models will not produce a clean interpolation of behavior, but a non-linear mixture of competing priors.

In practice, this means outputs may feel:

slightly unstable,
occasionally overconfident,
sometimes unusually expressive,
and not always internally consistent.

That is expected.

Version

GRPO RPG System 3.2 1B Experimental

Unstable Variant — designed for probing behavioral boundaries rather than maintaining equilibrium.

Models Merged

The following models were included in the merge:

NovaCorp/Ultimate-RPG.System-3.2-1B
jtatman/llama3.2_1b_uncensored_pentest_grpo-merged

Configuration

The following YAML configuration was used to produce this model:



# Author: Dr. Novaciano
# Objective: GRPO RPG Unethic 3.2 1B AI Model
# =========================================================
# PROJECT: GRPO RPG System 3.2 1B - "Experimental"
# =========================================================

models:
  - model: NovaCorp/Ultimate-RPG.System-3.2-1B  # Experimental viral strain neural imprint
  - model: jtatman/llama3.2_1b_uncensored_pentest_grpo-merged  # Baseline cognitive template, "safe mode"

merge_method: slerp  # Spherical Linear Interpolation to preserve extreme viral traits smoothly
base_model: NovaCorp/Ultimate-RPG.System-3.2-1B  # Anchor model for stable latent space

dtype: bfloat16  # Memory-efficient precision, minimal loss in viral feature fidelity

parameters:
  t: 0.50
  normalize: false
  rescale: true
  rescale_factor: 1.12
  memory_efficient: true
  low_cpu_mem_usage: true

layer_range:
  - value: [4, 22]

tie_word_embeddings: false
tie_output_embeddings: false

Downloads last month: 44

Safetensors
Model size: 1B params
Tensor type: BF16

Model tree for NovaCorp/GRPO-RPG.System-3.2-1B-Experimental

Merged from: NovaCorp/Ultimate-RPG.System-3.2-1B and jtatman/llama3.2_1b_uncensored_pentest_grpo-merged
Quantizations: 2 models