NaviGen GRPO Adapter - step600

This repository contains the GRPO-trained LoRA adapter used by NaviGen, a personalized generative recommendation model for producing user-aware image and video generation instructions.

NaviGen represents each item with a dual identifier that couples a collaborative code and a textual code in one token stream. This adapter is the reinforcement learning stage of the NaviGen pipeline: it further aligns the stage-2 supervised model with user intent through reward-guided optimization.

Model Details

Model name: NaviGen GRPO Adapter, step600
Model type: PEFT LoRA adapter for causal language modeling
Base model: NaviGen-stage2-base
Backbone family: Qwen3-style causal LM
Training stage: GRPO reinforcement learning after two-stage SFT
Adapter format: adapter_model.safetensors
PEFT version: 0.19.1

The adapter targets the main attention and MLP projection layers:

q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
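As a rough illustration of what this target list means in practice, PEFT selects modules by name: a module is wrapped when its dotted name equals a target or ends with a dot plus the target. The sketch below reproduces that rule in plain Python. The module paths in the example are assumed Qwen-style names, not read from this checkpoint.

```python
# Target modules listed in this model card.
TARGET_MODULES = [
    "q_proj", "k_proj", "v_proj", "o_proj",
    "gate_proj", "up_proj", "down_proj",
]


def is_lora_target(module_name, targets=TARGET_MODULES):
    """Return True if a dotted module name matches one of the target names."""
    return any(
        module_name == target or module_name.endswith("." + target)
        for target in targets
    )


# Illustrative module paths (assumed naming, not taken from the checkpoint).
example_names = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.mlp.down_proj",
    "model.layers.0.input_layernorm",
    "lm_head",
]
for name in example_names:
    print(name, is_lora_target(name))
```

Under this rule, normalization layers, embeddings, and the output head are left untouched, so only the attention and MLP projections carry adapter weights.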

Intended Use

This adapter is intended for research on personalized generative recommendation, especially settings where a model must infer user preferences from historical item identifiers and produce generation instructions that are specific, relevant, and feasible for an image or video model to render.

Typical uses include:

Personalized prompt or instruction generation for image/video models
Next-item or identifier prediction under the NaviGen token format
Reproduction and analysis of the NaviGen RL stage
Ablation studies comparing SFT and GRPO-aligned checkpoints

This adapter is not a standalone model. It must be loaded on top of the corresponding NaviGen stage-2 base model.
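Before loading, it can help to confirm which base model an adapter checkpoint expects. PEFT records this in `adapter_config.json` under `base_model_name_or_path`. The helper below is a minimal sketch that reads that file from a local copy of the adapter; the function name and the returned dictionary layout are illustrative choices, not part of NaviGen or PEFT.

```python
import json
from pathlib import Path


def read_adapter_summary(adapter_dir):
    """Return the base model id and LoRA targets recorded in adapter_config.json."""
    config_path = Path(adapter_dir) / "adapter_config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))

    # target_modules is usually a list, but PEFT also accepts a single string.
    targets = config.get("target_modules") or []
    if isinstance(targets, str):
        targets = [targets]

    return {
        "base_model": config.get("base_model_name_or_path"),
        "peft_type": config.get("peft_type"),
        "target_modules": sorted(targets),
    }
```

Calling `read_adapter_summary("path/to/adapter")` on a downloaded copy (the path is a placeholder) shows the expected base model, which should match the stage-2 base you load.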

Quick Start

Install the main dependencies:

pip install torch transformers peft safetensors

Load the adapter with PEFT:

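from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Placeholder identifiers: point these at the released repositories or local paths.
base_model_id = "NaviGen-stage2-base"
adapter_id = "NaviGen-grpo-step600"

# Load the tokenizer shipped with the adapter so the NaviGen special tokens are available.
tokenizer = AutoTokenizer.from_pretrained(adapter_id, trust_remote_code=True)

# Load the stage-2 base model, then attach the GRPO LoRA weights on top of it.
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()  # switch to inference mode (disables dropout)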

Replace base_model_id and adapter_id with the final repository names used in your release.
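Once the adapter is loaded, generation follows the usual Transformers pattern. The helper below is a sketch that assumes prompts are passed as chat messages and rendered with the chat template included in this repository. The function name `generate_reply` and the `max_new_tokens` default are illustrative, and the message content must be a prompt already serialized in the NaviGen format, which this card does not spell out.

```python
def generate_reply(model, tokenizer, messages, max_new_tokens=256):
    """Generate a completion for chat-style messages and return only the new text."""
    # Render the chat template and tokenize in one step.
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)

    # generate() returns prompt tokens followed by the newly generated tokens.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt so only the completion is decoded.
    prompt_length = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0][prompt_length:], skip_special_tokens=True)
```

A call would look like `generate_reply(model, tokenizer, [{"role": "user", "content": prompt}])`, where `prompt` is your own NaviGen-formatted input.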

Input Format

The adapter follows the NaviGen training format. Inputs should use the same tokenizer and special tokens released with this checkpoint. In general, prompts contain user history, item identifiers, and task instructions serialized in the NaviGen token stream.

For reproducibility, use the tokenizer files included in this repository:

tokenizer.json
tokenizer_config.json
special_tokens_map.json
added_tokens.json
chat_template.jinja
vocab.json
merges.txt

Training Summary

NaviGen is trained with two supervised fine-tuning (SFT) stages followed by GRPO reinforcement learning:

Stage-1 SFT: learns item identifier and preference-aware representations.
Stage-2 SFT: distills preference reasoning and instruction writing from searched supervision.
GRPO alignment: optimizes the model with hierarchical and self-consistent rewards to better match user intent and generation quality.

This checkpoint corresponds to the GRPO adapter saved at training step 600.
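For readers unfamiliar with GRPO, its core idea is to score each sampled completion relative to the other completions drawn for the same prompt, rather than against a learned value function. The sketch below shows that generic group-relative normalization only. It does not implement NaviGen's hierarchical or self-consistent rewards, and the reward values, the `eps` constant, and the use of the population standard deviation are assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one sampled group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four completions sampled from the same prompt.
rewards = [1.0, 0.0, 0.0, 1.0]
print(group_relative_advantages(rewards))
```

Completions that score above the group mean receive positive advantages and are reinforced; those below the mean are discouraged. When every completion in a group earns the same reward, all advantages are zero and the group contributes no policy-gradient signal.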

Limitations

The adapter depends on the matching NaviGen base model and tokenizer.
Outputs are sensitive to the exact prompt format and identifier vocabulary.
The model is intended for research use and has not been audited for production safety.
Generated instructions may still be irrelevant, underspecified, or hard to render visually.

Files

Core files for inference:

adapter_config.json
adapter_model.safetensors
tokenizer and chat template files

Training-resume states such as optimizer or scheduler checkpoints are not required for normal inference.
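A quick way to confirm that a local copy of the repository is complete enough for inference is to check for the files named in this card. The sketch below does that with `pathlib`; the function name and the exact selection of tokenizer files in `REQUIRED_FILES` are illustrative choices.

```python
from pathlib import Path

# File names taken from this model card.
REQUIRED_FILES = [
    "adapter_config.json",
    "adapter_model.safetensors",
    "tokenizer.json",
    "tokenizer_config.json",
    "chat_template.jinja",
]


def missing_inference_files(adapter_dir, required=REQUIRED_FILES):
    """Return the required file names that are absent from adapter_dir."""
    root = Path(adapter_dir)
    return [name for name in required if not (root / name).is_file()]
```

An empty list from `missing_inference_files("path/to/adapter")` (placeholder path) means the listed files are all present; optimizer and scheduler states are deliberately not checked.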

Citation

If you use this model, please cite the NaviGen paper. The final citation has not been released yet; the entry below is a placeholder until it is.

@article{navigen,
  title   = {NaviGen: Personalized Generative Recommendation with Dual Identifiers},
  author  = {NaviGen Authors},
  journal = {TBA},
  year    = {2026}
}
