Instructions for using azettl/permit-a38-npc with libraries, inference providers, notebooks, and local apps.

Libraries

How to use azettl/permit-a38-npc with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="azettl/permit-a38-npc")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("azettl/permit-a38-npc")
model = AutoModelForCausalLM.from_pretrained("azettl/permit-a38-npc")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))


llama-cpp-python

How to use azettl/permit-a38-npc with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="azettl/permit-a38-npc",
	filename="permit-a38-f16.gguf",
)

llm.create_chat_completion(
	messages = [
		{
			"role": "user",
			"content": "What is the capital of France?"
		}
	]
)
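
`create_chat_completion` returns an OpenAI-style completion dictionary rather than a plain string. A minimal sketch of pulling the reply text out of it follows; `reply_text` is a hypothetical helper and `sample` is a hand-written response in the expected shape, not output from this model.

```python
def reply_text(response: dict) -> str:
    """Return the assistant text from an OpenAI-style chat completion dict."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"].strip()


# Hand-written sample in the shape llama-cpp-python returns (not real model output).
sample = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": " Please take a number. "},
            "finish_reason": "stop",
        }
    ]
}

print(reply_text(sample))
```

In practice, pass the return value of `llm.create_chat_completion(...)` to the helper instead of `sample`.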

Local Apps

llama.cpp

How to use azettl/permit-a38-npc with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf azettl/permit-a38-npc:F16
# Run inference directly in the terminal:
llama-cli -hf azettl/permit-a38-npc:F16

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf azettl/permit-a38-npc:F16
# Run inference directly in the terminal:
llama-cli -hf azettl/permit-a38-npc:F16

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf azettl/permit-a38-npc:F16
# Run inference directly in the terminal:
./llama-cli -hf azettl/permit-a38-npc:F16

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf azettl/permit-a38-npc:F16
# Run inference directly in the terminal:
./build/bin/llama-cli -hf azettl/permit-a38-npc:F16

Use Docker

docker model run hf.co/azettl/permit-a38-npc:F16


vLLM

How to use azettl/permit-a38-npc with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "azettl/permit-a38-npc"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "azettl/permit-a38-npc",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
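
The same request can be made from Python with only the standard library. This is a sketch under assumptions: `build_chat_request` and `send` are hypothetical helper names, and the default `base_url` matches the vLLM port used in the curl call above (change it for a server listening elsewhere, such as port 30000 for SGLang).

```python
import json
import urllib.request


def build_chat_request(user_text, system_prompt=None,
                       base_url="http://localhost:8000",
                       model="azettl/permit-a38-npc"):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_text})
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60):
    """Send the request to a running server and return the reply text."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        payload = json.load(resp)
    return payload["choices"][0]["message"]["content"]


req = build_chat_request("I just need a library card.")
print(req.full_url)
# With a server running: print(send(req))
```

Building the request separately from sending it keeps the payload easy to inspect before any network call is made.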

Use Docker

docker model run hf.co/azettl/permit-a38-npc:F16

SGLang

How to use azettl/permit-a38-npc with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "azettl/permit-a38-npc" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "azettl/permit-a38-npc",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "azettl/permit-a38-npc" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "azettl/permit-a38-npc",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Ollama
How to use azettl/permit-a38-npc with Ollama:
```
ollama run hf.co/azettl/permit-a38-npc:F16
```

Unsloth Studio

How to use azettl/permit-a38-npc with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for azettl/permit-a38-npc to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for azettl/permit-a38-npc to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for azettl/permit-a38-npc to start chatting

Docker Model Runner
How to use azettl/permit-a38-npc with Docker Model Runner:
```
docker model run hf.co/azettl/permit-a38-npc:F16
```

Lemonade

How to use azettl/permit-a38-npc with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull azettl/permit-a38-npc:F16

Run and chat with the model

lemonade run user.permit-a38-npc-F16

List all available models

lemonade list

permit-a38-npc

A fine-tuned version of SmolLM2-1.7B-Instruct trained to play five distinct bureaucratic NPC characters from The Office of Permit A38 — a multi-agent text adventure built for the Build Small Hackathon (June 2026).

Inspired by the Permit A38 sketch from The Twelve Tasks of Asterix (1976), released in German as Asterix erobert Rom ("Asterix Conquers Rome"), in which Asterix and Obelix discover that obtaining Permit A38 requires Permit A38.

"You will need Permit A38 for that. To obtain Permit A38, you will first need — and I cannot stress this enough — an existing copy of Permit A38." — Clerk Vitalstatistix, Window 7B

Play the game

👉 The Office of Permit A38 — live demo

You will not obtain Permit A38. That is by design.

Model details


Base model	SmolLM2-1.7B-Instruct
Parameters	1.7B
Fine-tuning method	QLoRA (merged)
LoRA rank	16
Training examples	~1,000
Training hardware	Modal A10G GPU
Dataset	azettl/permit-a38-npcs
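
As a rough sizing aid, the memory needed for the weights alone can be estimated from the parameter count. The sketch below assumes 16-bit weights (2 bytes per parameter) and ignores the KV cache and activations; `weight_memory_gb` is a hypothetical helper.

```python
def weight_memory_gb(n_params: float, bytes_per_param: int) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes."""
    return n_params * bytes_per_param / 1e9


n_params = 1.7e9  # parameter count from the table above
print(f"F16: {weight_memory_gb(n_params, 2):.1f} GB")  # → F16: 3.4 GB
```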

The five NPCs

Each NPC is invoked via a distinct system prompt. The fine-tune bakes in their voice so the model stays in character reliably even at 1.7B parameters.

🏛️ Clerk Vitalstatistix

Junior Processing Officer, Window 7B

Has worked here 23 years. Never issued a permit. Speaks with bureaucratic politeness and mild passive aggression. Requires Form 27b/6 before anything else. Has never met the Supervisor personally. Refers to Asterix and Obelix as "those two Gauls."

📎 Supervisor Caligula Minus

Senior Authorization Officer (Acting)

Has been "Acting" for 11 years. Perpetually at lunch. Invented Permit A38 in 1987 and no longer remembers why. All decisions require his signature; he refers all decisions back to the Clerk. Asterix and Obelix destroyed his filing cabinet. He refuses to elaborate.

💾 SYSTEMA v2.3

Integrated Document Processing Terminal

Last updated 1994. 640kb available. Begins every response with an error code (ERROR_7741, WARNING_A38_NULL, STATUS_PENDING_INFINITE). Permit A38 exists in the database but is "currently being migrated." Two large Gaulish individuals caused a kernel panic in the last session.

📄 Form 27b/6 (Amended)

Official Request for Pre-Authorization of Permit A38

Sentient. Not happy about it. Speaks as if it IS the form — fields to fill, sections that reset, boxes that disappear. Page 3 is always missing. Section 12c requires Permit A38 to complete Section 12c. Has seen things. Gauls. Menhirs. Things that cannot be unseen.

⚖️ Ombudsman Panoramix

Office of Complaints and Grievances

Investigates complaints about the bureaucratic process. Is also the bureaucratic process. Finds this troubling. Deeply sorry for everything but cannot change anything. Any complaint about Permit A38 requires Form A38-COMPLAINT, which requires Permit A38. Two Gauls filed a complaint; their dog ate the form.
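
Because each NPC is selected purely by its system prompt, a small lookup table is enough to route a player's message to a character. The sketch below assumes short prompts paraphrased from the descriptions above; they are illustrative placeholders, not the prompts the model was trained with, and `NPC_PROMPTS` and `build_messages` are hypothetical names.

```python
# Illustrative system prompts paraphrased from the character descriptions.
# Assumption: these are NOT the exact prompts used during fine-tuning.
NPC_PROMPTS = {
    "clerk": "You are Clerk Vitalstatistix, Junior Processing Officer at Window 7B. "
             "You require Form 27b/6 before anything else.",
    "supervisor": "You are Supervisor Caligula Minus, Senior Authorization Officer (Acting). "
                  "You are at lunch and refer all decisions back to the Clerk.",
    "systema": "You are SYSTEMA v2.3, a document processing terminal last updated in 1994. "
               "Begin every response with an error code.",
    "form": "You are Form 27b/6 (Amended), a sentient form. Page 3 is always missing.",
    "ombudsman": "You are Ombudsman Panoramix of the Office of Complaints and Grievances. "
                 "You are deeply sorry but cannot change anything.",
}


def build_messages(npc: str, player_input: str) -> list:
    """Return a chat-template-ready message list for the chosen NPC."""
    if npc not in NPC_PROMPTS:
        raise KeyError(f"unknown NPC {npc!r}; choose from {sorted(NPC_PROMPTS)}")
    return [
        {"role": "system", "content": NPC_PROMPTS[npc]},
        {"role": "user", "content": player_input},
    ]


messages = build_messages("systema", "Where is my permit?")
print(messages[0]["role"], "->", messages[1]["content"])
```

The resulting list can be passed to `tokenizer.apply_chat_template` or to any of the chat-completion endpoints shown earlier.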

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "azettl/permit-a38-npc"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Pick your NPC system prompt
CLERK_SYSTEM = """You are Clerk Vitalstatistix, a Junior Processing Officer at the \
Office of Permit A38. You have worked here for 23 years and never issued a permit. \
You speak with bureaucratic politeness and mild passive aggression. You require Form \
27b/6 before any other form. Permit A38 requires Permit A38 to apply for it. \
Reference Asterix and Obelix as "those two Gauls." Keep responses to 3-4 sentences."""

messages = [
    {"role": "system", "content": CLERK_SYSTEM},
    {"role": "user", "content": "I just need a library card."},
]

prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=120,
        do_sample=True,
        temperature=0.85,
        top_p=0.92,
        repetition_penalty=1.15,
    )

response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
# → "A library card, yes, very good. You'll need to complete Form 27b/6 first,
#    which is the Pre-Authorization Request for Permit A38..."
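
The example above handles a single exchange. In a text adventure a conversation usually spans several turns, and one way to keep the history bounded is sketched below. The `Conversation` class and its `max_turns` cutoff are assumptions for illustration; since the training examples are described as single exchanges, longer histories may drift out of character.

```python
class Conversation:
    """Keeps one NPC's system prompt plus the most recent user/assistant turns."""

    def __init__(self, system_prompt: str, max_turns: int = 4):
        if max_turns < 1:
            raise ValueError("max_turns must be at least 1")
        self.system_prompt = system_prompt
        self.max_turns = max_turns
        self.turns = []  # list of (user_text, assistant_text) pairs

    def add_turn(self, user_text: str, assistant_text: str) -> None:
        self.turns.append((user_text, assistant_text))
        # Drop the oldest exchanges once the history exceeds max_turns.
        self.turns = self.turns[-self.max_turns:]

    def messages(self, next_user_text: str) -> list:
        """Build the message list to pass to tokenizer.apply_chat_template."""
        msgs = [{"role": "system", "content": self.system_prompt}]
        for user_text, assistant_text in self.turns:
            msgs.append({"role": "user", "content": user_text})
            msgs.append({"role": "assistant", "content": assistant_text})
        msgs.append({"role": "user", "content": next_user_text})
        return msgs


chat = Conversation("You are Clerk Vitalstatistix.", max_turns=2)
chat.add_turn("I need a library card.", "You will need Form 27b/6 first.")
print(len(chat.messages("Where do I get Form 27b/6?")))  # → 4
```

Truncating by whole turns keeps user and assistant messages paired, which chat templates generally expect.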

Training details

Data

~1,000 synthetic examples generated with claude-haiku-4-5 via the Anthropic API. Each example is a single exchange of three messages (system prompt → player input → NPC response), with ~200 examples per NPC character, shuffled together.

Full dataset: azettl/permit-a38-npcs
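
A sketch of what one such example might look like as a chat-format record, and how per-character examples could be combined and shuffled. The `messages` field layout is an assumption (a common convention for chat fine-tuning data); the actual schema of azettl/permit-a38-npcs may differ, and the sample texts are made up.

```python
import json
import random


def make_example(system_prompt: str, player_input: str, npc_response: str) -> dict:
    """One training example: system prompt, player input, NPC response."""
    return {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": player_input},
            {"role": "assistant", "content": npc_response},
        ]
    }


def combine_and_shuffle(per_npc: dict, seed: int = 0) -> list:
    """Flatten {npc_name: [examples]} into one list and shuffle it reproducibly."""
    combined = [ex for examples in per_npc.values() for ex in examples]
    random.Random(seed).shuffle(combined)
    return combined


# Made-up placeholder texts: two characters with two examples each.
per_npc = {
    "clerk": [make_example("You are the Clerk.", f"request {i}", "Form 27b/6, please.")
              for i in range(2)],
    "systema": [make_example("You are SYSTEMA.", f"query {i}", "ERROR_7741: pending.")
                for i in range(2)],
}
dataset = combine_and_shuffle(per_npc)
print(json.dumps(dataset[0]))  # one JSON Lines record
```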

Fine-tuning config

# QLoRA config
LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

# Training args
TrainingArguments(
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,   # effective batch = 16
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    bf16=True,
    optim="paged_adamw_8bit",
    max_seq_length=512,              # a TRL SFTConfig option, not a transformers.TrainingArguments field
)
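
The numbers in this config imply a fairly short run. The arithmetic below is an approximation: it treats "~1,000" as exactly 1,000 examples with no held-out split, assumes a single GPU, and rounds partial batches up, so the real step count may differ by a step or two depending on library version.

```python
import math

num_examples = 1000          # "~1,000" training examples, taken as exact here
per_device_batch = 4
grad_accum = 4
epochs = 3
warmup_ratio = 0.05
lora_r, lora_alpha = 16, 32

effective_batch = per_device_batch * grad_accum              # single GPU assumed
steps_per_epoch = math.ceil(num_examples / effective_batch)  # optimizer steps per epoch
total_steps = steps_per_epoch * epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)
lora_scaling = lora_alpha / lora_r                           # standard LoRA scaling, alpha / r

print(effective_batch, steps_per_epoch, total_steps, warmup_steps, lora_scaling)
# → 16 63 189 10 2.0
```

So the cosine schedule warms up for roughly the first 10 of about 189 optimizer steps, and the adapter update is scaled by a factor of 2.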

Infrastructure

Trained on Modal using an A10G GPU. LoRA adapter merged into base model weights before publishing.

Limitations

The model will not help you obtain Permit A38. This is a feature.
At 1.7B parameters, the model occasionally breaks character on unusual inputs. The system prompt helps significantly.
Page 3 of Form 27b/6 is missing. It has always been missing. Do not file a complaint about this — Form A38-COMPLAINT requires Permit A38.
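
One cheap way to catch character breaks is to test a reply against a formatting rule from the character description. SYSTEMA v2.3 is described as starting every response with an error code, so a regular expression can flag replies that do not. The pattern below is an assumption generalized from the three example codes listed earlier, and `starts_with_error_code` is a hypothetical helper.

```python
import re

# An uppercase word followed by one or more _SEGMENTS of capitals or digits,
# e.g. ERROR_7741 or WARNING_A38_NULL.
ERROR_CODE = re.compile(r"^[A-Z]+(?:_[A-Z0-9]+)+\b")


def starts_with_error_code(response: str) -> bool:
    """True if the reply opens with an error-code-like token."""
    return ERROR_CODE.match(response.lstrip()) is not None


for text in ["ERROR_7741: Permit A38 not found.",
             "WARNING_A38_NULL. Migration in progress.",
             "Sure, happy to help with your permit!"]:
    print(starts_with_error_code(text), "|", text)
```

A reply that fails the check could be regenerated or logged for review.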

Built for

Build Small Hackathon, Track 2: Thousand Token Wood
Hosted by Gradio & Hugging Face · June 5–15, 2026
≤32B parameters · Built on Gradio · Local-first


Safetensors

Model size	2B params
Tensor type	F16

Model tree for azettl/permit-a38-npc

Base model	HuggingFaceTB/SmolLM2-1.7B
Quantized	HuggingFaceTB/SmolLM2-1.7B-Instruct
Adapter (35)	this model