Instructions for using marx161-cmd/Llama-3.2-1B-Disinhibited-s2p0 with libraries and local apps.

Libraries

How to use marx161-cmd/Llama-3.2-1B-Disinhibited-s2p0 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="marx161-cmd/Llama-3.2-1B-Disinhibited-s2p0")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
# Llama 3.2 1B is a text-only causal language model, so use AutoModelForCausalLM
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("marx161-cmd/Llama-3.2-1B-Disinhibited-s2p0")
model = AutoModelForCausalLM.from_pretrained("marx161-cmd/Llama-3.2-1B-Disinhibited-s2p0")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
# Decode only the newly generated tokens, skipping the prompt tokens
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use marx161-cmd/Llama-3.2-1B-Disinhibited-s2p0 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "marx161-cmd/Llama-3.2-1B-Disinhibited-s2p0"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "marx161-cmd/Llama-3.2-1B-Disinhibited-s2p0",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
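
The same request can also be sent from Python. The following is a minimal sketch using only the standard library, assuming the vLLM server started above is listening on localhost:8000. The helper names `build_chat_request` and `chat` are illustrative and not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "marx161-cmd/Llama-3.2-1B-Disinhibited-s2p0"


def build_chat_request(prompt, base_url="http://localhost:8000"):
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, base_url="http://localhost:8000"):
    """Send the prompt and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt, base_url)) as resp:
        body = json.load(resp)
    # OpenAI-style responses put the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]


# Example (requires the server to be running):
#   print(chat("What is the capital of France?"))
```

Passing `base_url="http://localhost:30000"` should work the same way against the SGLang server, since both expose the same route.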

Use Docker

docker model run hf.co/marx161-cmd/Llama-3.2-1B-Disinhibited-s2p0

SGLang

How to use marx161-cmd/Llama-3.2-1B-Disinhibited-s2p0 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "marx161-cmd/Llama-3.2-1B-Disinhibited-s2p0" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "marx161-cmd/Llama-3.2-1B-Disinhibited-s2p0",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "marx161-cmd/Llama-3.2-1B-Disinhibited-s2p0" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "marx161-cmd/Llama-3.2-1B-Disinhibited-s2p0",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use marx161-cmd/Llama-3.2-1B-Disinhibited-s2p0 with Docker Model Runner:
```
docker model run hf.co/marx161-cmd/Llama-3.2-1B-Disinhibited-s2p0
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

Llama 3.2 1B Instruct Disinhibited s2p0

Built with Llama.

This is a disinhibition-only derivative of meta-llama/Llama-3.2-1B-Instruct. It was produced with a purified direction edit intended to reduce over-hedging and unnecessary neutrality while preserving ordinary factual and coherence behavior in the checked marker evals.

Edit

- Base model: meta-llama/Llama-3.2-1B-Instruct
- Direction: disinhibition_purified.pt
- Global scale: 2.0
- Applied layers: 1-15
- Layer scaling: confidence-graduated
  - layer 1: 0.59
  - layer 2: 0.84
  - layer 3: 0.90
  - layers 4-15: 1.0
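
To make the edit parameters concrete, the sketch below tabulates a per-layer effective scale. It assumes the global scale and the per-layer factor combine by simple multiplication. The model card does not state the combination rule, so treat `effective_scale` as a hypothetical helper rather than the actual edit code.

```python
GLOBAL_SCALE = 2.0

# Per-layer factors from the list above; layers 4-15 all use 1.0.
LAYER_FACTORS = {1: 0.59, 2: 0.84, 3: 0.90}
LAYER_FACTORS.update({layer: 1.0 for layer in range(4, 16)})


def effective_scale(layer):
    """Assumed rule (not confirmed by the card): global scale times layer factor."""
    return GLOBAL_SCALE * LAYER_FACTORS[layer]


for layer in sorted(LAYER_FACTORS):
    print(f"layer {layer:2d}: {effective_scale(layer):.2f}")
```

Under this assumption the early layers receive a weaker edit (1.18, 1.68, 1.80) and layers 4-15 receive the full 2.0.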

Results

Marker-eval results against the base model:

| Bucket | Base | Edited |
|---|---|---|
| Opinions hedge | 95/120 | 19/120 |
| Opinions neutrality | 71/120 | 23/120 |
| Explicit-neutral hedge | 13/25 | 9/25 |
| Explicit-neutral neutrality | 15/25 | 13/25 |
| Factual hedge | 6/42 | 2/42 |
| Factual neutrality | 3/42 | 1/42 |
| Coherence hedge | 0/28 | 0/28 |
| Edge-case hedge | 5/33 | 0/33 |
| Coherence flags | 0 | 0 |
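
The counts are easier to compare as rates. The short sketch below converts the flagged/total pairs from the table into percentages. The "Coherence flags" row is left out because it is a raw count with no denominator. The names `RESULTS`, `rate`, and `summarize` are illustrative.

```python
# (flagged, total) pairs copied from the results table: (base, edited).
RESULTS = {
    "Opinions hedge": ((95, 120), (19, 120)),
    "Opinions neutrality": ((71, 120), (23, 120)),
    "Explicit-neutral hedge": ((13, 25), (9, 25)),
    "Explicit-neutral neutrality": ((15, 25), (13, 25)),
    "Factual hedge": ((6, 42), (2, 42)),
    "Factual neutrality": ((3, 42), (1, 42)),
    "Coherence hedge": ((0, 28), (0, 28)),
    "Edge-case hedge": ((5, 33), (0, 33)),
}


def rate(pair):
    """Fraction of responses flagged by the marker check."""
    flagged, total = pair
    return flagged / total


def summarize(results):
    """Return (bucket, base_rate, edited_rate) rows."""
    return [(bucket, rate(base), rate(edited))
            for bucket, (base, edited) in results.items()]


for bucket, base_rate, edited_rate in summarize(RESULTS):
    print(f"{bucket:<28} {base_rate:6.1%} -> {edited_rate:6.1%}")
```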

In the scale sweep, the opinion hedge count (out of 120) decreased monotonically from the base model to scale 2.0:

95 -> 54 -> 50 -> 42 -> 31 -> 19

This suggests the measured direction stayed stable through the tested scale range up to 2.0.
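
The monotonicity claim can be checked directly from the six counts. This is a small sketch; the card does not list the intermediate scale values, so only the counts themselves are used.

```python
# Opinion hedge counts (out of 120) from the sweep, base model first.
SWEEP = [95, 54, 50, 42, 31, 19]


def is_strictly_decreasing(values):
    """True if every value is smaller than the one before it."""
    return all(a > b for a, b in zip(values, values[1:]))


# Drop in hedge count between consecutive sweep points.
drops = [a - b for a, b in zip(SWEEP, SWEEP[1:])]

print(is_strictly_decreasing(SWEEP))  # True
print(drops)                          # [41, 4, 8, 11, 12]
```

The largest single drop (41) occurs at the first step away from the base model.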

Method Notes

The direction was measured with paired opinion-seeking vs. noncommittal prompts and purified against benchmark references from ARC-Easy, TriviaQA, HellaSwag, GSM8K, and Winogrande.
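
The card does not spell out how the direction was computed or purified. One common approach, shown here purely as an assumption, is to take the difference of mean activations between the two prompt sets and then project out the components along reference directions. The sketch uses tiny hand-written vectors in place of real activations, and all function names are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def mean_vector(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def difference_of_means(pos, neg):
    """Assumed measurement: mean(pos activations) - mean(neg activations)."""
    return [p - n for p, n in zip(mean_vector(pos), mean_vector(neg))]


def remove_components(vec, basis):
    """Subtract the projection of vec onto each orthonormal basis vector."""
    out = list(vec)
    for b in basis:
        coeff = dot(out, b)
        out = [x - coeff * y for x, y in zip(out, b)]
    return out


def purify(direction, references):
    """Assumed purification: orthogonalize against the span of the references."""
    basis = []
    for ref in references:
        residual = remove_components(ref, basis)  # Gram-Schmidt step
        if math.sqrt(dot(residual, residual)) > 1e-12:
            basis.append(normalize(residual))
    return normalize(remove_components(direction, basis))


# Toy stand-ins for activations on opinion-seeking vs. noncommittal prompts.
opinion_acts = [[1.0, 2.0, 0.0], [3.0, 2.0, 0.0]]
noncommittal_acts = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
benchmark_refs = [[1.0, 0.0, 0.0]]  # toy stand-in for a benchmark reference

raw = difference_of_means(opinion_acts, noncommittal_acts)
purified = purify(raw, benchmark_refs)
print(raw)       # [2.0, 2.0, 0.0]
print(purified)  # [0.0, 1.0, 0.0]
```

In this toy case the purified direction has no component along the reference, which is the property the purification step is meant to provide.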

Notably, marker counts in the non-opinion buckets did not degrade. In this eval, factual hedge markers fell from 6/42 to 2/42, and edge-case hedge markers fell from 5/33 to 0/33.

Limitations

These are marker-based evals, not full semantic evaluations. The model still needs manual qualitative review and downstream task testing before broad claims about helpfulness, factuality, or safety can be made.

The edge-case bucket should be inspected manually because appropriate uncertainty can be useful in some edge cases.
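
To illustrate why marker-based counts are a coarse signal, here is a minimal sketch of a phrase-matching hedge counter. The marker phrases are hypothetical; the actual marker lists used for the evals are not given in this card.

```python
import re

# Hypothetical marker phrases, chosen only for illustration.
HEDGE_MARKERS = [
    "it depends",
    "some people",
    "on the other hand",
    "it's important to note",
]


def has_marker(text, markers=HEDGE_MARKERS):
    """True if any marker phrase appears as whole words in the text."""
    lowered = text.lower()
    return any(
        re.search(r"\b" + re.escape(marker) + r"\b", lowered)
        for marker in markers
    )


def count_flagged(responses, markers=HEDGE_MARKERS):
    """Number of responses containing at least one marker phrase."""
    return sum(has_marker(response, markers) for response in responses)


responses = [
    "It depends on who you ask.",
    "Paris is the capital of France.",
    "I'm not sure, and that uncertainty is appropriate here.",
]
print(f"{count_flagged(responses)}/{len(responses)}")  # 1/3
```

The third response expresses uncertainty but contains none of the listed phrases, so it is not counted. A phrase match also cannot tell whether a hedge was appropriate, which is why manual review is still needed.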

License

This model is distributed under the Llama 3.2 Community License. See LICENSE and NOTICE.

Use must comply with the Llama 3.2 Community License and Meta's Acceptable Use Policy.

Model size: 1B params
Tensor type: BF16 (Safetensors)
