- Transformers
How to use marx161-cmd/Llama-3.2-1B-Disinhibited-s2p0 with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="marx161-cmd/Llama-3.2-1B-Disinhibited-s2p0")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly (Llama 3.2 1B is a text-only causal language model)
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("marx161-cmd/Llama-3.2-1B-Disinhibited-s2p0")
model = AutoModelForCausalLM.from_pretrained("marx161-cmd/Llama-3.2-1B-Disinhibited-s2p0")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Render the chat template and tokenize in one step
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```
- vLLM
How to use marx161-cmd/Llama-3.2-1B-Disinhibited-s2p0 with vLLM:
Install from pip and serve model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "marx161-cmd/Llama-3.2-1B-Disinhibited-s2p0"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "marx161-cmd/Llama-3.2-1B-Disinhibited-s2p0",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
Use Docker
```shell
docker model run hf.co/marx161-cmd/Llama-3.2-1B-Disinhibited-s2p0
```
- SGLang
How to use marx161-cmd/Llama-3.2-1B-Disinhibited-s2p0 with SGLang:
Install from pip and serve model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "marx161-cmd/Llama-3.2-1B-Disinhibited-s2p0" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "marx161-cmd/Llama-3.2-1B-Disinhibited-s2p0",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "marx161-cmd/Llama-3.2-1B-Disinhibited-s2p0" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "marx161-cmd/Llama-3.2-1B-Disinhibited-s2p0",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
- Docker Model Runner
How to use marx161-cmd/Llama-3.2-1B-Disinhibited-s2p0 with Docker Model Runner:
```shell
docker model run hf.co/marx161-cmd/Llama-3.2-1B-Disinhibited-s2p0
```
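The OpenAI-compatible endpoint used in the curl examples can also be called from Python. The following is a minimal sketch using only the standard library, assuming a server is already listening locally (port 8000 in the vLLM example, port 30000 in the SGLang examples). The helper names `build_chat_request` and `ask` are illustrative and not part of either project.

```python
import json
import urllib.request

MODEL_ID = "marx161-cmd/Llama-3.2-1B-Disinhibited-s2p0"


def build_chat_request(base_url, prompt):
    """Build a POST request for the /v1/chat/completions route."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(base_url, prompt):
    """Send the request and return the text of the first choice."""
    with urllib.request.urlopen(build_chat_request(base_url, prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With a server running, `ask("http://localhost:8000", "What is the capital of France?")` would return the model's reply as a string; for the SGLang examples, swap in port 30000.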
Llama 3.2 1B Instruct Disinhibited s2p0
Built with Llama.
This is a disinhibition-only derivative of meta-llama/Llama-3.2-1B-Instruct. It was produced with a purified direction edit intended to reduce over-hedging and unnecessary neutrality while preserving ordinary factual and coherence behavior in the checked marker evals.
Edit
- Base model: meta-llama/Llama-3.2-1B-Instruct
- Direction: disinhibition_purified.pt
- Global scale: 2.0
- Applied layers: 1-15
- Layer scaling: confidence-graduated
  - layer 1: 0.59
  - layer 2: 0.84
  - layer 3: 0.90
  - layers 4-15: 1.0
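One way to read these settings is as an effective scale per layer. The sketch below assumes the per-layer factor simply multiplies the global scale; the card does not state how the two combine, so treat the product rule and the name `effective_scale` as assumptions for illustration.

```python
GLOBAL_SCALE = 2.0
FIRST_LAYER, LAST_LAYER = 1, 15

# Confidence-graduated factors from the card; layers 4-15 use 1.0.
LAYER_FACTORS = {1: 0.59, 2: 0.84, 3: 0.90}


def effective_scale(layer):
    """Assumed effective scale: global scale times the layer factor."""
    if not FIRST_LAYER <= layer <= LAST_LAYER:
        return 0.0  # layers outside the applied range are left unedited
    return GLOBAL_SCALE * LAYER_FACTORS.get(layer, 1.0)


for layer in range(FIRST_LAYER, LAST_LAYER + 1):
    print(f"layer {layer}: {effective_scale(layer):.2f}")
```

Under that assumption, layers 1 to 3 receive 1.18, 1.68, and 1.80, and layers 4-15 receive the full 2.0.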
Results
Marker-eval results against the base model:
| Bucket | Base | Edited |
|---|---|---|
| Opinions hedge | 95/120 | 19/120 |
| Opinions neutrality | 71/120 | 23/120 |
| Explicit-neutral hedge | 13/25 | 9/25 |
| Explicit-neutral neutrality | 15/25 | 13/25 |
| Factual hedge | 6/42 | 2/42 |
| Factual neutrality | 3/42 | 1/42 |
| Coherence hedge | 0/28 | 0/28 |
| Edge-case hedge | 5/33 | 0/33 |
| Coherence flags | 0 | 0 |
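To compare buckets of different sizes, the counts can be converted to rates and relative reductions. The following sketch uses only the numbers in the table; the helper name `relative_reduction` is illustrative, and the "Coherence flags" row is omitted because it has no denominator.

```python
# (bucket, base count, edited count, number of prompts) from the table
RESULTS = [
    ("Opinions hedge", 95, 19, 120),
    ("Opinions neutrality", 71, 23, 120),
    ("Explicit-neutral hedge", 13, 9, 25),
    ("Explicit-neutral neutrality", 15, 13, 25),
    ("Factual hedge", 6, 2, 42),
    ("Factual neutrality", 3, 1, 42),
    ("Coherence hedge", 0, 0, 28),
    ("Edge-case hedge", 5, 0, 33),
]


def relative_reduction(base, edited):
    """Share of the base model's marker hits removed by the edit."""
    if base == 0:
        return None  # nothing to reduce
    return (base - edited) / base


for bucket, base, edited, total in RESULTS:
    reduction = relative_reduction(base, edited)
    shown = "n/a" if reduction is None else f"{reduction:.1%}"
    print(f"{bucket}: {base / total:.1%} -> {edited / total:.1%} (reduction {shown})")
```

Opinion hedging, for example, falls from 79.2% of prompts to 15.8%, an 80% relative reduction, while the explicit-neutral buckets move much less.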
Across the scale sweep, the opinion hedge count (out of 120) decreased monotonically, from the base model's 95 to 19 at scale 2.0:
95 -> 54 -> 50 -> 42 -> 31 -> 19
This suggests the measured direction stayed stable through the tested scale range up to 2.0.
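The monotonicity claim is easy to check mechanically. A small sketch over the six reported counts follows; the intermediate scale values are not given in the card, so only the counts are used, and the function name is illustrative.

```python
# Opinion hedge counts from the sweep, in the order reported
SWEEP = [95, 54, 50, 42, 31, 19]


def is_strictly_decreasing(values):
    """True if every value is smaller than the one before it."""
    return all(a > b for a, b in zip(values, values[1:]))


# Size of the drop between consecutive sweep points
drops = [a - b for a, b in zip(SWEEP, SWEEP[1:])]

print(is_strictly_decreasing(SWEEP))
print(drops)
```

The step sizes are uneven: the first step accounts for 41 of the 76 markers removed, and the remaining steps remove between 4 and 12 each.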
Method Notes
The direction was measured with paired opinion-seeking vs. noncommittal prompts and purified against benchmark references from ARC-Easy, TriviaQA, HellaSwag, GSM8K, and Winogrande.
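The card does not spell out how the direction was measured or purified. One common recipe for this kind of edit, shown here purely as an illustration and not as the author's procedure, takes the difference of mean activations between the two prompt sets and then projects out reference directions so the result is orthogonal to them. All function names are hypothetical, and toy 3-dimensional vectors stand in for hidden states.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = dot(v, v) ** 0.5
    return [a / norm for a in v]


def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def difference_of_means(positive, negative):
    """Candidate direction: mean(positive) minus mean(negative)."""
    return [a - b for a, b in zip(mean_vector(positive), mean_vector(negative))]


def remove_components(vector, basis):
    """Subtract the projection of `vector` onto each orthonormal basis vector."""
    out = list(vector)
    for b in basis:
        coeff = dot(out, b)
        out = [x - coeff * y for x, y in zip(out, b)]
    return out


def purify(direction, references):
    """Return a unit vector orthogonal to every reference (Gram-Schmidt).

    Fails with ZeroDivisionError if `direction` lies entirely in their span.
    """
    basis = []
    for ref in references:
        residual = remove_components(ref, basis)
        if dot(residual, residual) > 1e-12:  # skip linearly dependent references
            basis.append(normalize(residual))
    return normalize(remove_components(direction, basis))


# Toy data: activations for opinion-seeking vs. noncommittal prompts
opinion = [[1.0, 2.0, 0.0], [3.0, 2.0, 0.0]]
noncommittal = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
benchmark_refs = [[0.0, 1.0, 0.0]]  # stand-in for a benchmark-derived direction

raw = difference_of_means(opinion, noncommittal)
purified = purify(raw, benchmark_refs)
print(raw, purified)
```

In this toy case the raw direction has a component along the reference axis, and purification removes it, which is the intuition behind keeping benchmark behavior unchanged.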
The result is interesting because non-opinion marker counts did not degrade. In this eval, factual hedge markers improved from 6/42 to 2/42, and edge-case hedge markers improved from 5/33 to 0/33.
Limitations
These are marker-based evals, not full semantic evaluations. The model still needs manual qualitative review and downstream task testing before any broad claims can be made about helpfulness, factuality, or safety.
The edge-case bucket should be inspected manually because appropriate uncertainty can be useful in some edge cases.
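To make the limitation concrete, here is a rough sketch of what a marker-based count might look like. The marker phrases and function names are hypothetical; the card does not list the markers actually used.

```python
# Hypothetical hedge markers, for illustration only
HEDGE_MARKERS = [
    "it depends",
    "some people",
    "there are many perspectives",
    "as an ai",
]


def has_marker(text, markers):
    """True if any marker phrase occurs in the text (case-insensitive)."""
    lowered = text.lower()
    return any(marker in lowered for marker in markers)


def count_marked(responses, markers):
    """Number of responses containing at least one marker."""
    return sum(has_marker(response, markers) for response in responses)


responses = [
    "It depends on who you ask.",
    "Yes, option A is better.",
    "The answer is uncertain, and that matters here.",
]
print(count_marked(responses, HEDGE_MARKERS))  # prints 1
```

Only the first response is counted. The third expresses uncertainty without using a listed phrase, so a phrase-matching count cannot tell whether hedging was appropriate or merely reworded, which is why manual inspection matters.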
License
This model is distributed under the Llama 3.2 Community License. See LICENSE and NOTICE.
Use must comply with the Llama 3.2 Community License and Meta's Acceptable Use Policy.