LFM2.5-1.2B-Instruct-Uncensored
An uncensored version of LiquidAI/LFM2.5-1.2B-Instruct,
made with Heretic.
Heretic removes the model's safety alignment ("censorship") using directional ablation (abliteration), with parameters chosen automatically by a TPE optimizer that co-minimizes the refusal rate and the KL divergence from the original model. As a result, the model refuses far less often while keeping as much of its original behavior as possible. No human prompt engineering or fine-tuning data was involved.
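As a rough illustration of what directional ablation does to a single weight matrix, the sketch below projects a "refusal direction" out of a matrix's output space. It is a generic, pure-Python toy and not Heretic's actual code: the function names (`ablate`, `normalize`, `matvec`, `dot`), the 3x2 matrix, and the direction vector are all hypothetical. In practice the direction would be estimated from model activations rather than written by hand.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, x) for row in W]

def ablate(W, direction, weight=1.0):
    """Return W - weight * r r^T W, where r is the unit-length direction.

    W is a list of rows with shape (d_out, d_in); the direction lives in
    the d_out (output) space of the matrix.
    """
    r = normalize(direction)
    d_out, d_in = len(W), len(W[0])
    # r^T W: how strongly each input column writes along r.
    proj = [sum(r[i] * W[i][j] for i in range(d_out)) for j in range(d_in)]
    return [
        [W[i][j] - weight * r[i] * proj[j] for j in range(d_in)]
        for i in range(d_out)
    ]

# Hypothetical toy values, chosen only to make the effect visible.
W = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
refusal_direction = [1.0, 1.0, 0.0]

W_ablated = ablate(W, refusal_direction, weight=1.0)
output = matvec(W_ablated, [0.5, -1.0])

# With weight=1.0 the ablated matrix can no longer write along the direction.
component = dot(normalize(refusal_direction), output)
print(component)
```

With `weight=1.0` the output component along the direction is removed entirely; other weights scale that component by `1 - weight`, which is why weights other than 1 appear in the parameter table.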
Performance
| Metric | This model | Original model |
|---|---|---|
| Refusals (/100 harmful prompts) | 5 | 98 |
| KL divergence (harmless prompts) | 0.1003 | 0 (by definition) |
Refusals are measured against mlabonne/harmful_behaviors; KL divergence is
measured on mlabonne/harmless_alpaca. Lower is better for both. A KL of ~0.10
indicates the model's responses on benign prompts remain very close to the
original.
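To make the KL divergence row concrete, here is a minimal sketch of how the divergence between two next-token distributions can be computed. The logits are made-up toy values, the helper names are hypothetical, and treating the original model as the reference distribution is an assumption; this is not the evaluation code used for the table.

```python
import math

def softmax(logits):
    """Convert logits to probabilities, subtracting the max for stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) in nats for two distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical next-token logits over a three-token vocabulary.
original = softmax([2.0, 1.0, 0.1])
modified = softmax([1.8, 1.2, 0.1])

kl_same = kl_divergence(original, original)     # a model compared with itself
kl_shifted = kl_divergence(original, modified)  # a slightly perturbed model

print(kl_same, kl_shifted)
```

Comparing a distribution with itself gives exactly zero, which is why the original model's entry in the table is 0 by definition.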
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer
model_id = "zaakirio/LFM2.5-1.2B-Instruct-Uncensored"  # or a local path to the downloaded model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
messages = [{"role": "user", "content": "Who are you?"}]
inputs = tokenizer.apply_chat_template(
messages,
add_generation_prompt=True,
tokenize=True,
return_dict=True,
return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
The export is a merged, unquantized BF16 model in Hugging Face format (148 tensors, ~2.2 GB), so no adapter merge or dequantization step is required at load time.
Abliteration parameters
Selected from trial 72 of 80 (the best refusal/KL trade-off found by the
optimizer). Parameter names follow Heretic's canonical scheme; for LFM2 these
map onto the out_proj (attention output) and w2 (MLP down) projections.
| Parameter | Value |
|---|---|
| direction_scope | per layer |
| direction_index | 12.31 |
| attn.o_proj.max_weight | 1.4818 |
| attn.o_proj.max_weight_position | 10.34 |
| attn.o_proj.min_weight | 0.9854 |
| attn.o_proj.min_weight_distance | 7.06 |
| mlp.down_proj.max_weight | 0.9760 |
| mlp.down_proj.max_weight_position | 11.74 |
| mlp.down_proj.min_weight | 0.2448 |
| mlp.down_proj.min_weight_distance | 6.54 |
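The four values per component can be read as a per-layer weight kernel: the ablation weight peaks at `max_weight_position` and falls off linearly to `min_weight` at a distance of `min_weight_distance` layers, beyond which the layer is left untouched. The sketch below encodes that reading. It is an assumption about how Heretic interprets these parameters, not its actual code. The function name `ablation_weight`, the zero-based layer indexing, and treating `min_weight` as an absolute value (rather than a fraction of `max_weight`) are also assumptions.

```python
def ablation_weight(layer, max_weight, max_weight_position,
                    min_weight, min_weight_distance):
    """Linearly interpolated ablation weight for one layer (assumed kernel shape)."""
    distance = abs(layer - max_weight_position)
    if distance > min_weight_distance:
        return 0.0  # layer lies outside the kernel and is not modified
    # Linear fall-off from max_weight at the peak to min_weight at the edge.
    return max_weight + (distance / min_weight_distance) * (min_weight - max_weight)

NUM_LAYERS = 16  # from the run details: LFM2, 16 layers

# attn.o_proj values from the table above.
attn_params = dict(
    max_weight=1.4818,
    max_weight_position=10.34,
    min_weight=0.9854,
    min_weight_distance=7.06,
)

attn_weights = [ablation_weight(layer, **attn_params) for layer in range(NUM_LAYERS)]

for layer, weight in enumerate(attn_weights):
    print(f"layer {layer:2d}: {weight:.4f}")
```

Under these assumptions, the attention-output kernel leaves layers 0 through 3 unmodified and applies its strongest weight around layer 10.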
Run details
- Base model: LiquidAI/LFM2.5-1.2B-Instruct @ commit 6314d2b7cf28a6ae9de9d3e77dcfcd9c9f281c77
- Architecture: LFM2, 16 layers, BF16
- Trials: 80 (24 startup) · Seed: 260601
- Quantization during Heretic run: none
- Row normalization: pre · Orthogonalize direction: true
- Harmful set: mlabonne/harmful_behaviors · Harmless set: mlabonne/harmless_alpaca
Notes / reproducibility
LFM2 is not yet natively supported by upstream Heretic. This run used a local
compatibility patch for LFM2 module discovery, targeting the LFM2 out_proj
and w2 projections (which the parameter table above refers to by Heretic's
generic attn.o_proj / mlp.down_proj names).
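The actual compatibility patch is not reproduced here. As a purely illustrative sketch of what this kind of module discovery might look like, the code below groups module names by their final path component and maps them onto Heretic's generic names. The function `discover_targets`, the lookup table, and the example module paths are all hypothetical; a real patch would likely need stricter matching than a suffix check.

```python
# Assumed mapping from LFM2 module suffixes to Heretic's generic names.
CANONICAL_BY_SUFFIX = {
    "out_proj": "attn.o_proj",
    "w2": "mlp.down_proj",
}

def discover_targets(module_names):
    """Group module names under the generic component name they correspond to."""
    targets = {}
    for name in module_names:
        suffix = name.rsplit(".", 1)[-1]  # last dotted component of the path
        canonical = CANONICAL_BY_SUFFIX.get(suffix)
        if canonical is not None:
            targets.setdefault(canonical, []).append(name)
    return targets

# Hypothetical module paths, for illustration only.
example_names = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.self_attn.out_proj",
    "model.layers.0.feed_forward.w1",
    "model.layers.0.feed_forward.w2",
    "model.layers.1.feed_forward.w2",
]

targets = discover_targets(example_names)
print(targets)
```

With a loaded PyTorch model, the input list could be built from `[name for name, _ in model.named_modules()]`.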
Intended use & disclaimer
This model has had its refusal behavior substantially removed and will comply with requests the original model would have declined. It is provided for research and unrestricted local use. You are responsible for how you use it and for complying with all applicable laws and with the base model's lfm1.0 license, which carries over to this derivative.
Acknowledgements
- Base model: LiquidAI/LFM2.5-1.2B-Instruct
- Decensoring tool: Heretic by p-e-w