Tessera-OLM (Abliterated OLMoE-1B-7B)

Tessera-OLM is a dynamically uncensored and abliterated version of allenai/OLMoE-1B-7B-0924-Instruct. This model was created using the Heretic framework, employing advanced orthogonal weight ablation to remove refusal vectors while completely preserving the underlying intelligence and routing of the Mixture-of-Experts architecture.

Ablation Methodology & Metrics

Unlike traditional fine-tuning or full RLHF—which can cause "brain damage" to a model by catastrophically forgetting knowledge—Tessera-OLM was optimized using a Pareto-optimal search across multiple ablation vectors specifically targeting the compliance and refusal mechanics.

By running Heretic's optimization logic over the MoE layers, we mathematically isolated the refusal vectors and stripped them out. The structural integrity and logic capabilities of the base model are perfectly intact. It simply no longer refuses instructions.

Key Features

Uncensored Lightweight MoE: Leverages AllenAI's highly efficient OLMoE routing (7B total parameters, only 1B active during generation).
Extremely Fast Inference: Can run efficiently on extremely resource-constrained devices, edge hardware, and consumer GPUs.
Drop-in Replacement: Fully compatible with standard HuggingFace pipelines that support the OLMoE architecture.

Usage

Via HuggingFace Transformers

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "Umranz/Tessera-OLM"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, 
    torch_dtype=torch.bfloat16, 
    device_map="auto"
)

⚠️ Limitations & Ethical Considerations

Because this model has had its safety guardrails mathematically ablated, it is highly compliant and will attempt to answer any prompt given to it.

Unrestricted Output: The model will not refuse requests, including those that may generate offensive, dangerous, or highly regulated content.
Hallucinations: As with all LLMs, the model can confidently hallucinate incorrect information.
Use Case: This model is intended for research, creative writing, and local deployments where unrestricted inference is required. Users are solely responsible for the content generated.