zooL4nD3r v0.1

zooL4nD3r-v0.1 is a 14.6M-parameter side-channel adapter for Qwen/Qwen2.5-7B. It reads hidden states from the frozen backbone and emits a 64-dimensional L2-normalized embedding that places each input on a learned community-discourse manifold (961 fine clusters / ~200 coarse).

It is not a sentence-similarity model. It is a community / discourse-mode classifier intended to run as a read-only side-channel during chat decoding.

Live demo

RiverRider/zooL4nD3r-demo - translate a passage across 961 communities (ZeroGPU).

Model details

Backbone (frozen)         Qwen/Qwen2.5-7B
Adapter params            14,562,627
Hidden dim                3584
Backbone layers           28
MAH heads at layers       7, 14, 21
Inject heads at layers    14, 21
Community head at layer   4
Output                    64-d L2-normalized vector (community_output.encoded)
License                   Apache-2.0
Training data             RiverRider/zoolander-corpus-v23 (89k train, 4k val)

Intended use

  • Per-token community routing inside a frozen-LLM runtime (the primary use case; runtime is proprietary).
  • Document clustering / topic discovery over chat, dialog, philosophy, semiotics, and long-form QA corpora.
  • Cross-community paraphrase / steering substrate.

Out of scope

  • Sentence-pair similarity (use RiverRider/srt-adapter-v22c_a050 or mixedbread-ai/mxbai-embed-large-v1).
  • Short-text intent classification (Banking77-style).
  • Standalone retrieval over arbitrary domains.

Quick use

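import torch
from huggingface_hub import hf_hub_download
from transformers import AutoTokenizer
from srt.adapter import SRTAdapter
from srt.config import SRTConfig

ckpt = hf_hub_download("RiverRider/zooL4nD3r-v0.1", "best_adapter.pt")
model = SRTAdapter(SRTConfig()).cuda().eval()
model.load_adapter(ckpt)

# Tokenize with the Qwen2.5-7B tokenizer (max_length=128 matches the training sequence length).
tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B")
batch = tok(["example passage"], return_tensors="pt", padding=True, truncation=True, max_length=128).to("cuda")
ids, mask = batch["input_ids"], batch["attention_mask"]

with torch.no_grad():  # inference only, no gradients needed
    out = model(input_ids=ids, attention_mask=mask)
# community_output.encoded is already L2-normalized; normalizing again is a harmless safeguard.
emb = torch.nn.functional.normalize(out.community_output.encoded, dim=-1)
# emb is a [batch, 64] L2-normalized vector on the community manifold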
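
Because the embedding is L2-normalized, a dot product between two embeddings equals their cosine similarity. The sketch below shows one way to assign an embedding to its nearest community under that assumption. The `centroids` dictionary and the `nearest_community` helper are hypothetical and are not part of the released checkpoint or the `srt` package; the toy vectors are 3-d for readability, whereas real embeddings are 64-d.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def nearest_community(emb, centroids):
    """Return (community_id, cosine) for the centroid most similar to `emb`.

    `centroids` is a hypothetical {community_id: vector} mapping, for example
    the mean embedding of each cluster computed over your own labeled data.
    """
    emb = l2_normalize(emb)
    best_id, best_sim = None, float("-inf")
    for community_id, centroid in centroids.items():
        centroid = l2_normalize(centroid)
        # Dot product of two unit vectors is their cosine similarity.
        sim = sum(a * b for a, b in zip(emb, centroid))
        if sim > best_sim:
            best_id, best_sim = community_id, sim
    return best_id, best_sim

# Toy example with made-up community names and 3-d vectors.
centroids = {"cooking": [1.0, 0.0, 0.0], "chess": [0.0, 1.0, 0.0]}
cid, sim = nearest_community([0.9, 0.1, 0.0], centroids)
```

With real data, `emb` would be one row of the `[batch, 64]` tensor converted to a list, for example via `emb[0].tolist()`.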

Evaluation

Compared with the previous best SRT adapter, srt-adapter-v22c_a050:

Benchmark                              zooL4nD3r-v0.1   srt-adapter-v22c_a050
corpus_v23 fine recall@1 (in-domain)   0.613            0.402
corpus_v23 fine NMI                    0.857            0.783
corpus_v23 coarse ARI                  0.413            0.163
MTEB RedditClusteringP2P.v2            0.500            0.488
MTEB Banking77Classification           0.109            0.306
MTEB-STS mean (out of scope)           0.219            0.364
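
For readers unfamiliar with the metric, recall@1 is commonly computed as the fraction of items whose nearest neighbor (excluding the item itself) carries the same label. The sketch below implements that common definition; it is an assumption that the corpus_v23 numbers above were computed the same way, and `recall_at_1` is an illustrative helper, not the evaluation script used for this card.

```python
def recall_at_1(embeddings, labels):
    """Fraction of items whose most similar other item has the same label.

    Assumes `embeddings` are L2-normalized, so the dot product is cosine similarity.
    """
    n = len(embeddings)
    hits = 0
    for i in range(n):
        best_j, best_sim = None, float("-inf")
        for j in range(n):
            if j == i:
                continue  # never match an item with itself
            sim = sum(a * b for a, b in zip(embeddings[i], embeddings[j]))
            if sim > best_sim:
                best_j, best_sim = j, sim
        if best_j is not None and labels[best_j] == labels[i]:
            hits += 1
    return hits / n if n else 0.0

# Toy 2-d unit vectors with made-up labels; the last item sits closer to the "b" group.
toy_embeddings = [[1.0, 0.0], [0.96, 0.28], [0.0, 1.0], [0.28, 0.96], [0.6, 0.8]]
toy_labels = ["a", "a", "b", "b", "a"]
score = recall_at_1(toy_embeddings, toy_labels)
```

This brute-force version is quadratic in the number of items; for a 4k validation set a matrix product over the full embedding matrix would be the more practical choice.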

v0.1 trades general STS and short-text classification accuracy (Banking77: 0.109 vs 0.306) for stronger in-domain community discrimination, which is the objective it was trained on.

Training summary

  • Loss: SupCon-style InfoNCE on community-cluster sibling pairs, k=4 negatives sampled cross-coarse, batch 16 / seq 128.
  • Optim: AdamW lr 3e-4 cosine, 200 warmup, 10 epochs (27,900 steps).
  • Hardware: 2x RTX PRO 6000 Blackwell DDP, ~6h22m wall.
  • Best checkpoint: step 8500 (~epoch 3, val recall@1 = 0.0212).
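
The schedule and loss listed above can be sketched as follows. Several details are assumptions rather than facts from this card: the warmup is taken to be linear, the cosine decay is taken to end at zero, the temperature of 0.1 is a placeholder, and the loss is shown for a single positive per anchor even though a SupCon-style loss may average over several. The function names are illustrative only.

```python
import math

# Values from the training summary.
PEAK_LR, WARMUP_STEPS, TOTAL_STEPS, EPOCHS = 3e-4, 200, 27_900, 10

def lr_at(step):
    """Linear warmup to PEAK_LR, then cosine decay to zero (both shapes assumed)."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return 0.5 * PEAK_LR * (1.0 + math.cos(math.pi * min(progress, 1.0)))

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE for one anchor: negative log-softmax of the positive logit.

    `temperature` is an assumed placeholder; the card does not state the value used.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    logits = [dot(anchor, positive) / temperature]
    logits += [dot(anchor, neg) / temperature for neg in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    top = max(logits)
    log_denominator = top + math.log(sum(math.exp(x - top) for x in logits))
    return log_denominator - logits[0]

# Sanity check on the reported best checkpoint: 27,900 steps over 10 epochs.
steps_per_epoch = TOTAL_STEPS // EPOCHS   # 2790 steps per epoch
best_epoch = 8500 / steps_per_epoch       # slightly above 3, matching "~epoch 3"

# k=4 negatives per anchor, as in the training summary (toy 2-d vectors).
easy_loss = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]] * 4)
hard_loss = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]] * 4)
```

The loss is near zero when the sibling pair is aligned and the cross-coarse negatives are orthogonal, and grows large when the negatives are closer to the anchor than the positive is.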

Limitations

  • Trained on English-only text. Community geometry transfers partially across languages, but this is not a multilingual embedding model.
  • The discourse manifold reflects the corpus_v23 domain mix (43% dialog, 33% QA, 24% scholarly). Out-of-domain inputs project somewhat arbitrarily.
  • Cluster labels (961 fine) were generated by Anthropic's claude-sonnet-4-5 from HDBSCAN clusters and reflect that model's taxonomic priors.

Citation

@misc{zoolander-v0.1,
  title  = {zooL4nD3r v0.1: A community-discourse side-channel adapter for Qwen2.5-7B},
  author = {RiverRider},
  year   = {2026},
  url    = {https://huggingface.co/RiverRider/zooL4nD3r-v0.1}
}
