zooL4nD3r v0.1
zooL4nD3r-v0.1 is a 14.6M-parameter side-channel adapter for
Qwen/Qwen2.5-7B. It reads hidden states from the frozen backbone and emits
a 64-dimensional L2-normalized embedding that places each input on a learned
community-discourse manifold (961 fine clusters / ~200 coarse).
It is not a sentence-similarity model. It is a community / discourse-mode classifier intended to run as a read-only side-channel during chat decoding.
Live demo
RiverRider/zooL4nD3r-demo - translate a passage across 961 communities (ZeroGPU).
Model details
| Field | Value |
|---|---|
| Backbone (frozen) | Qwen/Qwen2.5-7B |
| Adapter params | 14,562,627 |
| Hidden dim | 3584 |
| Backbone layers | 28 |
| MAH heads at layers | 7, 14, 21 |
| Inject heads at layers | 14, 21 |
| Community head at layer | 4 |
| Output | 64-d L2-normalized vector (community_output.encoded) |
| License | Apache-2.0 |
| Training data | RiverRider/zoolander-corpus-v23 (89k train, 4k val) |
Intended use
- Per-token community routing inside a frozen-LLM runtime (the primary use case; runtime is proprietary).
- Document clustering / topic discovery over chat, dialog, philosophy, semiotics, and long-form QA corpora.
- Cross-community paraphrase / steering substrate.
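For the routing and clustering uses above, one simple way to consume the 64-d embedding is nearest-centroid assignment. The following is a minimal sketch, not the proprietary runtime: `route_to_community` and the `centroids` dictionary are hypothetical names, and the card does not say whether or how cluster centroids are distributed. Toy 3-d vectors stand in for real embeddings.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit L2 norm (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def route_to_community(embedding, centroids):
    """Return (cluster_id, cosine) for the centroid closest to the embedding.

    `centroids` is a hypothetical dict {cluster_id: vector}; the model card
    does not specify how cluster centroids are stored.
    """
    query = l2_normalize(embedding)
    best_id, best_sim = None, -math.inf
    for cluster_id, centroid in centroids.items():
        unit = l2_normalize(centroid)
        # For unit vectors the dot product equals the cosine similarity.
        sim = sum(q * c for q, c in zip(query, unit))
        if sim > best_sim:
            best_id, best_sim = cluster_id, sim
    return best_id, best_sim

# Toy 3-d vectors stand in for the real 64-d embeddings.
toy_centroids = {"cluster_a": [1.0, 0.0, 0.0], "cluster_b": [0.0, 1.0, 0.0]}
cluster_id, similarity = route_to_community([0.9, 0.1, 0.0], toy_centroids)
print(cluster_id, round(similarity, 3))
```

Because the model output is already L2-normalized, the normalization step is redundant for real embeddings; it is kept here so the sketch also works on unnormalized centroids such as cluster means.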
Out of scope
- Sentence-pair similarity (use RiverRider/srt-adapter-v22c_a050 or mixedbread-ai/mxbai-embed-large-v1).
- Short-text intent classification (Banking77-style).
- Standalone retrieval over arbitrary domains.
Quick use
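import torch
from huggingface_hub import hf_hub_download
from transformers import AutoTokenizer
from srt.adapter import SRTAdapter
from srt.config import SRTConfig
# download the adapter checkpoint and load it on the GPU
ckpt = hf_hub_download("RiverRider/zooL4nD3r-v0.1", "best_adapter.pt")
model = SRTAdapter(SRTConfig()).cuda().eval()
model.load_adapter(ckpt)
# tokenize with the Qwen2.5-7B tokenizer
tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B")
batch = tok(["an example passage"], return_tensors="pt", padding=True).to("cuda")
ids, mask = batch["input_ids"], batch["attention_mask"]
with torch.no_grad():  # inference only, no gradients needed
    out = model(input_ids=ids, attention_mask=mask)
emb = torch.nn.functional.normalize(out.community_output.encoded, dim=-1)
# emb is a [batch, 64] L2-normalized vector on the community manifold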
Evaluation
Compared with the prior SRT-adapter SOTA, srt-adapter-v22c_a050:
| benchmark | zooL4nD3r-v0.1 | srt-adapter-v22c_a050 |
|---|---|---|
| corpus_v23 fine recall@1 (in-domain) | 0.613 | 0.402 |
| corpus_v23 fine NMI | 0.857 | 0.783 |
| corpus_v23 coarse ARI | 0.413 | 0.163 |
| MTEB RedditClusteringP2P.v2 | 0.500 | 0.488 |
| MTEB Banking77Classification | 0.109 | 0.306 |
| MTEB-STS mean (out of scope) | 0.219 | 0.364 |
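v0.1 trades general STS and short-text classification performance (MTEB-STS mean 0.219 vs 0.364; Banking77 0.109 vs 0.306) for stronger in-domain community discrimination (fine recall@1 0.613 vs 0.402), which is the objective it was trained on.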
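The card does not spell out the recall@1 protocol. A common reading, assumed here, is the fraction of items whose nearest other item by cosine similarity carries the same fine-cluster label. The sketch below illustrates that reading on toy 2-d unit vectors; `recall_at_1` is a hypothetical helper, not the evaluation code behind the table.

```python
def recall_at_1(embeddings, labels):
    """Fraction of items whose nearest other item (by dot product) shares their label.

    Assumes the embeddings are already L2-normalized, so the dot product
    equals cosine similarity.
    """
    hits = 0
    for i, query in enumerate(embeddings):
        best_j, best_sim = None, float("-inf")
        for j, candidate in enumerate(embeddings):
            if i == j:
                continue  # never match an item with itself
            sim = sum(a * b for a, b in zip(query, candidate))
            if sim > best_sim:
                best_j, best_sim = j, sim
        if best_j is not None and labels[best_j] == labels[i]:
            hits += 1
    return hits / len(embeddings)

# Toy data: two items per label; the two middle vectors sit close together.
toy_embeddings = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0], [0.6, 0.8]]
toy_labels = ["a", "a", "b", "b"]
score = recall_at_1(toy_embeddings, toy_labels)
print(score)
```

In the toy data the two middle vectors are each other's nearest neighbor despite having different labels, so only half of the items count as hits.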
Training summary
- Loss: SupCon-style InfoNCE on community-cluster sibling pairs, k=4 negatives sampled cross-coarse, batch 16 / seq 128.
- Optim: AdamW lr 3e-4 cosine, 200 warmup, 10 epochs (27,900 steps).
- Hardware: 2x RTX PRO 6000 Blackwell DDP, ~6h22m wall.
- Best checkpoint: step 8500 (~epoch 3, val recall@1 = 0.0212).
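The loss and schedule bullets above can be sketched as follows. This is an illustration under stated assumptions, not the SRT-Adapter training code: the loss is the single-positive InfoNCE form (SupCon proper averages over several positives), the temperature of 0.1 is a guess, and the schedule assumes linear warmup followed by cosine decay to zero. `info_nce` and `learning_rate` are hypothetical names.

```python
import math

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor: -log softmax of the positive among all candidates.

    Inputs are assumed L2-normalized; temperature=0.1 is an illustrative guess.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    logits = [dot(anchor, positive) / temperature]
    logits += [dot(anchor, neg) / temperature for neg in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    peak = max(logits)
    log_denominator = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_denominator - logits[0]

def learning_rate(step, peak_lr=3e-4, warmup_steps=200, total_steps=27_900):
    """Linear warmup to peak_lr, then cosine decay to zero (assumed shape)."""
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * min(1.0, progress)))

# One anchor, its sibling (positive), and k=4 negatives, as toy 2-d unit vectors.
anchor = [1.0, 0.0]
sibling = [1.0, 0.0]
cross_coarse_negatives = [[0.0, 1.0]] * 4
loss = info_nce(anchor, sibling, cross_coarse_negatives)
print(loss, learning_rate(0), learning_rate(200), learning_rate(27_900))
```

As a consistency check on the numbers above, 27,900 steps over 10 epochs is 2,790 steps per epoch, so step 8500 falls just past the end of epoch 3.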
Limitations
- Trained on English-only text. Community geometry transfers partially across languages, but this is not a multilingual embedding model.
- Discourse manifold reflects the corpus_v23 domain mix (43% dialog, 33% QA, 24% scholarly). Out-of-domain inputs project somewhat arbitrarily.
- Cluster labels (961 fine) are derived by Anthropic claude-sonnet-4-5 over HDBSCAN clusters and reflect that model's taxonomic priors.
Citation
@misc{zoolander-v0.1,
title = {zooL4nD3r v0.1: A community-discourse side-channel adapter for Qwen2.5-7B},
author = {RiverRider},
year = {2026},
url = {https://huggingface.co/RiverRider/zooL4nD3r-v0.1}
}
Built on
- SRT-Adapter - adapter architecture and training code.
- Prior SOTA in the same architecture family: RiverRider/srt-adapter-v22c_a050.