zooL4nD3r v0.1

zooL4nD3r-v0.1 is a 14.6M-parameter side-channel adapter for Qwen/Qwen2.5-7B. It reads hidden states from the frozen backbone and emits a 64-dimensional L2-normalized embedding that places each input on a learned community-discourse manifold (961 fine clusters / ~200 coarse).

It is not a sentence-similarity model. It is a community / discourse-mode classifier intended to run as a read-only side-channel during chat decoding.

Live demo

RiverRider/zooL4nD3r-demo - translate a passage across 961 communities (ZeroGPU).

Model details


Backbone (frozen)	`Qwen/Qwen2.5-7B`
Adapter params	14,562,627
Hidden dim	3584
Backbone layers	28
MAH heads at layers	7, 14, 21
Inject heads at layers	14, 21
Community head at layer	4
Output	64-d L2-normalized vector (`community_output.encoded`)
License	Apache-2.0
Training data	`RiverRider/zoolander-corpus-v23` (89k train, 4k val)

Intended use

Per-token community routing inside a frozen-LLM runtime (the primary use case; runtime is proprietary).
Document clustering / topic discovery over chat, dialog, philosophy, semiotics, and long-form QA corpora.
Cross-community paraphrase / steering substrate.
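
As a rough illustration of the routing use case, the sketch below assigns each embedding to its nearest community centroid by dot product. It assumes a table of unit-length centroids, one per community, is available; this card does not state that such a table ships with the checkpoint. The name `route_to_community` and the toy 3-d vectors are hypothetical stand-ins for the 64-d outputs.

```python
from typing import List, Sequence


def route_to_community(emb: Sequence[float],
                       centroids: Sequence[Sequence[float]]) -> int:
    """Return the index of the centroid with the highest dot product.

    For unit-length vectors the dot product equals cosine similarity.
    """
    best_idx, best_score = -1, float("-inf")
    for idx, centroid in enumerate(centroids):
        score = sum(e * c for e, c in zip(emb, centroid))
        if score > best_score:
            best_idx, best_score = idx, score
    return best_idx


# Toy 3-d stand-ins for the 64-d embeddings; every value here is made up.
centroids: List[List[float]] = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
token_embs: List[List[float]] = [
    [0.6, 0.8, 0.0],
    [0.0, 0.6, 0.8],
    [0.8, 0.0, 0.6],
]

# One community index per token embedding.
routes = [route_to_community(e, centroids) for e in token_embs]
```

With 961 fine communities a brute-force scan like this stays cheap; a matrix multiply over a `[961, 64]` centroid tensor would be the batched equivalent.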

Out of scope

Sentence-pair similarity (use RiverRider/srt-adapter-v22c_a050 or mixedbread-ai/mxbai-embed-large-v1).
Short-text intent classification (Banking77-style).
Standalone retrieval over arbitrary domains.

Quick use

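import torch
from huggingface_hub import hf_hub_download
from transformers import AutoTokenizer
from srt.adapter import SRTAdapter
from srt.config import SRTConfig

ckpt = hf_hub_download("RiverRider/zooL4nD3r-v0.1", "best_adapter.pt")
model = SRTAdapter(SRTConfig()).cuda().eval()
model.load_adapter(ckpt)

# tokenize with the Qwen2.5-7B tokenizer (max_length=128 matches the training sequence length)
tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B")
batch = tok(["example passage"], return_tensors="pt", padding=True,
            truncation=True, max_length=128).to("cuda")

with torch.no_grad():  # inference only, no gradients needed
    out = model(input_ids=batch["input_ids"], attention_mask=batch["attention_mask"])
# re-normalizing is a harmless safeguard; the output is already unit length
emb = torch.nn.functional.normalize(out.community_output.encoded, dim=-1)
# emb is a [batch, 64] tensor of L2-normalized vectors on the community manifold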
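
For documents longer than one input window, one common approach is to embed fixed-size chunks separately, average the chunk embeddings, and re-normalize. This card does not say how long inputs were handled, so the sketch below is an assumption and not the author's method. `pool_chunks` and `l2_normalize` are hypothetical helper names, and the 2-d vectors stand in for 64-d outputs (for example `emb.tolist()`).

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def pool_chunks(chunk_embs: Sequence[Sequence[float]]) -> List[float]:
    """Average per-chunk embeddings, then project back onto the unit sphere."""
    if not chunk_embs:
        raise ValueError("need at least one chunk embedding")
    dim = len(chunk_embs[0])
    mean = [sum(e[i] for e in chunk_embs) / len(chunk_embs) for i in range(dim)]
    return l2_normalize(mean)


# Two made-up chunk embeddings for one document.
doc_emb = pool_chunks([[1.0, 0.0], [0.0, 1.0]])
```

Re-normalizing matters because the mean of unit vectors is generally shorter than unit length, which would bias dot-product comparisons against multi-chunk documents.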

Evaluation

Compared with the prior best SRT-adapter checkpoint, srt-adapter-v22c_a050:

benchmark	zooL4nD3r-v0.1	srt-adapter-v22c_a050
corpus_v23 fine recall@1 (in-domain)	0.613	0.402
corpus_v23 fine NMI	0.857	0.783
corpus_v23 coarse ARI	0.413	0.163
MTEB RedditClusteringP2P.v2	0.500	0.488
MTEB Banking77Classification	0.109	0.306
MTEB-STS mean (out of scope)	0.219	0.364

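v0.1 trades general STS and short-text classification performance (MTEB-STS mean 0.219 vs 0.364, Banking77 0.109 vs 0.306) for stronger in-domain community discrimination (fine recall@1 0.613 vs 0.402), which is the objective it was trained on.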
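
The card does not define how recall@1 is computed. A common reading, assumed here, is the fraction of items whose nearest neighbor (excluding the item itself) carries the same cluster label. The sketch below implements that reading on made-up 2-d unit vectors; `recall_at_1` is a hypothetical name, and the actual evaluation script may differ.

```python
from typing import List, Sequence


def recall_at_1(embs: Sequence[Sequence[float]], labels: Sequence[str]) -> float:
    """Fraction of items whose nearest other item shares their label.

    Similarity is the dot product, which equals cosine for unit vectors.
    Requires at least two items.
    """
    hits = 0
    for i, query in enumerate(embs):
        best_j, best_score = -1, float("-inf")
        for j, cand in enumerate(embs):
            if j == i:
                continue  # never match an item with itself
            score = sum(q * c for q, c in zip(query, cand))
            if score > best_score:
                best_j, best_score = j, score
        if labels[best_j] == labels[i]:
            hits += 1
    return hits / len(embs)


# Made-up unit vectors and labels, for illustration only.
embs: List[List[float]] = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0], [0.6, 0.8]]
labels: List[str] = ["a", "a", "b", "b"]
score = recall_at_1(embs, labels)
```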

Training summary

Loss: SupCon-style InfoNCE on community-cluster sibling pairs, with k=4 negatives sampled from other coarse clusters; batch size 16, sequence length 128.
Optimizer: AdamW, lr 3e-4 with cosine schedule, 200 warmup steps, 10 epochs (27,900 steps).
Hardware: 2x RTX PRO 6000 Blackwell (DDP), ~6h22m wall-clock.
Best checkpoint: step 8500 (~epoch 3, val recall@1 = 0.0212).
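
The loss and schedule above can be sketched roughly as follows. Several details are assumptions, since the card does not give them: the temperature (0.1 here), a single positive per anchor (SupCon proper averages over several), linear warmup, and cosine decay to zero. `info_nce` and `lr_at` are hypothetical names and do not come from the SRT-Adapter training code.

```python
import math
from typing import Sequence


def _dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def info_nce(anchor: Sequence[float],
             positive: Sequence[float],
             negatives: Sequence[Sequence[float]],
             temperature: float = 0.1) -> float:
    """InfoNCE loss for one anchor, one positive, and k negatives.

    loss = -log( exp(s_pos / t) / (exp(s_pos / t) + sum_k exp(s_neg_k / t)) )
    """
    pos_logit = _dot(anchor, positive) / temperature
    logits = [pos_logit] + [_dot(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denominator - pos_logit


def lr_at(step: int, base_lr: float = 3e-4,
          warmup: int = 200, total: int = 27_900) -> float:
    """Linear warmup followed by cosine decay to zero (both assumed)."""
    if step < warmup:
        return base_lr * (step + 1) / warmup
    progress = (step - warmup) / max(1, total - warmup)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# k=4 negatives, as in the training summary; the vectors are made up.
easy = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]] * 4)
uninformative = info_nce([1.0, 0.0], [0.0, 1.0], [[0.0, 1.0]] * 4)
```

When the positive is no closer to the anchor than the four negatives, the loss equals log(5), the value for a uniform guess over five candidates; it approaches zero as the positive separates from the negatives.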

Limitations

Trained on English-only text. Community geometry transfers partially across languages, but this is not a multilingual embedding model.
The discourse manifold reflects the corpus_v23 domain mix (43% dialog, 33% QA, 24% scholarly). Out-of-domain inputs may land in arbitrary regions of it.
Cluster labels (961 fine) were generated by Anthropic's claude-sonnet-4-5 from HDBSCAN clusters and reflect that model's taxonomic priors.

Citation

@misc{zoolander-v0.1,
  title  = {zooL4nD3r v0.1: A community-discourse side-channel adapter for Qwen2.5-7B},
  author = {RiverRider},
  year   = {2026},
  url    = {https://huggingface.co/RiverRider/zooL4nD3r-v0.1}
}

Built on

SRT-Adapter - adapter architecture and training code.
Prior SOTA in the same architecture family: RiverRider/srt-adapter-v22c_a050.
