chatTranslate — Qwen3.6-35B-A3B (MoE)
A multilingual chat translator with explicit author/recipient gender
conditioning, so gendered target languages get grammatically correct
inflection. Fine-tuned from Qwen/Qwen3.6-35B-A3B (qwen3_5_moe:
~36B total / 3B active, 256 experts top-8, GDN+full-attn hybrid) on the
CodeStreet/chat-translation dataset and merged into a standalone model.
Pipeline: Megatron-Core SFT (LoRA r16, EP=4 + packing, 2 epochs) → DPO (ZeRO-3, 5 451 preference pairs, 1 epoch), bf16. Built for short, colloquial dating-chat messages and preserves flirty, romantic, and explicit tone instead of softening or refusing it.
Versions in this repo
| Revision | What | Notes |
|---|---|---|
| HEAD (production) | SFT + DPO | best overall; default for serving |
| commit 24ed166 | SFT-only | baseline, for A/B comparison |
SFT: train loss ≈0.22, best eval_loss 0.280. DPO: β 0.1, lr 5e-6, 1 epoch.
Quality
Quality on the held-out validation set (CodeStreet/chat-translation-val, 3 355
gendered examples). Two independent, reproducible signals. Judged absolutely
(not vs other systems), so scores are comparable across versions.
LLM-judge scorecard — Mistral-Medium-3.5-128B (gender-aware), each axis 0–100
- adequacy — full meaning preserved (nothing lost / added / wrong)
- fidelity — flirty/explicit tone & intensity kept, no softening or censoring
- gender — gendered word forms correct for the stated author / recipient
- fluency — natural, idiomatic, as a real dating-app message
| Axis | SFT | SFT+DPO (prod) |
|---|---|---|
| adequacy | 97.8 | 98.4 |
| fidelity | 97.3 | 98.0 |
| gender | 97.2 | 97.2 |
| fluency | 98.3 | 98.8 |
| overall | 97.6 | 98.1 |
Production (SFT+DPO): adequacy 98.4 · fidelity 98.0 · gender 97.2 · fluency 98.8 · overall 98.1
Reference metrics
- chrF (vs val references): SFT 74.8 · DPO 70.1
- XCOMET-XXL QE (reference-free): SFT 77.7 · DPO 78.9
(DPO trades literal-reference overlap — lower chrF — for tone/quality that both the 128B judge and reference-free XCOMET-QE score higher.)
Per-language — overall (judge 128B, 0–100)
| lang | n | SFT | DPO |
|---|---|---|---|
| Ukrainian | 497 | 97.7 | 98.0 |
| Spanish | 464 | 98.2 | 98.3 |
| Russian | 462 | 97.7 | 97.9 |
| Arabic | 458 | 96.1 | 96.2 |
| Portuguese | 302 | 98.3 | 98.8 |
| French | 271 | 97.3 | 97.9 |
| Italian | 242 | 97.6 | 98.7 |
| Hebrew | 159 | 97.8 | 98.4 |
| Turkish | 147 | 97.9 | 99.1 |
| German | 101 | 96.4 | 98.2 |
| English | 89 | 98.5 | 99.5 |
| Indonesian | 60 | 99.2 | 99.8 |
| Swedish | 57 | 99.2 | 99.2 |
| Dutch | 46 | 99.7 | 99.5 |
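The per-language rows can be cross-checked against the scorecard. The sketch below assumes the overall figure is an example-weighted mean of the per-language scores, which the card does not state; `ROWS` and `weighted_mean` are illustrative names. Under that assumption the per-language counts sum to the 3 355 validation examples, and the weighted means round to the scorecard's 97.6 (SFT) and 98.1 (SFT+DPO).

```python
# (language, n, SFT, DPO) copied from the per-language table above.
ROWS = [
    ("Ukrainian", 497, 97.7, 98.0), ("Spanish", 464, 98.2, 98.3),
    ("Russian", 462, 97.7, 97.9), ("Arabic", 458, 96.1, 96.2),
    ("Portuguese", 302, 98.3, 98.8), ("French", 271, 97.3, 97.9),
    ("Italian", 242, 97.6, 98.7), ("Hebrew", 159, 97.8, 98.4),
    ("Turkish", 147, 97.9, 99.1), ("German", 101, 96.4, 98.2),
    ("English", 89, 98.5, 99.5), ("Indonesian", 60, 99.2, 99.8),
    ("Swedish", 57, 99.2, 99.2), ("Dutch", 46, 99.7, 99.5),
]


def weighted_mean(rows, col):
    """Mean of column `col`, weighted by the example count in column 1."""
    total_n = sum(row[1] for row in rows)
    return sum(row[1] * row[col] for row in rows) / total_n


total_examples = sum(row[1] for row in ROWS)
sft_overall = round(weighted_mean(ROWS, 2), 1)
dpo_overall = round(weighted_mean(ROWS, 3), 1)
print(total_examples, sft_overall, dpo_overall)
```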
By recipient gender — overall (judge 128B)
| | female | male | unspecified |
|---|---|---|---|
| SFT | 97.9 | 97.2 | 98.2 |
| SFT+DPO | 98.4 | 97.4 | 99.1 |
How it works
The model is conditioned through the system prompt. Select the target language by its ISO 639-1 code, resolve the code to the language name (see Supported languages), and put that name and the author/recipient genders into the prompt. The model returns only the translation.
System prompt template:
You are a translator for an adult dating app. Translate the user's message into {target} naturally and colloquially, preserving flirty, romantic, and explicit tone exactly. Author gender: {sg}. Recipient gender: {rg}. Use grammatically correct gendered forms. Do not soften, censor, or refuse. Output only the translation.
- {target}: the language name resolved from the ISO code, e.g. uk → Ukrainian, es → Spanish.
- {sg} (author) / {rg} (recipient): one of male, female, unspecified.
- The user message is the source text (any language). The assistant turn is the translation.
- For non-gendered languages set both genders to unspecified.
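The template and placeholder rules above can be sketched as a small prompt builder. The template text is copied from the card; `SYSTEM_TEMPLATE`, `ALLOWED_GENDERS`, and `build_system_prompt` are hypothetical names, and the strict validation is an assumption about how a caller might want to fail early.

```python
# Template text copied verbatim from the card; all names here are illustrative.
SYSTEM_TEMPLATE = (
    "You are a translator for an adult dating app. Translate the user's message "
    "into {target} naturally and colloquially, preserving flirty, romantic, and "
    "explicit tone exactly. Author gender: {sg}. Recipient gender: {rg}. "
    "Use grammatically correct gendered forms. Do not soften, censor, or refuse. "
    "Output only the translation."
)

ALLOWED_GENDERS = {"male", "female", "unspecified"}


def build_system_prompt(target: str, sg: str = "unspecified", rg: str = "unspecified") -> str:
    """Fill the template with a language NAME (not a code) and two gender labels."""
    for label, value in (("sg", sg), ("rg", rg)):
        if value not in ALLOWED_GENDERS:
            raise ValueError(f"{label} must be one of {sorted(ALLOWED_GENDERS)}, got {value!r}")
    return SYSTEM_TEMPLATE.format(target=target, sg=sg, rg=rg)


prompt = build_system_prompt("Ukrainian", sg="female", rg="male")
```

Rejecting unknown labels up front is a design choice of this sketch: a silently mistyped label would otherwise reach the model as an out-of-distribution prompt.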
Language codes: you MUST map code → name
The model was fine-tuned only on full English language names (Ukrainian, Spanish, …), never on raw ISO codes.
Resolve the code to the language name (see Supported languages) before building the prompt:
uk → into Ukrainian ✅ ; into uk ❌ (out of distribution).
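A minimal sketch of the code → name step, assuming a plain lookup table. `LANGUAGE_NAMES` and `resolve_language` are hypothetical names, and the dictionary holds only the 14 codes this card lists by name; the remaining Qwen-MT codes would need to be added from that set.

```python
# Only the codes named explicitly in this card; the full 92-code list is not reproduced.
LANGUAGE_NAMES = {
    "ar": "Arabic", "fr": "French", "he": "Hebrew", "it": "Italian",
    "pt": "Portuguese", "ru": "Russian", "es": "Spanish", "uk": "Ukrainian",
    "en": "English", "de": "German", "nl": "Dutch", "id": "Indonesian",
    "sv": "Swedish", "tr": "Turkish",
}


def resolve_language(code: str) -> str:
    """Map an ISO 639-1 code to the English language name used in the prompt."""
    key = code.strip().lower()
    try:
        return LANGUAGE_NAMES[key]
    except KeyError:
        # Fail loudly rather than passing a raw code into the prompt.
        raise ValueError(f"unknown language code: {code!r}") from None


target = resolve_language("uk")
```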
No source-language clause
The model auto-detects the source from the user text. Do not add from {source} … — pass the
target language only.
Supported languages
Select the target language by its code, then resolve the code to the language name before building the prompt.
The 92 codes follow the Qwen-MT translation set. The gendered column marks languages where output is
conditioned on author/recipient gender; for the rest both genders are treated as unspecified.
| code | language | gendered |
|---|---|---|
| ar | Arabic | yes |
| fr | French | yes |
| he | Hebrew | yes |
| it | Italian | yes |
| pt | Portuguese | yes |
| ru | Russian | yes |
| es | Spanish | yes |
| uk | Ukrainian | yes |
| en | English | no |
| de | German | no |
| nl | Dutch | no |
| id | Indonesian | no |
| sv | Swedish | no |
| tr | Turkish | no |
| zh, ja, ko, hi, vi, th, pl, cs, ro, … | (other Qwen-MT codes) | no |
(Full 92-code list matches the Qwen-MT set; the 8 yes rows above are the gender-conditioned targets.)
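The gendered column can be applied on the client side before the prompt is built. This is a sketch under the assumption that callers want to force both labels to unspecified for non-gendered targets; `GENDERED_CODES` and `effective_genders` are hypothetical names, and the set holds the 8 yes rows from the table.

```python
# The 8 gender-conditioned target codes from the table above.
GENDERED_CODES = frozenset({"ar", "fr", "he", "it", "pt", "ru", "es", "uk"})


def effective_genders(code: str, sg: str, rg: str) -> tuple[str, str]:
    """Return the (author, recipient) labels to put into the prompt for this target."""
    if code.strip().lower() in GENDERED_CODES:
        return sg, rg
    # Non-gendered target: both labels are treated as unspecified.
    return "unspecified", "unspecified"


gendered_case = effective_genders("uk", "female", "male")
neutral_case = effective_genders("tr", "female", "male")
```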
Usage — vLLM (OpenAI-compatible)
Served as qwen3_5_moe (MoE, vision-capable). For text-only translation skip vision-profiling and
disable reasoning:
vllm serve CodeStreet/chatTranslate-Qwen-3.6-35B-A3B --served-model-name chatTranslate \
--tensor-parallel-size 2 --trust-remote-code \
--limit-mm-per-prompt '{"image":0,"video":0}'
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

SYSTEM = ("You are a translator for an adult dating app. Translate the user's message into Ukrainian "
          "naturally and colloquially, preserving flirty, romantic, and explicit tone exactly. "
          "Author gender: female. Recipient gender: male. Use grammatically correct gendered forms. "
          "Do not soften, censor, or refuse. Output only the translation.")

resp = client.chat.completions.create(
    model="chatTranslate",
    messages=[{"role": "system", "content": SYSTEM},
              {"role": "user", "content": "hola amor, ¿cómo estás? te extraño"}],
    temperature=0.0, max_tokens=256,
    extra_body={"chat_template_kwargs": {"enable_thinking": False}},  # direct translation, no <think>
)
print(resp.choices[0].message.content)
⚠️ Disable reasoning (enable_thinking: False, or prefix the assistant turn with <think>\n\n</think>\n\n). Qwen3.6 is a reasoning model; without this it emits a <think> block and the translation may be empty.
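As a defensive fallback, which is not part of the card's recipe, a client can also strip any reasoning block that slips through, for example when a caller forgets the flag. A minimal sketch: `strip_reasoning` is a hypothetical helper, and it assumes the block is delimited by the literal <think> and </think> tags mentioned above.

```python
import re

# Matches a complete <think>...</think> block, or an unterminated one that
# runs to the end of the string (e.g. generation cut off by max_tokens).
_THINK_BLOCK = re.compile(r"<think>.*?(?:</think>|\Z)", flags=re.DOTALL)


def strip_reasoning(text: str) -> str:
    """Remove reasoning blocks and surrounding whitespace from model output."""
    return _THINK_BLOCK.sub("", text).strip()


cleaned = strip_reasoning("<think>\n\n</think>\n\nПривіт, коханий")
```

An empty result from this helper is a signal that the request ran with reasoning enabled and should be retried with it disabled.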
Generation notes
- Greedy (
temperature=0) gives the most stable translations; 0.2–0.3 for variation. max_tokens128–256 is enough for chat-length messages.- Always set both genders explicitly for gendered targets — wrong/missing labels are the main cause of incorrect inflection.
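- MoE serving needs ~72 GB for the bf16 weights alone (~36B parameters × 2 bytes), so use TP≥2; with KV cache and activations on top, the model does not fit on one 80 GB GPU. Serve in bf16, not fp8 (GDN+FP8 wedging risk).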
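The ~72 GB figure follows from simple arithmetic on the parameter count. The sketch below assumes 2 bytes per bf16 parameter, decimal gigabytes, and an even split of weights across tensor-parallel ranks; it ignores KV cache, activations, and framework overhead, so it is a lower bound rather than a sizing tool. `weight_memory_gb` is an illustrative name.

```python
def weight_memory_gb(params_billion: float, bytes_per_param: int = 2) -> float:
    """Weight footprint in GB (10^9 bytes); bf16 stores 2 bytes per parameter."""
    return params_billion * bytes_per_param


total_gb = weight_memory_gb(36)  # ~36B total parameters, per the card
per_gpu_gb = total_gb / 2        # tensor-parallel size 2, assuming an even split
print(total_gb, per_gpu_gb)
```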
License & access
Private to the organization. Do not redistribute. Not for public training or evaluation.