Qwen3-0.6B-EasyLanguage (4-bit, MLX)

LoRA fine-tune of Qwen/Qwen3-0.6B that rewrites live speech transcripts into easy-to-read registers across 20 language locales, including German (Leichte Sprache), French (FALC), English (Easy/Plain English), Spanish (Lectura Fácil), Arabic (Easy-to-Read, following Inclusion Europe's "Information for All"), and Danish (Letlæst, following Inclusion Europe's Easy-to-Read guidelines). It is built for the Live Linguist on-device captioner. Each locale follows its own national or European Easy-to-Read or plain-language standard. The model is quantized to 4-bit for inference on Apple silicon via MLX.

The model splits run-on sentences into short ones, drops disfluencies, preserves names and numbers, and keeps the output in the input language. It was trained with a lean prompt (no few-shot examples) so that the register is internalized in the weights, which keeps prompts short and lowers live-caption latency.

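Evaluation (held-out test set; SARI = simplification quality)

Columns marked "ft" are for this fine-tune; "stock" is the base model without fine-tuning. LID is language identification of the output.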

lang	SARI ft	SARI stock	chrF ft	LID ft	compliance ft
de	47.89	32.98	42.26	1.0	0.99
fr	56.93	33.31	51.19	1.0	1.0
es	59.34	36.57	54.79	1.0	0.995
en	59.17	32.46	54.57	1.0	1.0
ar	49.23	50.11	45.74	0.985	0.94
da	53.12	35.58	50.24	0.995	0.94
et	49.39	30.48	52.76	1.0	1.0
fi	50.55	35.2	56.33	1.0	0.985
hi	52.66	35.16	45.05	0.995	0.82
it	53.34	49.18	48.83	1.0	0.8299
ja	8.37	8.29	40.85	1.0	0.97
ko	49.55	43.77	40.03	1.0	0.98
nl	51.9	38.96	48.38	0.985	0.9
pt-BR	55.02	37.54	52.11	1.0	0.82
pt-PT	57.44	38.96	58.14	1.0	0.975
ru	50.61	41.13	49.65	1.0	0.94
sk	50.92	40.59	46.68	1.0	0.975
sv	53.23	40.16	55.08	0.995	0.96
tr	51.32	31.73	49.91	0.995	0.97
vi	59.62	41.3	58.75	1.0	0.725
zh-CN	8.75	8.5	51.67	0.955	0.925
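
The per-locale effect of fine-tuning can be read off the table as the difference between the two SARI columns. The snippet below is a small sketch of that arithmetic; the values are copied from the table, while the variable names (`ROWS`, `gains`, `mean_gain`) are illustrative and not part of the model's tooling.

```python
# (locale, SARI fine-tuned, SARI stock), copied from the evaluation table.
ROWS = [
    ("de", 47.89, 32.98), ("fr", 56.93, 33.31), ("es", 59.34, 36.57),
    ("en", 59.17, 32.46), ("ar", 49.23, 50.11), ("da", 53.12, 35.58),
    ("et", 49.39, 30.48), ("fi", 50.55, 35.2), ("hi", 52.66, 35.16),
    ("it", 53.34, 49.18), ("ja", 8.37, 8.29), ("ko", 49.55, 43.77),
    ("nl", 51.9, 38.96), ("pt-BR", 55.02, 37.54), ("pt-PT", 57.44, 38.96),
    ("ru", 50.61, 41.13), ("sk", 50.92, 40.59), ("sv", 53.23, 40.16),
    ("tr", 51.32, 31.73), ("vi", 59.62, 41.3), ("zh-CN", 8.75, 8.5),
]

# SARI gain of the fine-tune over the stock model, per locale.
gains = {lang: round(ft - stock, 2) for lang, ft, stock in ROWS}
mean_gain = sum(gains.values()) / len(gains)

# Print locales from largest to smallest gain, then the unweighted mean.
for lang, gain in sorted(gains.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{lang:6s} {gain:+.2f}")
print(f"mean   {mean_gain:+.2f}")
```

Sorting by gain makes the outliers easy to spot: the locales where the two SARI columns are nearly equal (ja, zh-CN) and the one where the stock score is higher (ar).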

Usage (MLX)

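from mlx_lm import load, generate

# Load the 4-bit MLX weights and tokenizer from the Hub.
model, tok = load("ndgold/Qwen3-0.6B-EasyLanguage-4bit")
# Fill in the real system prompt and the utterance to rewrite.
msgs = [{"role": "system", "content": "<framework system prompt>"},
        {"role": "user", "content": "Original: <utterance>\nRewritten:"}]
# Render the chat template as text, with Qwen3 thinking mode disabled.
p = tok.apply_chat_template(msgs, add_generation_prompt=True, tokenize=False, enable_thinking=False)
print(generate(model, tok, prompt=p, max_tokens=96))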
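
For live captioning, the same call would typically run once per utterance with the model loaded only once. The following is a minimal sketch of such a wrapper, not the Live Linguist implementation: `build_messages` and `make_simplifier` are hypothetical helper names, the system prompt stays a placeholder as in the snippet above, and stripping whitespace from the utterance is an assumption about the expected input format.

```python
from typing import Callable, Dict, List

# Placeholder copied from the usage snippet; substitute the real system prompt.
SYSTEM_PROMPT = "<framework system prompt>"


def build_messages(utterance: str, system_prompt: str = SYSTEM_PROMPT) -> List[Dict[str, str]]:
    """Build chat messages in the same shape as the usage snippet."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": f"Original: {utterance.strip()}\nRewritten:"},
    ]


def make_simplifier(
    repo: str = "ndgold/Qwen3-0.6B-EasyLanguage-4bit", max_tokens: int = 96
) -> Callable[[str], str]:
    """Load the model once and return a function that rewrites one utterance."""
    # Imported lazily so build_messages stays usable without mlx-lm installed.
    from mlx_lm import load, generate

    model, tok = load(repo)

    def simplify(utterance: str) -> str:
        prompt = tok.apply_chat_template(
            build_messages(utterance),
            add_generation_prompt=True,
            tokenize=False,
            enable_thinking=False,
        )
        return generate(model, tok, prompt=prompt, max_tokens=max_tokens).strip()

    return simplify
```

Calling `simplify = make_simplifier()` once at startup and then `simplify(text)` for each caption segment avoids reloading the weights on every utterance.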

Sources & licenses

Base model: Qwen3 (Apache-2.0).
German seed data: tum-nlp/German4All-Corpus (German Wikipedia, CC BY-SA).
Synthetic pairs (all other languages, plus German augmentation): spoken→easy-language pairs generated by Claude (Anthropic), covering Leichte Sprache (de), FALC (fr), Easy/Plain English (en), Lectura Fácil (es), and 25 further locales, each following its language's Easy-to-Read or plain-language standard (Inclusion Europe "Information for All", Selkokieli, Lättläst, やさしい日本語, ISO 24495-1, …). Every pair is filtered by a deterministic validator suite: per-language sentence-length caps, language ID, fidelity anchoring, number preservation, and anti-parroting.
Intended for the Live Linguist on-device live-caption simplifier. Not a general chatbot.
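
The validator suite mentioned above is not published with this card. As a rough illustration of what deterministic checks of this kind can look like, here is a sketch of three of them (sentence-length cap, number preservation, anti-parroting). All function names, the 12-word cap, and the 0.9 similarity threshold are assumptions made for the example; language ID and fidelity anchoring are left out because they need extra models or resources.

```python
import re
from difflib import SequenceMatcher


def split_sentences(text):
    """Naive split on sentence-final punctuation followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def sentence_length_ok(text, max_words=12):
    """True if every sentence has at most max_words whitespace-separated words."""
    return all(len(s.split()) <= max_words for s in split_sentences(text))


def extract_numbers(text):
    """Digit sequences, optionally with . or , inside (spelled-out numbers are ignored)."""
    return set(re.findall(r"\d+(?:[.,]\d+)*", text))


def numbers_preserved(source, rewrite):
    """True if every number in the source also appears in the rewrite."""
    return extract_numbers(source) <= extract_numbers(rewrite)


def is_parroting(source, rewrite, threshold=0.9):
    """True if the rewrite is nearly identical to the source (character-level ratio)."""
    a = " ".join(source.lower().split())
    b = " ".join(rewrite.lower().split())
    return SequenceMatcher(None, a, b).ratio() >= threshold


def validate(source, rewrite, max_words=12):
    """Run all checks and report each result by name."""
    return {
        "sentence_length_ok": sentence_length_ok(rewrite, max_words),
        "numbers_preserved": numbers_preserved(source, rewrite),
        "not_parroting": not is_parroting(source, rewrite),
    }
```

A pair would be kept only if every value in the dictionary returned by `validate` is true; a per-language cap could be passed through `max_words`.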

Safetensors
Model size: 93.2M params
Tensor type: BF16, U32
Quantization: 4-bit (MLX)

Model tree for ndgold/Qwen3-0.6B-EasyLanguage-4bit

Base model: Qwen/Qwen3-0.6B-Base
Finetuned: Qwen/Qwen3-0.6B
Quantized: this model