Qwen3-1.7B LoRA for Romanian Diacritic Restoration
LoRA adapters for Qwen3-1.7B fine-tuned on Romanian diacritic restoration.
Results (CRAWLER-1000 clean)
| Metric | Value |
|---|---|
| Word Accuracy | 48.99% |
| DER (diacritic error rate) | 0.545 |
| Speed | 0.28 sentences/s |
LoRA fine-tuning raises Qwen3-1.7B word accuracy from 0.37% (prompting only) to 48.99%, a roughly 130x improvement, but the model still falls well short of supervised baselines (BiLSTM: 96.23%).
Noise robustness: the adapter shows the lowest relative degradation under noise (23.2%), comparable to dictionary lookup (25.0%).
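For readers who want to reproduce figures of this kind, the following is a minimal sketch of how word accuracy, the improvement factor, and relative degradation could be computed. The function names and the exact definitions (position-wise word matching on whitespace tokens, degradation as the relative drop from clean to noisy score) are assumptions for illustration; the evaluation code in the repository may define them differently.

```python
def word_accuracy(predictions, references):
    """Fraction of reference words reproduced exactly at the same position.

    Assumption: words are whitespace-separated and aligned by index.
    """
    correct = 0
    total = 0
    for pred, ref in zip(predictions, references):
        pred_words = pred.split()
        ref_words = ref.split()
        total += len(ref_words)
        # zip stops at the shorter list, so missing words count as errors.
        correct += sum(p == r for p, r in zip(pred_words, ref_words))
    return correct / total if total else 0.0


def improvement_factor(before, after):
    """Ratio of the new score to the old score."""
    return after / before


def relative_degradation(clean_score, noisy_score):
    """Relative drop from the clean score to the noisy score (assumed definition)."""
    return (clean_score - noisy_score) / clean_score


if __name__ == "__main__":
    refs = ["știință și artă"]
    preds = ["știință si artă"]  # one of three words is wrong
    print(f"word accuracy: {word_accuracy(preds, refs):.3f}")
    # Reported word accuracies: 0.37% (prompting) vs 48.99% (LoRA).
    print(f"improvement: {improvement_factor(0.37, 48.99):.1f}x")
```

With the reported accuracies, the ratio comes out at about 132x, which the text rounds to 130x.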
Training
- Base: Qwen/Qwen3-1.7B, LoRA rank 16, alpha 32
- 5,000 iterations, batch size 4, learning rate 2e-4
- Hardware: Apple M3 Ultra (MLX)
- Prompt:
Restore diacritics: {input}
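As an illustration of how the prompt template above might be applied, here is a small sketch that builds prompt/completion pairs from correctly diacritized Romanian sentences. It assumes that the input side is produced by stripping diacritics from clean text and that records use `prompt` and `completion` fields; both are assumptions for this example, and the actual data pipeline in the repository may differ. The helper names `strip_diacritics`, `make_example`, and `to_jsonl` are hypothetical.

```python
import json
import unicodedata

# Prompt template as given in the Training section.
PROMPT_TEMPLATE = "Restore diacritics: {input}"


def strip_diacritics(text: str) -> str:
    """Remove combining marks after NFD decomposition (ș -> s, ă -> a, ...)."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def make_example(sentence: str) -> dict:
    """Build one training record: stripped text in the prompt, original as target."""
    return {
        "prompt": PROMPT_TEMPLATE.format(input=strip_diacritics(sentence)),
        "completion": sentence,
    }


def to_jsonl(sentences) -> str:
    """Serialize records one per line, keeping Romanian characters readable."""
    return "\n".join(
        json.dumps(make_example(s), ensure_ascii=False) for s in sentences
    )


if __name__ == "__main__":
    print(to_jsonl(["Știința și arta țării", "Mâine mergem în pădure"]))
```

Stripping via Unicode decomposition handles both the comma-below (ș, ț) and the legacy cedilla (ş, ţ) forms, since each decomposes into a base letter plus a combining mark.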
Resources
- Dataset: klusai/diacritics-ro
- Code: github.com/klusai/diacritics-finetuning-code