Sinhala E2R mT5

Fine-tuned mT5-base for Sinhala Easy-to-Read (E2R) text simplification. Stage 2 of a two-stage pipeline:

Input text
  ↓ Stage 1 — bert-base-multilingual-cased (Complex Word ID via MLM masking)
  ↓ Stage 2 — {HF_MODEL_REPO}  (structural simplification)
  ↓ E2R output
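
The two stages above could be wired together roughly as follows. This is a minimal sketch, not the author's implementation: the card does not say how Stage 1 decides a word is complex or how its result is handed to Stage 2. Here a word is flagged when the masked language model does not list it among its top guesses (a hypothetical heuristic), and Stage 2 receives the unchanged sentence. The function names `find_complex_words`, `run_pipeline`, `load_stage1` and `load_stage2` are illustrative, and whether the fine-tuned model expects a task prefix is unknown.

```python
from typing import Callable, Dict, List

MASK = "[MASK]"  # mask token of bert-base-multilingual-cased


def find_complex_words(sentence: str,
                       top_tokens: Callable[[str, int], List[str]],
                       top_k: int = 5) -> List[str]:
    """Mask each word in turn; flag it if the MLM does not guess it.

    Assumption: "not in the top-k guesses" stands in for "complex".
    Whole-word matching is crude, since mBERT predicts word pieces.
    """
    words = sentence.split()
    flagged = []
    for i, word in enumerate(words):
        masked = " ".join(words[:i] + [MASK] + words[i + 1:])
        if word not in top_tokens(masked, top_k):
            flagged.append(word)
    return flagged


def run_pipeline(sentence: str,
                 top_tokens: Callable[[str, int], List[str]],
                 generate: Callable[[str], str]) -> Dict[str, object]:
    """Run Stage 1, then Stage 2 on the original sentence (assumed handoff)."""
    return {
        "complex_words": find_complex_words(sentence, top_tokens),
        "simplified": generate(sentence),
    }


def load_stage1(model_name: str = "bert-base-multilingual-cased"):
    """Build a top-k guesser from a fill-mask pipeline (needs transformers)."""
    from transformers import pipeline
    fill = pipeline("fill-mask", model=model_name)

    def top_tokens(masked: str, k: int) -> List[str]:
        return [r["token_str"] for r in fill(masked, top_k=k)]

    return top_tokens


def load_stage2(repo: str):
    """Build a generator from a seq2seq checkpoint (needs transformers, torch)."""
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
    tok = AutoTokenizer.from_pretrained(repo)
    model = AutoModelForSeq2SeqLM.from_pretrained(repo)

    def generate(text: str) -> str:
        inputs = tok(text, return_tensors="pt", truncation=True)
        # Decoding settings are illustrative, not taken from the card.
        out = model.generate(**inputs, max_new_tokens=128, num_beams=4)
        return tok.decode(out[0], skip_special_tokens=True)

    return generate
```

With both libraries installed, `run_pipeline(text, load_stage1(), load_stage2(repo))` would run the full flow, where `repo` is the Hub id that the `{HF_MODEL_REPO}` placeholder stands for. Passing the two stages in as callables keeps the pipeline logic testable without downloading any weights.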

Training

  • Dataset : 800 Sinhala complex→simple sentence pairs
  • Best val_loss : 0.9057
  • E2R compliance : ~72.7% → ~74.1%
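
To put the reported numbers in more familiar terms, the short calculation below converts the loss to a perplexity and the compliance change to percentage points. It rests on two assumptions the card does not confirm: that val_loss is the mean per-token cross-entropy in nats (the usual default for mT5 training with Hugging Face tooling), and that the arrow in the compliance line means before → after fine-tuning.

```python
import math

best_val_loss = 0.9057  # from the training summary above

# Assumption: loss is mean per-token cross-entropy in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(best_val_loss)

# Assumption: the arrow reads "before -> after"; values are approximate.
compliance_before, compliance_after = 72.7, 74.1
gain_points = compliance_after - compliance_before

print(f"perplexity ~ {perplexity:.2f}")              # perplexity ~ 2.47
print(f"compliance gain ~ {gain_points:.1f} points")  # compliance gain ~ 1.4 points
```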

Model details

  • Format : Safetensors
  • Model size : 0.6B params
  • Tensor type : F32

Model tree for DineshaPriyadarshani/sinhala-e2r-mt5

  • Base model : google/mt5-base
  • This model : fine-tuned from the base model