docs(readme): cond-5 refined-extractor banner + Phase 3 staleness fixes

#9
by madiedgar - opened
Legesher org

Adds the same "## ⚠️ Phase 3 eval numbers — read the experiments repo before citing" banner that is going on legesher/language-decoded-experiments, plus Phase 3 staleness fixes to the eval section.

Why the banner: this LoRA repo points readers to legesher/language-decoded-experiments for results. The _summary_*.json files in that repo under-report cond-5 SIB-200 accuracy by 20–35pp; cite _summary_reparsed_*.json instead. See the experiments-repo banner for the full picture.
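One practical wrinkle when picking files to cite: the broader pattern _summary_*.json also matches the reparsed files, so a naive glob mixes the two sets. Below is a minimal sketch of how a reader might separate them. The function names and the example paths are hypothetical and only illustrate the two patterns named above; the actual directory layout of the experiments repo is not assumed.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Patterns taken from the PR text; everything else here is illustrative.
REPARSED = "_summary_reparsed_*.json"
ORIGINAL = "_summary_*.json"


def citable_summaries(paths):
    """Return paths whose file name matches the reparsed-summary pattern."""
    return sorted(
        p for p in paths if fnmatchcase(PurePosixPath(p).name, REPARSED)
    )


def stale_summaries(paths):
    """Return original summaries only.

    ORIGINAL also matches reparsed files, so those are excluded explicitly.
    """
    return sorted(
        p
        for p in paths
        if fnmatchcase(PurePosixPath(p).name, ORIGINAL)
        and not fnmatchcase(PurePosixPath(p).name, REPARSED)
    )


# Hypothetical listing, for illustration only.
example_paths = [
    "cond-5/_summary_example.json",
    "cond-5/_summary_reparsed_example.json",
    "cond-5/notes.md",
]
print(citable_summaries(example_paths))
print(stale_summaries(example_paths))
```

Matching on the file name rather than the full path keeps the check independent of how the sub-directories are organised.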

Staleness fixes applied alongside the banner:

  • Evaluation benchmarks table: added SIB-200 + Belebele (Phase 3) alongside MGSM/X-CSQA/XNLI (Phase 2+3)
  • Limitations: updated "Currently evaluated on 3 benchmarks" → 5 benchmarks, added an Extractor coverage bullet pointing at the refined-extractor writeup
  • Model Structure section: added a note that cond-5 is a cross-lingual evaluation pattern reusing cond-2 adapters (the cond-5 sub-directories live on the experiments dataset, not in this LoRA repo)
  • Experimental ladder: added the --> 5 step ("does shared script or language family create transfer effects when an adapter trained on one language is prompted in another?")

No adapter weights or configs touched. README-only change.

Refs: expedition-tiny-aya/analysis/phase-3/post-refined-action-items.md item E.

madiedgar changed pull request status to merged
